Table of Contents

Practical Protocols for Probing Protists

Unix, Perun, and where to find software and databases

Using UNIX

Using Perun

RAM and CPU memory - Perun Queues

GPU nodes and how to work with them

Perun and Environments

Perun and Custom Environments

GitHub repositories of the lab

Other software locations

Database locations

BLAST+ and v5 database user guide

BLAST and PLAST

Taxonomy recovery from BLAST/PLAST output

Bioconda

Making Bioconda recipes

Tips and Tricks

The idea of this section is to share command-line tips and tricks that you have found make bioinformatics work a lot easier. These can range from basic UNIX topics, such as setting up your terminal environment, configuring your SSH connections, or managing your filesystem, to explanations of new bioinformatics tools and how to work with large-scale computer clusters.

Recordings of the Tips and Tricks meetings are available for downloading or viewing online here

They are also available directly on Perun, at /scratch3/downloads/icgvideo/TAT

The Terminal

Setting up your terminal environment

Command line hotkeys

Command line utilities

ls expanded

wc expanded

head and tail expanded

sort expanded

uniq expanded

cut expanded

cat expanded

nohup

grep expanded

awk for tabulated files

SSH

SSH Keys

Passwords and Passphrases

The SSH Config

Common Bioinformatics Operations

Viewing tabular BLAST outfiles

Viewing alignment files

Handy custom functions

Manipulating FASTA files

RNA-Seq Processing

DNA Sequence Data Processing

[Short reads] Cleaning of Illumina Paired-end reads

[Short reads] Short read assembly

[Short reads] depth and breadth of coverage

[Long reads] MinION sequencing from start to finish

[Long reads] Assembling Long Read Data

[Long reads] Nanopore tools for polishing

Decontamination using a metagenomics approach (Anvi’o)

Decontamination using read classifier (Eukfinder)

Decontaminating reads using DECONSEQ

[Genome] Binning tools

[Genome] Evaluating genome completeness using Benchmarking Universal Single-Copy Orthologs (BUSCO)

[Genome] Evaluating genome quality using CGAL

[Mito genome] Using Mitoprot

[Mito genome] Visualizing mitochondrial genomes

Changing contig or scaffold names in a genome assembly

Extracting a single fasta entry (or multiple) from a multifasta file

Getting protein or CDS sequences from a gff file

RNA sequence data, gene expression

de novo transcriptome assembly with Trinity

de novo transcriptome assembly with rnaSPAdes

[RNA-seq reads] Mapping RNAseq data to your genome

[Transcriptome] Evaluating and comparing transcriptome assemblies with rnaQUAST

[Transcriptome] TransDecoder for transcriptomes

Differential gene expression analysis

Gene Prediction

[Ab initio] Gene prediction with just GeneMark

[Ab initio] Gene prediction with just Augustus

[RNAseq-informed] Gene prediction with find_supported_orfs.py

[Pipeline] Gene prediction with Braker2

[Pipeline] Gene prediction with Funannotate

[Fix] Improving gene models with fix_genes_with_false_introns.py

[Manual Curation] Gene prediction curation with IGV

From Nanopore to Gene Prediction

Functional annotation

Functional annotation with the Funannotate pipeline

Finding genes of interest with PANTHER HMMs

Search Protocol for Orthologs of Components of Key molecular systems (SPOCK). Sorry, this is not up and running yet. Please complain to Dayana!

Ortholog searches using PANTHER HMMs

Dayana Salas - Utility scripts (taxonomy, coloring trees, phylogenetics, mixture models, domain architecture and more)

Protein structure prediction

Running AlphaFold at scale

Phylogenetic analyses

[Phylogeny] Multi-gene phylogeny pipeline

Distinguishing LGT from contaminants

Running RevBayes with multiple cores

Phylogenomic analyses

Curation of phylogenomic datasets

Misc. analyses

Ploidy analysis using ploidyNGS

Programming languages

Python Resources

Bioinformatics Q&A

[Q&A] Real cases of bash/shell scripts「Ongoing」

Q&A others

Bioinformatics and Trivia

Do Bioinformaticians have naming regrets?

A little of what you fancy does you good!

Playground

Dokuwiki maintenance

Who looks after which page

Project Documentation

Blastocystis orf160: this could be an interesting project for a future undergraduate student