Using Bioconda
Many of our users work in the bioinformatics field. A huge number of bioinformatics packages, not only Python ones, are available through the Bioconda channel for conda. A “channel” is a repository for a collection of packages, so you first need to add this channel to your conda configuration.
First read https://bioconda.github.io as it covers what Bioconda is and how to cite Bioconda in your publications. Also see: https://hpc.research.uts.edu.au/faq/acknowledgement/.
Then browse the package index at https://bioconda.github.io/conda-package_index.html to get an overview of what is available.
Remember that our help for Miniconda is here: https://hpc.research.uts.edu.au/software_general/python/python_miniconda/. You might need to refer to it for some of the steps below.
Install the Bioconda Channel
The instructions below follow: https://bioconda.github.io/user/install.html#set-up-channels
Activate your Miniconda environment, then add the channels in this order.
(base) $ conda config --add channels defaults
(base) $ conda config --add channels bioconda
(base) $ conda config --add channels conda-forge
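Each `conda config --add channels` call puts the new channel at the top of the list, so the channel added last (conda-forge) ends up with the highest priority. The sketch below is one way to check the resulting order. The helper name `list_channels` is hypothetical, and it assumes that `conda config --show channels` prints the channels as a YAML-style list of `- name` lines.

```shell
# Hypothetical helper: print the channel names, highest priority first,
# from `conda config --show channels` output supplied on stdin.
list_channels() {
    # Keep only the "- name" list items and strip the leading dash.
    sed -n 's/^[[:space:]]*-[[:space:]]*//p'
}

# Only query conda if it is actually on the PATH.
if command -v conda >/dev/null 2>&1; then
    conda config --show channels | list_channels
fi
```

After the three commands above you would expect conda-forge first, then bioconda, then defaults.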
Searching for the Package you Need
(base)$ conda search cd-hit
Loading channels: done
# Name Version Build Channel
cd-hit 4.6.4 0 bioconda
cd-hit 4.6.4 1 bioconda
cd-hit 4.6.6 0 bioconda
cd-hit 4.6.8 0 bioconda
cd-hit 4.6.8 hfc679d8_2 bioconda
cd-hit 4.8.1 h2e03b76_4 bioconda
cd-hit 4.8.1 h2e03b76_5 bioconda
cd-hit 4.8.1 h5b5514e_6 bioconda
cd-hit 4.8.1 h8b12597_3 bioconda
cd-hit 4.8.1 hdbcaa40_0 bioconda
cd-hit 4.8.1 hdbcaa40_1 bioconda
cd-hit 4.8.1 hdbcaa40_2 bioconda
(base)$
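The search output lists one line per build, so the same version appears several times. If you only want the distinct versions, a small filter like the sketch below may help. The helper name `list_versions` is hypothetical, it assumes the column layout shown above (name first, version second), and `sort -V` assumes GNU sort.

```shell
# Hypothetical helper: print the distinct versions of package $1,
# oldest first, from `conda search` output supplied on stdin.
list_versions() {
    # Keep only rows whose first column is the package name, take the
    # version column, then de-duplicate using a version-aware sort.
    awk -v pkg="$1" '$1 == pkg { print $2 }' | sort -u -V
}

# Example (not run automatically, as it needs network access):
#   conda search cd-hit | list_versions cd-hit
```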
Installing your Required Package
Remember: do not install your computational packages in the base environment. Install them in a virtual environment and keep the base environment pristine.
Here I list my Python virtual environments:
(base) hpcnode01 playbooks/$ conda env list
# conda environments:
base * /shared/homes/XXX/miniconda3
bio-projects /shared/homes/XXX/miniconda3/envs/bio-projects
geo-stuff /shared/homes/XXX/miniconda3/envs/geo-stuff
physics /shared/homes/XXX/miniconda3/envs/physics
I’ll install cd-hit version 4.6.8 into my bio-projects environment. The option can be written as either --name or -n:
$ conda install --name bio-projects cd-hit=4.6.8
(base)$ conda install -n bio-projects cd-hit=4.6.8
The following NEW packages will be INSTALLED:
cd-hit bioconda/linux-64::cd-hit-4.6.8-hfc679d8_2
etc ....
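The install above assumes the bio-projects environment already exists. If it might not, the sketch below checks the `conda env list` output first and falls back to `conda create`. The helper names `env_exists` and `install_into_env` are hypothetical, and the check assumes the environment name is the first column of each line, as in the listing above.

```shell
# Hypothetical helper: succeed if the environment named in $1 appears in
# `conda env list` output supplied on stdin.
env_exists() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Hypothetical helper: install a package spec (e.g. cd-hit=4.6.8) into an
# environment, creating the environment first if it is missing.
install_into_env() {
    env_name=$1
    spec=$2
    if conda env list | env_exists "$env_name"; then
        conda install --name "$env_name" "$spec"
    else
        conda create --name "$env_name" "$spec"
    fi
}

# Example (not run automatically):
#   install_into_env bio-projects cd-hit=4.6.8
```

Either way the package ends up in a named environment and not in base.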
Using your New Package
Just activate that environment:
(base)$ conda activate bio-projects
(bio-projects)$
Use the package:
(bio-projects)$ cd-hit -h
CD-HIT version 4.7 (built on Jul 13 2018)
Usage: cd-hit [Options]
Options
-i input filename in fasta format, required
-o output filename, required
-c sequence identity threshold, default 0.9
this is the default cd-hit's "global sequence identity" calculated as:
number of identical amino acids in alignment
etc...
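Going by the help text above, a run needs at least an input file (-i) and an output file (-o), with -c defaulting to 0.9. The sketch below wraps such a call. The function name `cluster_fasta` and the file names in the example are hypothetical, and it assumes the bio-projects environment is already active so that cd-hit is on your PATH.

```shell
# Hypothetical wrapper: run cd-hit on an input FASTA file.
#   $1 = input file, $2 = output file, $3 = identity threshold (optional)
cluster_fasta() {
    input=$1
    output=$2
    identity=${3:-0.9}   # same default as cd-hit's own -c option
    # Fail early if the input cannot be read, without starting cd-hit.
    if [ ! -r "$input" ]; then
        echo "cannot read input file: $input" >&2
        return 1
    fi
    cd-hit -i "$input" -o "$output" -c "$identity"
}

# Example (not run automatically):
#   cluster_fasta proteins.fasta proteins_clustered 0.9
```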
When you're finished, you can deactivate that environment and then your base Miniconda environment:
(bio-projects)$ conda deactivate
(base)$
(base)$ conda deactivate
$