Conda is an open source package management system.
External Links
Conda website
Conda: User Guide
Before using Conda, two steps are required: Conda must be initialized, and a .condarc file must be created. The default shell for the new Hazel cluster is bash; the tcsh shell is also available (type "tcsh" to switch).
1) Initialize Conda environment:
This step is necessary only once for an HPC user, unless the initialization settings are removed.
To load the system installed Conda, load the module and use init to add it to the path. Normally using a login file to automatically set the environment is strongly discouraged, but in the case of Conda, many features cannot be used without setting this initialization file. Log out and then back in again after using conda init.
module load conda
conda init bash
[ignore the warnings, log out then back in again]
Optional - remove old environments:
For users who have already been using different Conda environments and would like to begin installing with the new recommended procedures, clean out the remnants of old Conda environments by doing the following.
Check for these 'dot' files:
cd ~
more .bashrc
more .tcshrc
If these files contain information for old Conda environments, edit the files and delete this section:
# >>> conda initialize >>>
(stuff)
# <<< conda initialize <<<
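The manual edit described above can also be scripted. The following is a minimal sketch, assuming the block in the login file is delimited by the markers that conda init writes (an opening `# >>> conda initialize >>>` line and a closing `# <<< conda initialize <<<` line). The function name strip_conda_init is hypothetical and used only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: delete the block that `conda init` wrote into a login file.
# A backup copy of the original is kept next to the file with a .bak suffix.
strip_conda_init() {
    local rcfile="$1"
    # Delete every line from the opening marker through the closing marker.
    sed -i.bak '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</d' "$rcfile"
}

# Example (not run here): strip_conda_init "$HOME/.bashrc"
```

If the opening marker is present but the closing marker is missing, sed deletes through the end of the file, so it is worth comparing the result against the .bak copy before logging out.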
2) Create a .condarc file:
This step is mandatory: Conda will fill the quota of the home directory if pkgs_dirs is not set in this file.
By default, Conda stores packages in the /home directory. The /home directory is too small for that, and the packages are only needed temporarily. They should not be saved, taking up space in permanent directories. To change the default location to /share, use a text editor to create a file called .condarc. The path to that file should be /home/$USER/.condarc and it should contain the path to the alternative location, e.g.:
pkgs_dirs:
  - /share/$GROUP/$USER/conda/pkgs
In addition, many packages require adding a 'channel'. Common channels may be added before creating environments by editing the .condarc.
channels:
  - bioconda
  - conda-forge
The following displays a sample .condarc file:
[unityid@login01 ~]$ cd
[unityid@login01 ~]$ more .condarc
pkgs_dirs:
  - /share/group_name/unityid/conda/pkgs
channels:
  - bioconda
  - conda-forge
  - defaults
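As an alternative to a text editor, the same file can be written from the shell. This is a sketch only: the function name write_condarc is hypothetical, and the group and user names are passed in explicitly because the GROUP variable may not be set in every shell.

```shell
#!/bin/bash
# Hypothetical helper: write a .condarc that moves the package cache to /share
# and lists the channels from the sample file.
# Arguments: target file, group name, user name.
# Note: this overwrites the target file if it already exists.
write_condarc() {
    local target="$1" group="$2" user="$3"
    cat > "$target" <<EOF
pkgs_dirs:
  - /share/${group}/${user}/conda/pkgs
channels:
  - bioconda
  - conda-forge
  - defaults
EOF
}

# Example (not run here): write_condarc "$HOME/.condarc" group_name "$USER"
```

Running more ~/.condarc afterwards should display the same contents as the sample.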
Before installing any software, including a Conda environment, request a space for user maintained software to be used by all members of a Project. The path for that space is generally /usr/local/usrapps/groupname.
The following is the general idea of how to use Conda. Please use a YAML file to avoid package conflicts. See below for details.
To install a Conda environment, specify a prefix, which is the path where the environment will be installed. Choose a descriptive name for the environment. Conda will create the directory, so the directory should not already exist. For example, to create a Conda environment called env_ABC containing the packages AAA, BBB, and CCC, and install it in the directory /usr/local/usrapps/[groupname]/[username], do:
conda create --prefix /usr/local/usrapps/$GROUP/$USER/env_ABC AAA BBB CCC
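Because the prefix directory should not already exist, a small guard can catch a reused path before Conda is called. The wrapper below is an illustrative sketch; create_env_at is a hypothetical name, not part of Conda.

```shell
#!/bin/bash
# Hypothetical wrapper: refuse to reuse an existing path, then create the environment.
# Arguments: prefix path, followed by the package names to install.
create_env_at() {
    local prefix="$1"
    shift
    if [ -e "$prefix" ]; then
        echo "error: $prefix already exists; choose a new prefix" >&2
        return 1
    fi
    # Remaining arguments are package names, e.g. AAA BBB CCC.
    conda create --prefix "$prefix" "$@"
}

# Example (not run here):
# create_env_at /usr/local/usrapps/$GROUP/$USER/env_ABC AAA BBB CCC
```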
To activate the environment, do:
conda activate /usr/local/usrapps/$GROUP/$USER/env_ABC
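To confirm that activation took effect, for example at the top of a script, the active prefix can be compared with the expected one. This sketch relies on the CONDA_PREFIX environment variable, which conda activate sets to the path of the active environment. The function name env_is_active is hypothetical, and the comparison assumes the prefix is written without a trailing slash.

```shell
#!/bin/bash
# Hypothetical check: succeed only if the given prefix is the active environment.
# CONDA_PREFIX is empty or unset when no environment is active.
env_is_active() {
    [ "${CONDA_PREFIX:-}" = "$1" ]
}

# Example (not run here):
# env_is_active /usr/local/usrapps/$GROUP/$USER/env_ABC || echo "environment not active" >&2
```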
Once in a Conda environment, a user can install additional packages using either conda install or pip install (after doing conda install pip); however, this is not recommended, as it is harder when doing so to maintain an environment where all software is compatible.
Best practice is to create a YAML file with all of the desired Conda packages. Conda will 'solve' the environment, that is, it will find a configuration where all desired packages are the correct version numbers to work together, assuming such a configuration exists. If a user needs a version of one software that is not compatible with another, then they would create two different Conda environments. When updating, create a new environment and do not delete the old version without thoroughly testing the new one.
To create a Conda environment from a YAML file called ABC.yml, do
conda env create --prefix /usr/local/usrapps/$GROUP/$USER/env_ABC -f ABC.yml
The YAML file will contain a name, a list of Conda channels to look for the packages, and a list of all the desired packages.
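A file with that layout might look like the one written below. This is a sketch with placeholder content: AAA, BBB, and CCC stand in for real package names, and the channel list simply repeats the channels used earlier on this page. The keys name, channels, and dependencies are the standard fields of a Conda environment file.

```shell
#!/bin/bash
# Write a placeholder environment file named ABC.yml in the current directory.
# The quoted 'EOF' keeps the shell from expanding anything inside the file body.
cat > ABC.yml <<'EOF'
name: env_ABC
channels:
  - bioconda
  - conda-forge
dependencies:
  - AAA
  - BBB
  - CCC
EOF
```

Replace the placeholder entries under dependencies with the actual packages, optionally pinned to a version (for example a line of the form package=1.2), before running conda env create.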
When Conda creates an environment, it finds a configuration such that all of the packages/dependencies are compatible. If a great many packages are added to a YAML file, it might be impossible for Conda to resolve the necessary environment. In that case, multiple Conda environments will need to be created.
To deactivate the environment, do:
conda deactivate
Activating a Conda environment sets the compute environment, and is similar to loading a module. Here is a sample batch script that uses an application called mycode that was installed via a Conda environment:
#!/bin/tcsh
#BSUB -n 1
#BSUB -W 120
#BSUB -J mycode
#BSUB -o stdout.%J
#BSUB -e stderr.%J
conda activate /usr/local/usrapps/groupname/username/env_mycode
mycode
conda deactivate
conda list
Running conda install on an existing environment may break that environment, resulting in your scripts suddenly not working anymore.
conda env create --prefix /path/to/env_ABC -f ABC.yml
- notebook
Last modified: April 20 2023 13:18:23.