- Loading and initializing Conda
- Installing and activating a Conda environment
- Running a Conda installed application
- Warning: multithreading and MPI applications
- Tips and Troubleshooting
- What is a Conda environment anyway?
Conda
Conda is an open source package management system.
External Links
Conda website
Conda: User Guide
Contents
Loading and Initializing Conda
Before using Conda, the following two steps are required: Conda must be initialized, and a .condarc file must be created. The default shell on the Hazel cluster is bash (the tcsh shell is also available; type "tcsh" to switch).
If your home directory is at or over quota due to an improperly configured Conda, see this page for help.
1) Initialize the Conda environment: This step is necessary only once per HPC user, unless the initialization settings are removed.
To load the system-installed Conda, load the module and run conda init to add it to the path. Normally, using a login file to automatically set the environment is strongly discouraged, but in the case of Conda, many features cannot be used without this initialization file. Log out and then back in again after running conda init.
module load conda
conda init bash

[ignore the warnings, log out then back in again]

Optional - remove old environments:
For users who have already been using different Conda environments and would like to begin installing with the new recommended procedures, clean out the remnants of old Conda environments by doing the following.
Check for these 'dot' files:

cd ~
more .bashrc
more .tcshrc

If these files contain information for old Conda environments, edit the files and delete this section:
# >>> conda initialize >>>
(stuff)
# <<< conda initialize <<<
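To check quickly whether such a block remains, a short loop over both startup files can help. This is just a sketch; it prints any matching lines and always ends with a confirmation message:

```shell
# Look for leftover "conda initialize" blocks in shell startup files.
# Prints matching lines with line numbers, if any are found.
for f in ~/.bashrc ~/.tcshrc; do
  [ -f "$f" ] && grep -n 'conda initialize' "$f" || true
done
echo "check complete"
```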
2) Create a .condarc file:
This step is mandatory: Conda will fill the quota of the home directory if pkgs_dirs is not set in this file.
By default, Conda stores packages in the /home directory. The /home directory is too small for that, and the packages are only needed temporarily; they should not take up space in permanent directories. To change the default location to /share, use a text editor to create a file called .condarc. The path to that file should be /home/$USER/.condarc, and it should contain the path to the alternative location, e.g.:
pkgs_dirs:
  - /share/$GROUP/$USER/conda/pkgs
In addition, many packages require adding a 'channel'. Common channels may be added before creating environments by editing the .condarc.
- To add channels, add them to ~/.condarc. For example, the bioconda and conda-forge channels may be added by adding the following lines to ~/.condarc:
channels:
  - bioconda
  - conda-forge
The following displays a sample .condarc file:
[unityid@login01 ~]$ cd
[unityid@login01 ~]$ more .condarc
pkgs_dirs:
  - /share/group_name/unityid/conda/pkgs
channels:
  - bioconda
  - conda-forge
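A file like the sample above can also be generated from the command line with a heredoc. This is only a sketch: "group_name" and "unityid" are placeholders you must replace, and on Hazel the file belongs at ~/.condarc (here it is written to the current directory just to show the contents):

```shell
# Sketch: write a .condarc like the sample above.
# "group_name" and "unityid" are placeholders - substitute your own group
# and Unity ID. On Hazel, place the file at ~/.condarc.
cat > .condarc <<'EOF'
pkgs_dirs:
  - /share/group_name/unityid/conda/pkgs
channels:
  - bioconda
  - conda-forge
EOF
cat .condarc
```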
Installing and activating a Conda environment
Before installing any software, including a Conda environment, request a space for user maintained software to be used by all members of a Project. The path for that space is generally /usr/local/usrapps/groupname.
If Conda is failing to install your packages, delete the cache folder at /share/$GROUP/$USER/conda/pkgs. Double-check that you have not manually put any important files in this directory before deleting it.
The following is the general idea of how to use Conda. Please use a YAML file to avoid package conflicts. See below for details.
To install a Conda environment, specify a prefix: the path where the environment will be installed. Choose a descriptive name for the environment; Conda will create the directory, which should not already exist. For example, to create a Conda environment called env_ABC containing the packages AAA, BBB, and CCC, and install it in the directory /usr/local/usrapps/[groupname]/[username], do:
conda create --prefix /usr/local/usrapps/$GROUP/$USER/env_ABC AAA BBB CCC
To activate the environment, do:
conda activate /usr/local/usrapps/$GROUP/$USER/env_ABC
Once in a Conda environment, a user can install additional packages using either conda install or pip install (after doing conda install pip); however, this is not recommended, as it makes it harder to maintain an environment where all software is compatible.
Best practice is to create a YAML file with all of the desired Conda packages. Conda will 'solve' the environment, that is, it will find a configuration where all desired packages are the correct version numbers to work together, assuming such a configuration exists. If a user needs a version of one software that is not compatible with another, then they would create two different Conda environments. When updating, create a new environment and do not delete the old version without thoroughly testing the new one.
To create a Conda environment from a YAML file called ABC.yml, do
conda env create --prefix /usr/local/usrapps/$GROUP/$USER/env_ABC -f ABC.yml

The YAML file will contain a name, a list of Conda channels in which to look for the packages, and a list of all the desired packages.
- Here are some sample YAML files:
- datascience.yml - Contains many common data science programs
- sklearn.yml - Machine learning with scikit-learn
- biotools.yml - Contains applications for a bioinformatics workflow
- ncdfutil.yml - Used in sponsored software group ncdfutil, contains many NetCDF Utilities
- rlibs.yml - Used to create an environment with custom R libraries
When Conda creates an environment, it finds a configuration such that all of the packages/dependencies are compatible. If a great many packages are added to a YAML file, it might be impossible for Conda to resolve the necessary environment. In that case, multiple Conda environments will need to be created.
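As an illustration of the structure described above, a minimal environment file might look like the following. The package names and versions here are placeholders for illustration, not recommendations:

```yaml
# ABC.yml - illustrative environment file (packages are examples only)
name: env_ABC
channels:
  - bioconda
  - conda-forge
dependencies:
  - python=3.10
  - numpy
  - pandas
```

Note that when creating with --prefix, the environment is referred to by its path rather than by the name field.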
To deactivate the environment, do:

conda deactivate
Running a Conda installed application
Activating a Conda environment sets the compute environment, and is similar to loading a module. Here is a sample batch script that uses an application called mycode that was installed via a Conda environment:
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=02:00:00
#SBATCH --job-name=mycode
#SBATCH --output=stdout.%j
#SBATCH --error=stderr.%j
source ~/.bashrc
conda activate /usr/local/usrapps/groupname/username/env_mycode
mycode
conda deactivate

NOTE: The "source ~/.bashrc" in the bash job script is necessary. You may run into issues if it does not appear before the "conda activate" command.
Warning: multithreading and MPI applications
Tips and Troubleshooting
- To see the packages installed in an environment, use conda list.
- conda install on an existing environment may break that environment, resulting in your scripts suddenly not working anymore. It is safer to rebuild the environment from its YAML file: conda env create --prefix /path/to/env_ABC -f ABC.yml.
- To add Jupyter Notebook to a Conda environment, add 'notebook' to the dependencies in the YAML file:
- notebook
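Putting this together, a hypothetical environment file that includes Jupyter Notebook might look like this (the name and other packages are placeholders):

```yaml
# Illustrative environment file with Jupyter Notebook added
name: jupyter_env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - notebook
```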
- (1) Log in to Hazel.
- (2) Request an interactive node: As you may recall from earlier, a login node CANNOT BE USED TO RUN CODE. This means that we need to request (and log in to) a compute node. You can do this by typing:
salloc --ntasks=4 --nodes=1 --time=08:00:00
This command requests an interactive job (salloc) with four cores (--ntasks=4) on a single host (--nodes=1) and a time limit of 8 hours (--time=08:00:00), with a bash terminal. You can increase the number of cores, reduce the time limit, or even specify a queue.
In response to the request, the terminal will log you in to a compute node:
Job <387980> is submitted to default queue single_chassis.
Waiting for dispatch ...
Starting on nxxx
(base) [username@nxxx your-directory]$

NOTE THE NAME OF THE NODE. In the above example it is nxxx. This is the remote host where the Jupyter Notebook instance will run.
- (3) Activate your Conda environment that has Jupyter Notebook installed. For example:
conda activate /share/$GROUP/$USER/group2_env
- (4) Launch your notebook by typing the following:
jupyter notebook --no-browser --port=1113 --ip=0.0.0.0

Choose a random value for the port (in this case, we chose 1113, but it can be any four digits). However, remember what port you chose, as this will be our "host port" later on.
The terminal will pause for a minute, then show a series of lines of text, ending with the link and "token" to open the app
- (5) Copy the URL provided at the end of the output (for example,
http://nxxx:1113/?token=ae7f7ea974d4d80f94c247e0a4bda5c3acb5408583e33a38).
Don't paste it anywhere yet, we first need to set up tunneling to our local machine for this to work.
- (6) Open a new local terminal in WSL, PowerShell, or the macOS Terminal. In this new terminal, type the following:
ssh -N -L LOCALPORT:HOSTNAME:HOSTPORT UNITYID@login.hpc.ncsu.edu

LOCALPORT can be any four-digit number; just remember what you picked. For this example, I will choose 9999.
HOSTNAME is the name of the node that Jupyter is running on. In my case, this is nxxx.
HOSTPORT is the port that Jupyter is running on the host node, which I specified when I ran the jupyter notebook code (in my case, 1113).
UNITYID is the same ID I use to log in to Hazel.
So for the above example, I would type:
ssh -N -L 9999:nxxx:1113 username@login.hpc.ncsu.edu
- (7) You will be prompted for your password again. Once you authenticate, this second terminal window will become unresponsive. This is fine; it means that tunneling is now set up, and you're ready to open Jupyter.
- (8) Take the link from the first window (for example, http://nxxx:1113/?token=ae7f7ea974d4d80f94c247e0a4bda5c3acb5408583e33a38), and replace the part between http:// and /? with "localhost" and the port number that you chose when you set up tunneling (LOCALPORT at point 6 above). In our case it would be:
http://localhost:9999/?token=ae7f7ea974d4d80f94c247e0a4bda5c3acb5408583e33a38

Paste the link into a browser such as Google Chrome or Safari, and it will open the Jupyter notebook.
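The URL rewrite in step (8) can also be done with a one-line sed substitution. This is just a convenience sketch, using the example hostname, ports, and token from above:

```shell
# Replace "nxxx:1113" (compute node and host port) with "localhost:9999"
# (the LOCALPORT chosen for the tunnel). The hostname, ports, and token
# are the example values from the text.
URL="http://nxxx:1113/?token=ae7f7ea974d4d80f94c247e0a4bda5c3acb5408583e33a38"
echo "$URL" | sed -E 's#^http://[^/]+#http://localhost:9999#'
# → http://localhost:9999/?token=ae7f7ea974d4d80f94c247e0a4bda5c3acb5408583e33a38
```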
Here you can create a new jupyter notebook file and run python code.
- (9) When you're done and want to exit, you can shut down jupyter notebook by clicking on "Quit" on the homepage, or going back to the terminal window where jupyter notebook is running and hitting Ctrl+C (and then "yes").
You can also shut down the compute node by typing exit in the terminal.
The second terminal window (the one we set up tunneling through using ssh -N -L …) can also be closed by hitting Ctrl+C.
NOTE: if you want to verify that Jupyter Notebook is running on a compute node, you can type in a chunk:
import platform
platform.node()

This should return the name of the compute node that we logged into with salloc.
What is a Conda environment anyway?
Last modified: March 14 2026 09:24:12.