CANDIDE

Environment Status Notes

Introduction

 

The CANDIDE cluster is hosted and maintained at the Institut d’Astrophysique de Paris by Stephane Rouberol.

CANDIDE Account

To request an account on CANDIDE, send an email to Henry Joy McCracken and Stephane Rouberol at IAP with a short description of what you want to do and with whom you work.

SSH

Once you have an account on CANDIDE you can connect via SSH as follows:

$ ssh <mylogin>@candide.iap.fr


Modules

 

The CANDIDE system uses Environment Modules to manage various software packages. You can view the modules currently available on the system by running:

$ module avail

If you need to use conda, on CANDIDE it is provided via the intelpython/3 module. To load the newest version, simply run:

$ module load intelpython/3-2020.1

Installing AstrOmatic software (SExtractor, PSFEx) requires the BLAS library, which is made available by loading the Intel MKL module:

$ module load intel/19.0

You can add these commands to your .bash_profile to ensure that this module is available when you log in.
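For example, you could append the following lines to your ~/.bash_profile (the module versions shown are the ones mentioned above; check `module avail` for those currently installed):

```shell
# Load commonly used modules at login on CANDIDE
module load intelpython/3-2020.1   # provides conda via Intel Python
module load intel/19.0             # provides Intel MKL (BLAS) for AstrOmatic tools
```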

You can list the modules already loaded by running:

$ module list

MPI

To compile software with MPI enabled on CANDIDE, you also need to load the openmpi module. To do so, run:

$ module load openmpi

You can also specify a particular version of OpenMPI to use:

$ module load openmpi/<VERSION>

Then you need to identify the root directory of the OpenMPI installation. An easy way to get this information is to run:

$ module show openmpi

which should reveal something like /softs/openmpi/<VERSION>-torque-CentOS7.


Execution

 

CANDIDE uses TORQUE for handling distributed jobs.

TORQUE uses standard Portable Batch System (PBS) commands such as:

  • qsub - To submit jobs to the queue.
  • qstat - To check on the status of jobs in the queue.
  • qdel - To kill jobs in the queue.

Additionally, the availability of compute nodes can be seen using the command:

$ cnodes

Jobs should be submitted as bash scripts, e.g.:

$ qsub candide_smp.sh

In this script you can specify:

  • Your email to be notified when your job starts/stops (e.g. #PBS -M <name>@cea.fr)
  • The number of nodes to use (e.g. #PBS -l nodes=10)
  • A specific machine to use with a given number of cores (e.g. #PBS -l nodes=n04:ppn=10)
  • The maximum computing time for your script (e.g. #PBS -l walltime=10:00:00)

Note that the "#PBS" prefix is required to identify PBS directives inside the script.

Example Script

#!/bin/bash
##########################
# Script for CANDIDE     #
##########################
# Receive email when job finishes or aborts
#PBS -M <name>@cea.fr
#PBS -m ea
# Set a name for the job
#PBS -N <my_job_name>
# Join output and errors in one file
#PBS -j oe
# Set maximum computing time (e.g. 5min)
#PBS -l walltime=00:05:00

# Set full path to environment
export MYENV="$HOME/.conda/envs/<MY_CONDA_ENV>"

# Load modules and activate conda environment
module load intelpython/3
source activate $MYENV

# Run Python script
$MYENV/bin/python $HOME/<MY_PATH>/<MY_PYTHON_SCRIPT>.py

# Return exit code
exit 0

Example SMP Script

#!/bin/bash
##########################
# SMP Script for CANDIDE #
##########################
# Receive email when job finishes or aborts
#PBS -M <name>@cea.fr
#PBS -m ea
# Set a name for the job
#PBS -N <my_smp_job_name>
# Join output and errors in one file
#PBS -j oe
# Set maximum computing time (e.g. 5min)
#PBS -l walltime=00:05:00
# Request number of cores on a single machine (e.g. 4)
#PBS -l nodes=1:ppn=4

# Set full path to environment
export MYENV="$HOME/.conda/envs/<MY_CONDA_ENV>"

# Load modules and activate conda environment
module load intelpython/3
source activate $MYENV

# Run Python script with SMP
$MYENV/bin/python $HOME/<MY_PATH>/<MY_PYTHON_SCRIPT>.py

# Return exit code
exit 0

Example MPI Script

#!/bin/bash
##########################
# MPI Script for CANDIDE #
##########################
# Receive email when job finishes or aborts
#PBS -M <name>@cea.fr
#PBS -m ea
# Set a name for the job
#PBS -N <my_mpi_job_name>
# Join output and errors in one file
#PBS -j oe
# Set maximum computing time (e.g. 5min)
#PBS -l walltime=00:05:00
# Request number of cores (e.g. 4 from 2 different machines)
#PBS -l nodes=2:ppn=2
# Store the total number of allocated cores in NSLOTS
NSLOTS=$(wc -l < $PBS_NODEFILE)

# Set full path to environment
export MYENV="$HOME/.conda/envs/<MY_CONDA_ENV>"

# Load modules and activate conda environment
module load intelpython/3
module load openmpi/4.0.2
source activate $MYENV

# Run Python script with MPI
$MYENV/bin/mpiexec -n $NSLOTS $MYENV/bin/python $HOME/<MY_PATH>/<MY_PYTHON_SCRIPT>.py

# Return exit code
exit 0
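The NSLOTS line in the MPI script above simply counts the entries in the node file that TORQUE exposes via $PBS_NODEFILE, which contains one line per allocated core. A minimal, self-contained sketch of that logic, using a hypothetical node file (node names are illustrative):

```shell
# Simulate the node file TORQUE provides via $PBS_NODEFILE:
# one line per allocated core (here 2 nodes x 2 cores each).
PBS_NODEFILE=$(mktemp)
printf 'n03\nn03\nn04\nn04\n' > "$PBS_NODEFILE"

# Same counting logic as in the MPI script: the total core
# count to pass to mpiexec -n
NSLOTS=$(wc -l < "$PBS_NODEFILE")
echo "$NSLOTS"

rm -f "$PBS_NODEFILE"
```

With the 2:ppn=2 request above, the node file would hold four entries, so mpiexec would launch four processes.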


Troubleshooting