Cylc
What is Cylc¶
Cylc is a general purpose workflow engine that can also orchestrate cycling systems very efficiently. It is used in production weather, climate, and environmental forecasting on HPC, but is not specialised to those domains.
Using a workflow engine may enable you to run large parametric or sensitivity studies while ensuring scalability, reproducibility and flexibility. If you have embarrassingly parallel jobs then Cylc might be a good solution for you. The workflow engine will allow for the concurrent execution of parallel jobs, depending on the task graph and available resources on the platform. One advantage of this approach over running a monolithic, parallel executable is that each task requires fewer resources than the complete problem, so it is easier for each task to slip into the queue and start running.
See the NeSI Snakemake page for another possible choice.
In this article, we show how you can create a simple workflow and run it on NeSI's platform. Consult the Cylc documentation for more elaborate examples, including some with a cycling (repeated) graph pattern. One of the strengths of Cylc is that simple workflows remain simple to write and run, while very complex workflows, with thousands of tasks that may repeat ad infinitum, are also supported.
SSH configuration¶
Cylc uses ssh to automatically start schedulers on configured "run hosts" and to submit jobs to remote platforms. If you haven't already done so, you need to allow ssh to connect to other hosts on the HPC cluster without prompting for a passphrase. All HPC nodes see the same filesystem, so this is easy:
- run ssh-keygen to generate a public/private key pair with no passphrase (when it asks for a passphrase, just hit enter)
- add your own public key to your authorized keys file:
  cat .ssh/id_rsa.pub >> .ssh/authorized_keys
- check that your keys, authorized_keys file, ssh directory, and home directory all have sufficiently secure file permissions. If not, ssh will silently revert to requiring password entry. See for example https://www.frankindev.com/2020/11/26/permissions-for-.ssh-folder-and-key-files/
- make sure your home directory has a maximum of 750 permissions
Now you should be able to run ssh login02 (for example) without being asked for a passphrase.
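The permission checks listed above can be sketched as a small shell function. This is a minimal sketch, not an official NeSI tool: the function name fix_ssh_permissions is hypothetical, the key file names assume the default id_rsa pair produced by ssh-keygen, and the modes (700 for the .ssh directory, 600 for the private key and authorized_keys, 644 for the public key) are the commonly recommended values rather than something stated in this article.

```shell
# Tighten permissions on an account's SSH files (hypothetical helper).
# Usage: fix_ssh_permissions [HOME_DIR]   (defaults to $HOME)
fix_ssh_permissions() {
    fsp_home="${1:-$HOME}"
    fsp_ssh="$fsp_home/.ssh"

    # Home directory: at most 750 (no group write, no access for others).
    chmod g-w,o-rwx "$fsp_home"

    # The .ssh directory is accessible by the owner only.
    chmod 700 "$fsp_ssh"

    # Private key and authorized_keys: owner read/write only.
    for fsp_file in "$fsp_ssh/id_rsa" "$fsp_ssh/authorized_keys"; do
        if [ -f "$fsp_file" ]; then
            chmod 600 "$fsp_file"
        fi
    done

    # The public key may stay readable by others.
    if [ -f "$fsp_ssh/id_rsa.pub" ]; then
        chmod 644 "$fsp_ssh/id_rsa.pub"
    fi
    return 0
}
```

After defining the function, call it as fix_ssh_permissions "$HOME" and then retry ssh login02.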
How to install Cylc¶
Create a new conda environment and install Cylc with the commands:
module purge
module load Miniforge3
export CYLC_HOME=/nesi/project/nesi99999/$USER/environment/cylc-env
conda create --prefix $CYLC_HOME python=3.12
CYLC_HOME points to the installation path of your choice. Adjust nesi99999 to fit your project number. It is not recommended to install in your home directory.
Then type
conda init
Now activate the new environment
module purge && module load Miniforge3
conda activate /nesi/project/nesi99999/$USER/environment/cylc-env # or wherever you installed cylc-env
conda install -c conda-forge cylc-flow
Check that cylc was successfully installed:
cylc --version
8.6.0
Create a cylc wrapper script¶
In order to allow cylc to be invoked through SLURM, the following steps are required:
mkdir $CYLC_HOME/wrapper
cylc get-resources cylc $CYLC_HOME/wrapper
chmod +x $CYLC_HOME/wrapper/cylc
cp $CYLC_HOME/wrapper/cylc $CYLC_HOME/wrapper/cylc.bak
sed -i "s|CYLC_HOME_ROOT=\"\${CYLC_HOME_ROOT:-/opt}\"|CYLC_HOME_ROOT=\"\${CYLC_HOME_ROOT:-${CYLC_HOME}}\"|" $CYLC_HOME/wrapper/cylc
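The sed command above rewrites the default value of CYLC_HOME_ROOT inside the wrapper. Since a substitution that does not match leaves the file unchanged without any message, it can be worth confirming that the edit took effect. The helper below is a minimal sketch; the name wrapper_root_is is hypothetical, and it assumes the wrapper contains the assignment in exactly the form targeted by the sed expression.

```shell
# Return 0 if the wrapper file $1 defaults CYLC_HOME_ROOT to directory $2.
# (hypothetical helper, assumes the assignment format used in the sed command)
wrapper_root_is() {
    # -q: no output, only an exit status; -F: match a fixed string, not a regex
    grep -qF "CYLC_HOME_ROOT=\"\${CYLC_HOME_ROOT:-$2}\"" "$1"
}
```

For example, wrapper_root_is "$CYLC_HOME/wrapper/cylc" "$CYLC_HOME" && echo "wrapper patched" prints the message only if the new default is in place. Running diff on cylc.bak and cylc in the wrapper directory shows the changed line.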
Make sure the wrapper script is found when opening a new shell, for example by adding the following lines to your ~/.bashrc:
export CYLC_HOME=/nesi/project/nesi99999/$USER/environment/cylc-env # adapt
export PATH=$CYLC_HOME/wrapper:$PATH
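One possible way to make these two settings persistent is to append them to a shell startup file exactly once. The sketch below assumes bash is your login shell and that ~/.bashrc is read when a new shell starts; the function name persist_cylc_path and the marker comment are hypothetical.

```shell
# Append the Cylc wrapper settings to a shell startup file, only once.
# $1: Cylc environment prefix (CYLC_HOME)
# $2: startup file to edit (defaults to ~/.bashrc)
persist_cylc_path() {
    pcp_home="$1"
    pcp_rc="${2:-$HOME/.bashrc}"
    pcp_marker="# cylc wrapper (added by persist_cylc_path)"

    # Skip if a previous call already added the settings.
    if [ -f "$pcp_rc" ] && grep -qF "$pcp_marker" "$pcp_rc"; then
        return 0
    fi

    {
        echo "$pcp_marker"
        echo "export CYLC_HOME=$pcp_home"
        # Single quotes: the variables are expanded when the shell starts,
        # not when this line is written.
        echo 'export PATH=$CYLC_HOME/wrapper:$PATH'
    } >> "$pcp_rc"
}
```

A typical call would be persist_cylc_path /nesi/project/nesi99999/$USER/environment/cylc-env, with the path adapted to your installation. Calling it a second time leaves the file unchanged.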
Setting up Cylc on Mahuika¶
In order to allow Cylc to submit tasks to the SLURM scheduler, we need to configure platforms. Edit (or create) your global configuration file:
mkdir -p ~/.cylc/flow/
vim ~/.cylc/flow/global.cylc
Add the following platform definition:
[platforms]
[[mahuika-slurm]]
hosts = login01, login02, login03
install target = localhost
job runner = slurm
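A task selects this platform through its platform setting in the workflow's flow.cylc file. As an illustration, the sketch below writes a minimal one-task workflow to a directory of your choice. The function name write_hello_workflow, the task name hello, and the directive values are assumptions made for this example and are not taken from the example repository used later in this article; nesi99999 is a placeholder for your own project code.

```shell
# Write a minimal one-task Cylc workflow into directory $1 (hypothetical helper).
write_hello_workflow() {
    mkdir -p "$1" || return 1
    # Quoted 'EOF': the shell writes the text verbatim, so $(hostname) is
    # evaluated later by the job, not while the file is being created.
    cat > "$1/flow.cylc" <<'EOF'
[scheduling]
    [[graph]]
        # Run the task once, with no cycling.
        R1 = hello
[runtime]
    [[hello]]
        # Platform name as defined in global.cylc.
        platform = mahuika-slurm
        script = echo "Hello from $(hostname)"
        # ISO 8601 duration: one minute.
        execution time limit = PT1M
        [[[directives]]]
            # Placeholder account, use your own project code.
            --account = nesi99999
            --mem = 512M
EOF
}
```

For instance, write_hello_workflow ~/cylc-src/hello creates the file, after which running cylc vip . inside that directory would validate, install and run the workflow.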
Test cylc¶
Clone the example repository, remove any previous installation of the slurm workflow, then validate, install and run it:
git clone https://github.com/pletzer/cylc_patterns
cd cylc_patterns/cylc-src/slurm/
cylc clean slurm
cylc vip .
This will submit a job to SLURM, which you can check with:
squeue --me
JOBID    USER      ACCOUNT    NAME          CPUS  MIN_MEM  PARTITI  START_TIME  TIME_LEFT  STATE    NODELIST(REASON)
2974409  pletzera  nesi99999  a.1.slurm/ru  1     512M     milan,g  N/A         1:00       PENDING  (Priority)
You can monitor the workflow with the terminal user interface:
cylc tui slurm