Using Alpa on Slurm
This page provides instructions for running Alpa on clusters managed by Slurm. This guide is adapted from Deploy Ray on Slurm.
Prerequisites
Alpa requires CUDA and runs on GPUs. Therefore, you need access to CUDA-compatible GPUs in the cluster to run Alpa.
If you have a virtual environment with required dependencies for Alpa installed, please jump to the section Create sbatch Script.
If you don’t have one already, please follow the steps below to create a Python virtual environment for Alpa.
Create Environment for Alpa
Here we give a sample workflow to create an environment for Alpa and verify it.
We assume that a Python environment management system, e.g. conda, is available on the Slurm cluster.
The following steps make sure you set up the environment needed by Alpa correctly. To illustrate the process we use conda, but the workflow also applies to other Python virtual environment management systems.
Enter Interactive Mode
The first step is to start an interactive session on Slurm so that you can run commands and check results in real time.
interact -p GPU-shared -N 1 --gres=gpu:v100-16:1 -t 10:00
Note
This command starts an interactive job on the GPU-shared partition (-p GPU-shared), on one GPU node (-N 1), with one V100-16GB GPU assigned (--gres=gpu:v100-16:1), for a time limit of 10 minutes (-t 10:00).
The minimal way to start an interactive job is to run interact directly, which asks the cluster to assign default resources. Since we want to set up and test the environment, we need a GPU, so we specify all these options explicitly.
The name of the partition, the option for requesting GPUs (some clusters use --gpus= instead of --gres=), and the GPU name depend on the cluster you work on. Please use the values specified by your cluster.
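For example, on a cluster that uses --gpus= instead of --gres=, the equivalent request might look like the following (the interact command and GPU name are cluster-specific, so treat this as a sketch):
interact -p GPU-shared -N 1 --gpus=v100-16:1 -t 10:00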
Then the cluster will try to assign the resources you asked for.
Once the resources are granted, the interact command returns and you are in an interactive session. Your command line will show that you are on a node of the cluster, like user@v001:~/$, where v001 is the name of the node you were assigned.
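Before setting anything up, you can optionally confirm that the assigned GPU is visible on the node:
nvidia-smi
This should list one V100 GPU if the request above was granted.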
Load Required Software
In some clusters, containers of common tools are packed and can be loaded through module load <target>.
For example, to run Alpa, we need CUDA, so we run the following to get CUDA:
module load cuda
Note
Some clusters provide multiple versions of CUDA for you to use; you can list all versions using:
module avail cuda
Then you can specify the version to use. For example, Alpa requires CUDA >= 11.1:
module load cuda/11.1
Please ask the cluster administrators to install a specific version of CUDA or other software if it is not available.
Similarly, we can load CuDNN and NVCC:
module load cudnn
module load nvhpc
Note
Here nvhpc is loaded because it contains nvcc on this cluster. nvcc might need to be loaded differently on other clusters; please check with your cluster on how to load nvcc.
A common issue with CuDNN on clusters is that the version provided by the cluster is older than the version required by Alpa (>= 8.0.5). To solve this, you can install a compatible version of CuDNN inside the Python virtual environment, just like the other dependencies covered in Install and Check Dependencies.
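For example, once the virtual environment from the next section is active, one way to get a newer CuDNN is a conda install (a sketch assuming the conda-forge channel is reachable from your cluster; pick a version matching your loaded CUDA):
# Install a CuDNN build satisfying Alpa's >= 8.0.5 requirement into the active environment
conda install -c conda-forge cudnn=8.2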
Install and Check Dependencies
Before you install dependencies of Alpa, create a virtual environment with the version of Python you use:
conda create -n alpa_environment python=3.9
Note
Please make sure the Python version meets Alpa's requirement of >= 3.7.
Then enter this Python virtual environment:
conda activate alpa_environment
Then your command line should show as (alpa_environment)user@v001:~/$.
Note
Verify that the environment is active by checking the Python version:
python3 --version
Then please follow the installation guide of Alpa in Install Alpa.
All the commands mentioned in the installation guide work here and will verify that the alpa_environment environment works with Alpa.
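As a rough sketch of what that pip-based installation looks like (the exact package versions and wheel index must be taken from Install Alpa; the version tag below is an illustrative assumption):
# Install Alpa itself
pip3 install alpa
# Install a CUDA-enabled jaxlib matching the loaded CUDA/CuDNN versions;
# copy the exact version and index URL from the Install Alpa guide
pip3 install jaxlib==0.3.22+cuda111.cudnn805 -f https://alpa-projects.github.io/wheels.html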
Exit Virtual Environment
Once you have finished installation and testing, exit the environment:
conda deactivate
Next time you want to activate this environment, use the following command:
conda activate alpa_environment
Exit Interactive Session
To exit the interactive session, press Ctrl+D.
Create sbatch Script
Large jobs like Alpa are usually run on Slurm through sbatch using an sbatch script. sbatch scripts are bash scripts with sbatch options specified using the syntax #SBATCH <options>.
The Slurm cluster takes sbatch scripts submitted with the sbatch command and queues the job specified by the script for execution.
When Slurm executes the script, it behaves exactly like a shell script.
Note
The shell script commands are run on each of the nodes assigned to your job. To run a command on a single node, use srun's option --nodes=1. Available options for srun can be found in SRUN. srun runs a job in real time, while sbatch submits a job for later execution without blocking. srun is also compatible with the script we create.
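To make the #SBATCH syntax concrete before assembling the full Alpa script, here is a minimal, self-contained skeleton (the job name, resources, and commands are placeholders, not Alpa requirements):
#!/bin/bash
#SBATCH --job-name=hello_slurm
#SBATCH --nodes=1
#SBATCH --time=00:01:00

# Everything below the #SBATCH header runs as an ordinary shell script
echo "Hello from $(hostname)"
srun --nodes=1 echo "This line runs via srun on one node"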
An sbatch script to run Alpa can be roughly summarized in four parts: resource setup, dependency loading, Ray startup, and running Alpa.
The first step is to create an sbatch script in your directory, usually saved as a .sh file.
This guide assumes the script is in a file named run_alpa_on_slurm.sh.
Just like a shell script, the sbatch script starts with a line specifying the path to the interpreter:
#!/bin/bash
Resources Setup
The first lines in your sbatch script specify resources for the job, just like the options you pass when running srun or sbatch. These usually include the name of the job, the partition the job should go to, CPUs per task, memory per CPU, number of nodes, number of tasks per node, and the time limit for the job.
#SBATCH --job-name=alpa_multinode_test
#SBATCH --partition=GPU
#SBATCH --nodes=2
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1GB
#SBATCH --gpus-per-node=v100-16:8
#SBATCH --time=00:10:00
Note
Setting the resources in the sbatch script is equivalent to setting them when submitting the job to Slurm with sbatch <options> <sbatch script>.
Here, --tasks-per-node=1 --cpus-per-task=1 are specified to allow Ray (which uses CPUs) to run on the nodes.
The option --gpus-per-node=v100-16:8 is specified with the GPU type and count. Please refer to your cluster on how to set this field.
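For instance, the equivalence mentioned above means the following submission behaves the same as putting the corresponding #SBATCH lines in the script (only a subset of the options is shown, for illustration):
sbatch --job-name=alpa_multinode_test --partition=GPU --nodes=2 run_alpa_on_slurm.sh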
Load Dependencies
The next step is to set up the environment with Alpa's dependencies installed.
In some Slurm clusters, CUDA, NVCC, and CuDNN are packed in containers that can be loaded directly. Here, we provide an example that combines loading available containers with activating a user-defined environment from a package management system.
To load directly from available containers, use module load <module>:
module load cuda
module load cudnn
module load nvhpc
Note
To check available software, run:
module avail
module avail cuda
If the required software is not available, please ask the cluster administrators to install it.
When multiple versions are available, you can specify the version to use:
module load cuda/11.1.1
module load cudnn/8.0.5
You can check that the needed module is loaded with:
module list cuda
The way to load pre-installed software can differ between clusters; please follow your cluster's conventions.
To activate an environment using package management systems like conda, add the following line:
conda activate alpa_environment
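On some clusters, conda is not initialized in non-interactive batch shells, so conda activate fails inside the script. A common workaround (assuming a standard conda installation) is to source conda's shell hook first:
# Initialize conda for this non-interactive shell, then activate
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate alpa_environment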
In summary, this step adds a chunk like the following to the script:
# load containers
module purge # optional
module load cuda
module load cudnn
module load nvhpc
# activate conda environment
conda activate alpa_environment
After this step, all the dependencies, including the packages and software needed by Alpa, are loaded and ready to use.
If you are running within a single node of the cluster, you can jump to the Single Node Script.
Ray Startup
Then it is time to start Ray. The first step is to get the nodes assigned to this job and designate one node as the head node in the topology of the Ray cluster:
# Get names of nodes assigned
nodes=$(scontrol show hostnames "$SLURM_JOB_NODELIST")
nodes_array=($nodes)
# By default, set the first node to be head_node on which we run HEAD of Ray
head_node=${nodes_array[0]}
head_node_ip=$(srun --nodes=1 --ntasks=1 -w "$head_node" hostname --ip-address)
Note
If the cluster uses IPv6 addresses, you can use the following snippet after getting the head node IP to extract the IPv4 address:
# Convert ipv6 address to ipv4 address
if [[ "$head_node_ip" == *" "* ]]; then
    IFS=' ' read -ra ADDR <<<"$head_node_ip"
    if [[ ${#ADDR[0]} -gt 16 ]]; then
        head_node_ip=${ADDR[1]}
    else
        head_node_ip=${ADDR[0]}
    fi
    echo "Found IPV6 address, split the IPV4 address as $head_node_ip"
fi
Once we have a head node, we start the Ray HEAD process on it:
# Setup port and variables needed
gpus_per_node=8
port=6789
ip_head=$head_node_ip:$port
export ip_head
# Start HEAD in background of head node
srun --nodes=1 --ntasks=1 -w "$head_node" \
    ray start --head --node-ip-address="$head_node_ip" --port=$port \
    --num-cpus "${SLURM_CPUS_PER_TASK}" --num-gpus $gpus_per_node --block &
Note
Note that the value of gpus_per_node should not exceed the number of GPUs available on each node.
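Instead of hardcoding gpus_per_node, you may be able to derive it from Slurm's environment. A sketch, assuming your Slurm version exports SLURM_GPUS_PER_NODE in the <type>:<count> form used above:
# Strip the optional "type:" prefix, e.g. "v100-16:8" -> "8"
gpus_per_node=${SLURM_GPUS_PER_NODE##*:}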
Then we start Ray workers on the remaining nodes and connect them to the HEAD:
# Start worker nodes
# Number of nodes other than the head node
worker_num=$((SLURM_JOB_NUM_NODES - 1))
# Iterate on each node other than head node, start ray worker and connect to HEAD
for ((i = 1; i <= worker_num; i++)); do
    node_i=${nodes_array[$i]}
    echo "Starting WORKER $i at $node_i"
    srun --nodes=1 --ntasks=1 -w "$node_i" \
        ray start --address "$ip_head" --num-cpus "${SLURM_CPUS_PER_TASK}" \
        --num-gpus $gpus_per_node --block &
    sleep 5
done
[Optional] Check Ray is Running
You can check whether Ray has started and all nodes have connected by adding this line:
ray list nodes
In the output of the job, you are expected to see the same number of nodes you asked for listed by this command.
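On Ray versions where ray list nodes is not available, ray status prints similar information about connected nodes and total CPU/GPU resources:
ray status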
At this point, we have a running Ray cluster, and next we can run Alpa on it.
Run Alpa
Just like running Alpa locally, the previous steps are equivalent to having started Ray with ray start --head.
Then it is time to run Alpa:
python3 -m alpa.test_install
Submit Job
To submit the job, run the following command:
sbatch run_alpa_on_slurm.sh
Note
After you submit the job, Slurm will report the job number. You can check your submitted job's status using the squeue command. To list all your jobs, run:
squeue -u <your_user_name>
To check all jobs running and queued in a partition, run:
squeue -p <partition_name>
When you no longer see a job in the list, the job has finished.
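To check one specific job by the number Slurm printed at submission, run:
squeue -j <job_number>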
Note
Another option is to run with srun. This gives an interactive session, meaning the output shows up in your terminal, while sbatch collects the output into a file. Running with srun works the same way as sbatch:
srun run_alpa_on_slurm.sh
Check Output
After a Slurm job finishes, the output appears in your directory as a file (if you submitted the job through sbatch).
On some Slurm clusters, the output file is named slurm-<job_number>.out.
You can read the output file the same way you read any text file.
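For example, to follow the output while the job runs, or to read it afterwards:
tail -f slurm-<job_number>.out
cat slurm-<job_number>.out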
Sample sbatch Scripts
Multi-node Script
Putting things together, a sample sbatch script that runs Alpa is as follows:
#!/bin/bash
#SBATCH --job-name=alpa_multinode_test
#SBATCH --partition=GPU
#SBATCH --nodes=2
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1GB
#SBATCH --gpus-per-node=v100-16:8
#SBATCH --time=00:10:00

# load containers
module purge
module load cuda
module load cudnn
module load nvhpc

# activate conda environment
conda activate alpa_environment

# Get names of nodes assigned
nodes=$(scontrol show hostnames "$SLURM_JOB_NODELIST")
nodes_array=($nodes)

# By default, set the first node to be head_node on which we run HEAD of Ray
head_node=${nodes_array[0]}
head_node_ip=$(srun --nodes=1 --ntasks=1 -w "$head_node" hostname --ip-address)

# Setup port and variables needed
gpus_per_node=8
port=6789
ip_head=$head_node_ip:$port
export ip_head

# Start HEAD in background of head node
srun --nodes=1 --ntasks=1 -w "$head_node" \
    ray start --head --node-ip-address="$head_node_ip" --port=$port \
    --num-cpus "${SLURM_CPUS_PER_TASK}" --num-gpus $gpus_per_node --block &

# Optional, sometimes needed for old Ray versions
sleep 10

# Start worker nodes
# Number of nodes other than the head node
worker_num=$((SLURM_JOB_NUM_NODES - 1))

# Iterate on each node other than head node, start ray worker and connect to HEAD
for ((i = 1; i <= worker_num; i++)); do
    node_i=${nodes_array[$i]}
    echo "Starting WORKER $i at $node_i"
    srun --nodes=1 --ntasks=1 -w "$node_i" \
        ray start --address "$ip_head" --num-cpus "${SLURM_CPUS_PER_TASK}" \
        --num-gpus $gpus_per_node --block &
    sleep 5
done

# Run Alpa test
python3 -m alpa.test_install

# Optional. Slurm will terminate all processes automatically
ray stop
conda deactivate
exit
Single Node Script
To run Alpa on Slurm with only one node or a shared node, the following script can be used:
#!/bin/bash
#SBATCH --job-name=alpa_uninode_test
#SBATCH -p GPU-shared
#SBATCH -N 1
#SBATCH --gpus=v100-16:1
#SBATCH -t 0:10:00

# load containers
module purge
module load cuda
module load cudnn
module load nvhpc

# activate conda environment
conda activate alpa_environment

# Start Ray on HEAD
ray start --head

# Run Alpa test
python3 -m alpa.test_install

# Optional. Slurm will terminate all processes automatically
ray stop
conda deactivate
exit