Install Alpa
Requirements
CUDA >= 11.1
cuDNN >= 8.1
Python >= 3.7
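The version floors above can be checked with a few lines of Python. This is a minimal sketch, not part of Alpa: `parse_version` and `meets_minimum` are hypothetical helper names, and the CUDA and cuDNN version strings in the example are placeholders to replace with whatever your system reports.

```python
import sys

# Minimum versions taken from the requirements list above.
MINIMUMS = {"CUDA": (11, 1), "cuDNN": (8, 1), "Python": (3, 7)}


def parse_version(text):
    """Turn a dotted version string such as '11.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(name, version_text):
    """Return True if version_text satisfies the minimum listed for name."""
    return parse_version(version_text) >= MINIMUMS[name]


if __name__ == "__main__":
    python_version = "{}.{}".format(*sys.version_info[:2])
    print("Python", python_version, "ok:", meets_minimum("Python", python_version))
    # Placeholder values: substitute the versions installed on your machine.
    print("CUDA 11.4 ok:", meets_minimum("CUDA", "11.4"))
    print("cuDNN 8.0 ok:", meets_minimum("cuDNN", "8.0"))
```

Versions are compared as integer tuples rather than strings, so that a release such as 3.10 is not mistaken for something older than 3.7.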
Install from Source
Alpa depends on its own forks of JAX and TensorFlow. To install Alpa from source, you need to build these forks.
Clone repos
git clone git@github.com:alpa-projects/alpa.git
git clone git@github.com:alpa-projects/jax-alpa.git
git clone git@github.com:alpa-projects/tensorflow-alpa.git
Install dependencies
- Python packages:
pip3 install cmake tqdm pybind11 numba numpy scipy pulp ray flax==0.4.1 tensorstore
pip3 install cupy-cuda114  # use your own CUDA version; here cupy-cuda114 means CUDA 11.4
- NCCL:
First, check whether your system already has NCCL installed.
python3 -c "from cupy.cuda import nccl"
If it prints nothing, NCCL is already installed. Otherwise, follow the printed instructions to install NCCL.
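The same NCCL check can be run from a script instead of a one-liner. This is a sketch under assumptions: `nccl_status` is a hypothetical helper, and the `available` attribute it reads is assumed to be set by CuPy's `nccl` module in releases that warn rather than raise when NCCL is missing. Where the attribute is absent, a successful import is treated as success.

```python
import importlib


def nccl_status():
    """Return (ok, detail) describing whether CuPy can load NCCL."""
    try:
        nccl = importlib.import_module("cupy.cuda.nccl")
    except ImportError as exc:
        # CuPy itself, or its NCCL bindings, could not be imported.
        return False, str(exc)
    # Assumption: some CuPy releases expose an `available` flag instead of raising.
    if not getattr(nccl, "available", True):
        return False, "cupy imported, but NCCL reported as unavailable"
    return True, "NCCL bindings imported"


if __name__ == "__main__":
    ok, detail = nccl_status()
    print("NCCL ok:", ok, "-", detail)
```

Returning a status pair instead of letting the exception propagate makes the check easy to call from a larger setup script.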
Build and install jaxlib
cd jax-alpa
export TF_PATH=~/tensorflow-alpa  # update this with your path
python3 build/build.py --enable_cuda --dev_install --tf_path=$TF_PATH
cd dist
pip3 install -e .
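It can help to validate `TF_PATH` before starting the build. The sketch below is not part of the Alpa tooling: `jaxlib_build_command` is a hypothetical helper that expands `~`, checks that the directory exists, and assembles the same `build/build.py` invocation used in this step.

```python
import os


def jaxlib_build_command(tf_path):
    """Return the build command as an argument list, given a TensorFlow checkout."""
    resolved = os.path.abspath(os.path.expanduser(tf_path))
    if not os.path.isdir(resolved):
        raise FileNotFoundError("TF_PATH does not exist: " + resolved)
    return [
        "python3", "build/build.py",
        "--enable_cuda", "--dev_install",
        "--tf_path=" + resolved,
    ]


if __name__ == "__main__":
    try:
        # Run the printed command from inside the jax-alpa checkout.
        print(" ".join(jaxlib_build_command("~/tensorflow-alpa")))
    except FileNotFoundError as err:
        print(err)
```

The helper only builds the argument list; running it (for example with `subprocess.run`) is left to the caller so the path can be inspected first.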
Install jax
cd jax-alpa
pip3 install -e .
Install Alpa
cd alpa
pip3 install -e .
Build XLA pipeline marker custom call
cd alpa/alpa/pipeline_parallel/xla_custom_call_marker
bash build.sh
Note
All installations are in development mode, so changes to Python code take effect immediately. If you modify C++ code in TensorFlow, you only need to rerun the build command from the "Build and install jaxlib" step, from the jax-alpa directory, to recompile jaxlib:
python3 build/build.py --enable_cuda --dev_install --tf_path=$TF_PATH
Check Installation
You can check the installation by running the following test script.
cd alpa
ray start --head
python3 tests/test_install.py
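If the test script fails at import time, one quick diagnostic is to see where Python resolves `jax`, `jaxlib`, and `alpa`. With development-mode installs these would be expected to point into your source checkouts. This is a minimal sketch using only the standard library; `locate_packages` is a hypothetical helper, not an Alpa API.

```python
import importlib.util


def locate_packages(names):
    """Map each top-level package name to the file it would be imported from."""
    locations = {}
    for name in names:
        spec = importlib.util.find_spec(name)
        # find_spec returns None when the package cannot be found.
        locations[name] = spec.origin if spec is not None else None
    return locations


if __name__ == "__main__":
    for name, origin in locate_packages(["jax", "jaxlib", "alpa"]).items():
        print(name, "->", origin or "NOT FOUND")
```

`find_spec` locates a top-level package without importing it, so the check is safe to run even when one of the builds is broken.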