Appendix
NVIDIA GPU architectures and installation packages supported by PaddlePaddle
| GPU architecture | Compute capability | GPU hardware models | CUDA version of the PaddlePaddle installation package to download |
|---|---|---|---|
| Pascal | sm_60 | Quadro GP100, Tesla P100, DGX-1 | CUDA10, CUDA11 |
| Pascal | sm_61 | GTX 1080, GTX 1070, GTX 1060, GTX 1050, GTX 1030 (GP108), GT 1010 (GP108), Titan Xp, Tesla P40, Tesla P4 | CUDA10, CUDA11 |
| Volta | sm_70 | DGX-1 with Volta, Tesla V100, GTX 1180 (GV104), Titan V, Quadro GV100 | CUDA10, CUDA11 |
| Turing | sm_75 | GTX/RTX Turing – GTX 1660 Ti, RTX 2060, RTX 2070, RTX 2080, Titan RTX, Quadro RTX 4000, Quadro RTX 5000, Quadro RTX 6000, Quadro RTX 8000, Quadro T1000/T2000, Tesla T4 | CUDA10, CUDA11 |
| Ampere | sm_80 | NVIDIA A100, GA100, NVIDIA DGX-A100 | CUDA11.8, CUDA12.x (recommended) |
| Ampere | sm_86 | Tesla GA10x cards, RTX Ampere – RTX 3080, GA102 – RTX 3090, RTX A2000, A3000, RTX A4000, A5000, A6000, NVIDIA A40, GA106 – RTX 3060, GA104 – RTX 3070, GA107 – RTX 3050, RTX A10, RTX A16, RTX A40, A2 Tensor Core GPU | CUDA11.8, CUDA12.x (recommended) |
| Hopper | sm_90 | NVIDIA H100, H800 | CUDA12.6, CUDA12.9 (recommended), CUDA13.0 |
| Blackwell | sm_100 | NVIDIA B100, B200, GB200, NVIDIA DGX-B200 | CUDA12.9 (recommended), CUDA13.0 |
| Blackwell | sm_120 | NVIDIA RTX 5090, RTX 5080, RTX 5070 | CUDA12.9 (recommended), CUDA13.0 |
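The table above can also be expressed as a small lookup, which may be handy in setup scripts. The following is only a sketch: the names CUDA_BUILDS, normalize and cuda_builds_for are hypothetical helpers, not part of PaddlePaddle, and the data is transcribed from the table by hand.

```python
# Compute capability -> (available CUDA builds, recommended build or None),
# transcribed by hand from the table above. Hypothetical helper, not a Paddle API.
CUDA_BUILDS = {
    "sm_60": (["CUDA10", "CUDA11"], None),
    "sm_61": (["CUDA10", "CUDA11"], None),
    "sm_70": (["CUDA10", "CUDA11"], None),
    "sm_75": (["CUDA10", "CUDA11"], None),
    "sm_80": (["CUDA11.8", "CUDA12.x"], "CUDA12.x"),
    "sm_86": (["CUDA11.8", "CUDA12.x"], "CUDA12.x"),
    "sm_90": (["CUDA12.6", "CUDA12.9", "CUDA13.0"], "CUDA12.9"),
    "sm_100": (["CUDA12.9", "CUDA13.0"], "CUDA12.9"),
    "sm_120": (["CUDA12.9", "CUDA13.0"], "CUDA12.9"),
}


def normalize(capability):
    """Accept 'sm_86', 'SM_86' or '8.6' and return the 'sm_86' form."""
    text = capability.strip().lower()
    if text.startswith("sm_"):
        return text
    return "sm_" + text.replace(".", "")


def cuda_builds_for(capability):
    """Return (builds, recommended) for a compute capability listed in the table."""
    key = normalize(capability)
    if key not in CUDA_BUILDS:
        raise KeyError(f"{capability!r} is not listed in the table")
    return CUDA_BUILDS[key]


builds, recommended = cuda_builds_for("8.6")
print(builds, recommended)
```

A capability that is not in the table raises KeyError rather than guessing a build.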
Compile Dependency Table
| Dependency package name | Version | Description | Installation command |
|---|---|---|---|
| CMake | 3.18, 3.19 (recommended), 4.0 | | |
| GCC | 8.2 / 12.2 | | |
| Clang (macOS only) | 9.0 and above | Usually the Clang version shipped with macOS 10.11 and above | |
| Python (64-bit) | 3.9, 3.10, 3.11, 3.12, 3.13 | Depends on libpython3.9+.so | See the official Python website |
| SWIG | at least 2.0 | | apt install swig or yum install swig |
| wget | any | | apt install wget or yum install wget |
| openblas | any | Optional | |
| pip | at least 20.2.2 | | apt install python-pip or yum install python-pip |
| numpy | >=1.21.0 | | pip install numpy |
| httpx | | | pip install httpx |
| Pillow | | | pip install Pillow |
| networkx | | | pip install networkx |
| typing_extensions | | | pip install typing_extensions |
| safetensors | >=0.6.0 | | pip install "safetensors>=0.6.0" |
| opt_einsum | 3.3.0 | | pip install opt_einsum==3.3.0 |
| protobuf | >=3.20.2 | | pip install protobuf |
| patchELF | any | | apt install patchelf, or see the official patchELF documentation on GitHub |
| go | >=1.8 | Optional | |
| setuptools | | Required in Python 3.12 and above | |
| unrar | | | brew install unrar (macOS), apt-get install unrar (Ubuntu) |
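Before compiling, it can help to confirm that the Python packages from the dependency table meet their minimum versions. The sketch below uses only the standard library (importlib.metadata). The names MINIMUMS, version_tuple, at_least and check are hypothetical, and the version comparison is deliberately rough: it looks only at the leading numeric components.

```python
from importlib import metadata

# Minimum versions transcribed from the dependency table; None means "any version".
MINIMUMS = {
    "numpy": "1.21.0",
    "protobuf": "3.20.2",
    "safetensors": "0.6.0",
    "httpx": None,
    "Pillow": None,
    "networkx": None,
    "typing_extensions": None,
}


def version_tuple(text):
    """Turn '1.21.0' into (1, 21, 0), stopping at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def at_least(installed, minimum):
    """Compare two version strings after padding the shorter one with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check(minimums=MINIMUMS):
    """Return {package: status} for every package in `minimums`."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if minimum is not None and not at_least(installed, minimum):
            report[name] = f"too old ({installed} < {minimum})"
        else:
            report[name] = f"ok ({installed})"
    return report


for package, status in check().items():
    print(f"{package}: {status}")
```

The exact pin for opt_einsum (3.3.0) is left out of MINIMUMS because it is an equality requirement, not a minimum.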
Compile Option Table
BLAS
PaddlePaddle supports two BLAS libraries, MKL and OpenBLAS. MKL is used by default. If you use MKL and the machine supports the AVX2 instruction set, the MKL-DNN math library is downloaded as well; for details, please refer to here.
If you turn MKL off (WITH_MKL=OFF), OpenBLAS is used as the BLAS library.
CUDA/cuDNN
PaddlePaddle automatically finds the CUDA and cuDNN libraries installed on the system, both at compile time and at runtime. Pass -DCUDA_ARCH_NAME=Auto to detect the SM architecture of the current machine automatically and compile only for it, which speeds up compilation.
PaddlePaddle can be compiled and run with any cuDNN version after v5.1, but try to use the same cuDNN version for compiling and for running. We recommend using the latest version of cuDNN.
Configure Compile Options
PaddlePaddle locates the BLAS, CUDA, and cuDNN libraries through paths specified at compile time. When cmake runs, it first searches the system paths ( /usr/lib and /usr/local/lib ) for these libraries and also reads the relevant path variables. These variables can be set with the -D option, for example:
cmake .. -DWITH_GPU=ON -DWITH_TESTING=OFF -DCUDNN_ROOT=/opt/cudnnv5
Note: The compile options set this way take effect only on the first cmake run. To change them later, we recommend cleaning up the entire build directory ( rm -rf ) and then running cmake again with the new options.
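When several options are toggled at once, it can be convenient to generate the cmake command line from a dictionary. This is only a sketch: cmake_command is a hypothetical helper, and the option names are the ones used in this appendix.

```python
def cmake_command(options, source_dir=".."):
    """Build a cmake argument list; booleans become ON/OFF, other values are passed as-is."""
    args = ["cmake", source_dir]
    for name, value in options.items():
        if isinstance(value, bool):
            value = "ON" if value else "OFF"
        args.append(f"-D{name}={value}")
    return args


command = cmake_command(
    {"WITH_GPU": True, "WITH_TESTING": False, "CUDNN_ROOT": "/opt/cudnnv5"}
)
print(" ".join(command))
# prints: cmake .. -DWITH_GPU=ON -DWITH_TESTING=OFF -DCUDNN_ROOT=/opt/cudnnv5
```

The returned list can be passed to subprocess.run from inside a freshly created build directory, which also respects the note above about options only taking effect on the first cmake run.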
| Option | Description | Default |
|---|---|---|
| WITH_GPU | Whether to support GPU | ON |
| WITH_AVX | Whether to compile PaddlePaddle binaries that use the AVX instruction set | ON |
| WITH_PYTHON | Whether to embed the Python interpreter | ON |
| WITH_TESTING | Whether to build the unit tests | OFF |
| WITH_MKL | Whether to use the MKL math library; if not, OpenBLAS is used | ON |
| WITH_SYSTEM_BLAS | Whether to use the system's BLAS | OFF |
| WITH_DISTRIBUTE | Whether to compile the distributed version | OFF |
| WITH_BRPC_RDMA | Whether to use BRPC RDMA as the RPC protocol | OFF |
| ON_INFER | Whether to turn on inference optimization | OFF |
| CUDA_ARCH_NAME | Whether to compile only for the current CUDA architecture | All: compile for all supported CUDA architectures. Optional: Auto, which detects the architecture of the current environment and compiles only for it |
| TENSORRT_ROOT | Specify the TensorRT path | The default value on Windows is '/'; the default value on Linux is '/usr/' |
Installation Package List
| Version Number | Release Description |
|---|---|
| paddlepaddle==[version code], such as paddlepaddle==3.3.0 | Installs the CPU-only build of the corresponding PaddlePaddle version; please refer to PyPI for the available versions. |
| paddlepaddle-gpu==[version code], such as paddlepaddle-gpu==3.3.0 | For specific installation methods and versions, please refer to here. |
You can find and download the paddlepaddle-gpu release that matches your CUDA environment at the following official PaddlePaddle path:
Multi-version whl package list - Release
Table description
Vertical axis
cpu-mkl: Supports CPU training and prediction, uses the Intel MKL math library
cuda10_cudnn7-mkl: Supports GPU training and prediction, uses the Intel MKL math library
Horizontal axis
The entries generally look like "cp310-cp310", in which:
310: the Python tag, which refers to Python 3.10. Similarly, there are "39", "310", "311", "312", "313", etc.
mu: refers to a Unicode build of Python; m refers to a non-Unicode build of Python
Installation package naming rules
Each installation package has a unique name. Packages are named according to the official Python rules for wheel files, in the following form:
{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl
The build tag is optional; all other parts are required.
distribution: the wheel name
version: the version, for example 0.14.0 (must be in numeric format)
python tag: similar to 'py39', 'py310', 'py311', 'py312', 'py313', used to indicate the corresponding Python version
abi tag: similar to 'cp33m', 'abi3', 'none'
platform tag: similar to 'linux_x86_64', 'any'
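The naming rule above can be checked mechanically with a regular expression. The following is a minimal sketch: parse_wheel_name is a hypothetical helper, and the file name in the example is illustrative, not a guaranteed release artifact. It relies on the wheel convention that a build tag, when present, starts with a digit.

```python
import re

# {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl
WHEEL_RE = re.compile(
    r"^(?P<distribution>[^-]+)-(?P<version>[^-]+)"
    r"(?:-(?P<build>\d[^-]*))?"
    r"-(?P<python>[^-]+)-(?P<abi>[^-]+)-(?P<platform>[^-]+)\.whl$"
)


def parse_wheel_name(filename):
    """Split a wheel file name into its tags; 'build' is None when the tag is absent."""
    match = WHEEL_RE.match(filename)
    if match is None:
        raise ValueError(f"not a valid wheel file name: {filename!r}")
    return match.groupdict()


# Illustrative file name only.
tags = parse_wheel_name("paddlepaddle_gpu-3.3.0-cp310-cp310-linux_x86_64.whl")
print(tags["distribution"], tags["version"], tags["python"], tags["abi"], tags["platform"])
```

Because hyphens separate the fields, a distribution name such as paddlepaddle-gpu appears with an underscore in the file name.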
Execute the PaddlePaddle training program in Docker
Suppose you have written a PaddlePaddle program, train.py, in the current directory (such as /home/work); refer to PaddlePaddleBook for how to write one. You can start the training with the following commands:
cd /home/work
docker run -it -v $PWD:/work ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.3.0-gpu-cuda12.9-cudnn9.9 /work/train.py
In the commands above, the -it parameter runs the container interactively; -v $PWD:/work mounts the current path (in Linux, the PWD variable expands to the absolute path of the current directory) to the /work directory inside the container; ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle specifies the image to use; finally, /work/train.py is the command executed inside the container, i.e. the training program.
You can also enter the Docker container and execute or debug your code interactively:
docker run -it -v $PWD:/work ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.3.0-gpu-cuda12.9-cudnn9.9 /bin/bash
cd /work
python train.py
Note: To keep the image small, vim is not installed in the PaddlePaddle Docker image by default. To edit code in the container, first run apt-get install -y vim inside the container.
Perform GPU training using Docker
To ensure that the GPU driver works properly in the image, we recommend using nvidia-docker to run the image. Remember to install the latest GPU driver on your physical machine in advance. For specific driver version requirements, please refer to here.
nvidia-docker run -it -v $PWD:/work ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.3.0-gpu-cuda12.9-cudnn9.9 /bin/bash
Note: If you don't have nvidia-docker installed, you can try the following to mount the CUDA library and Linux devices into the Docker container:
export CUDA_SO="$(\ls /usr/lib64/libcuda* | xargs -I{} echo '-v {}:{}') \
$(\ls /usr/lib64/libnvidia* | xargs -I{} echo '-v {}:{}')"
export DEVICES=$(\ls /dev/nvidia* | xargs -I{} echo '--device {}:{}')
docker run ${CUDA_SO} \
${DEVICES} -it ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:latest-gpu
