create_nccl_config

paddle.distributed.create_nccl_config(nccl_config: dict[str, int | str] | None = None) → NCCLConfig | None

Creates an NCCL configuration object that can be passed to new_group().

Parameters: nccl_config (dict[str, int | str] | None) – None, or a dict containing the following keys:
    commName (str): name of the process group.
    ll_buffsize (int): buffer size of the LL protocol.
    ll128_buffsize (int): buffer size of the LL128 protocol.
    simple_buffsize (int): buffer size of the Simple protocol.
    buffsize_align (int): alignment unit of the total buffer size.
    nchannels (int): maximum number of channels.
    algoStr (str): communication algorithm.
    protoStr (str): communication protocol.
Returns: an NCCLConfig object holding the configuration, which can be passed as the nccl_config argument of new_group().
Return type: NCCLConfig | None
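
Because the dictionary mixes string and integer fields, it can be useful to check it before calling create_nccl_config(). The sketch below is a hypothetical helper (check_nccl_config_dict is not part of Paddle) that encodes only the key names and types listed above. Whether Paddle itself rejects unknown keys or requires every key is not stated here, so treat these checks as assumptions.

```python
from typing import Optional, Union

# Key names and types copied from the parameter list above.
DOCUMENTED_KEYS = {
    "commName": str,
    "ll_buffsize": int,
    "ll128_buffsize": int,
    "simple_buffsize": int,
    "buffsize_align": int,
    "nchannels": int,
    "algoStr": str,
    "protoStr": str,
}


def check_nccl_config_dict(
    cfg: Optional[dict[str, Union[int, str]]],
) -> list[str]:
    """Return a list of problems found in cfg (hypothetical helper, not a Paddle API)."""
    if cfg is None:  # None is an accepted argument per the signature
        return []
    problems = []
    for key, value in cfg.items():
        expected = DOCUMENTED_KEYS.get(key)
        if expected is None:
            problems.append(f"unknown key: {key}")
        # bool is a subclass of int, so reject it explicitly
        elif not isinstance(value, expected) or isinstance(value, bool):
            problems.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems
```

An empty list means the dictionary uses only documented keys with the documented types.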

Examples

>>> import paddle
>>> import paddle.distributed as dist
>>> from typing import Union
>>> dist.init_parallel_env()
>>> # Settings for the communicator; keys follow the parameter list above.
>>> config_dict: dict[str, Union[int, str]] = {
...     "commName": "tp_comm",
...     "ll_buffsize": 0,
...     "ll128_buffsize": 0,
...     "simple_buffsize": 1024,
...     "buffsize_align": 1024,
...     "nchannels": 4,
...     "algoStr": "Ring",
...     "protoStr": "Simple",
... }
>>> ranks = [0, 1, 2, 3, 4, 5, 6, 7]
>>> nccl_config = dist.create_nccl_config(config_dict)
>>> pg = dist.new_group(ranks, nccl_config=nccl_config)
>>> m, n = 4096, 8192
>>> local_rank = dist.get_rank(pg)
>>> num_local_ranks = dist.get_world_size(pg)
>>> x = paddle.ones(shape=[m, n], dtype=paddle.float32) * (local_rank + 1)
>>> dist.all_reduce(x, group=pg)
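
As a sanity check on the example, each of the eight ranks fills x with rank + 1, and all_reduce sums by default (ReduceOp.SUM), so every element of x should end up equal to 1 + 2 + ... + 8 = 36. The plain-Python sketch below reproduces that arithmetic without Paddle; the names world_size, contributions, and expected are illustrative only.

```python
# Each rank r in the 8-rank group contributes a tensor filled with r + 1.
world_size = 8
contributions = [rank + 1 for rank in range(world_size)]

# A sum reduction adds the per-rank values elementwise.
expected = float(sum(contributions))
print(expected)  # 36.0
```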