LayerNorm

class paddle.nn.LayerNorm(normalized_shape: int | Sequence[int], epsilon: float = 1e-05, *, elementwise_affine: bool = True, bias: bool = True, device: PlaceLike | None = None, dtype: DTypeLike | None = None, weight_attr: bool | ParamAttr | None = None, bias_attr: bool | ParamAttr | None = None, name: str | None = None)

Constructs a callable LayerNorm object. It implements Layer Normalization and can be applied to mini-batch input data. See the code examples below for usage. For background, refer to Layer Normalization.

The formula is as follows:

\[ \begin{aligned} \mu &= \frac{1}{H}\sum_{i=1}^{H} x_i \\ \sigma &= \sqrt{\frac{1}{H}\sum_{i=1}^{H}(x_i - \mu)^2 + \epsilon} \\ y &= f\left(\frac{g}{\sigma}(x - \mu) + b\right) \end{aligned} \]
  • \(x\): the vector representation of the summed inputs to the neurons in that layer.

  • \(H\): the number of hidden units in a layer.

  • \(\epsilon\): the small value added to the variance to prevent division by zero.

  • \(g\): the trainable scale parameter.

  • \(b\): the trainable bias parameter.
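
The formula above can be sketched in plain Python for a single vector of \(H\) hidden units. This is a minimal illustration, not Paddle's implementation: the helper name layer_norm_1d is hypothetical, and \(f\) is taken to be the identity, which is an assumption because the text does not define \(f\).

```python
import math


def layer_norm_1d(x, g=None, b=None, epsilon=1e-5):
    """Hypothetical helper: normalize one vector of H values (f assumed to be identity)."""
    h = len(x)
    g = [1.0] * h if g is None else g  # scale defaults to 1
    b = [0.0] * h if b is None else b  # bias defaults to 0
    mu = sum(x) / h                                # mean over the H units
    var = sum((xi - mu) ** 2 for xi in x) / h      # biased variance (divide by H)
    sigma = math.sqrt(var + epsilon)               # epsilon is added inside the square root
    return [gi / sigma * (xi - mu) + bi for xi, gi, bi in zip(x, g, b)]


y = layer_norm_1d([1.0, 2.0, 3.0, 4.0])
```

With the default \(g = 1\) and \(b = 0\), the result has a mean of approximately zero and a variance slightly below one, because \(\epsilon\) enlarges the denominator.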

Parameters
  • normalized_shape (int|list|tuple) – The shape of the trailing dimensions to normalize over. The input is expected to have size [*, normalized_shape[0], normalized_shape[1], ..., normalized_shape[-1]]. If it is a single integer, this module normalizes over the last dimension, which is expected to be of that size.

  • epsilon (float, optional) – The small value added to the variance to prevent division by zero. Default: 1e-05. Alias: eps.

  • elementwise_affine (bool, optional) – Whether to apply element-wise affine transformation (i.e., learnable scale and bias). If set to False, both the scale (\(g\)) and bias (\(b\)) parameters will be disabled, regardless of the settings of weight_attr and bias_attr. This parameter acts as a master switch. Defaults to True. Note: This argument must be passed as a keyword argument.

  • bias (bool, optional) – Whether to include a learnable bias term in the layer. This setting only takes effect when elementwise_affine is True. If set to False, no bias parameter will be created, even if bias_attr is specified. Defaults to True. Note: This argument must be passed as a keyword argument.

  • weight_attr (ParamAttr|bool|None, optional) –

    The parameter attribute for the learnable gain \(g\) (scale). This setting only takes effect when elementwise_affine is True.

    • If set to False, no gain parameter will be created.

    • If set to None or True, a default ParamAttr will be used, and the parameter will be initialized to 1.

    • If set to a custom ParamAttr object, it will be used to configure the parameter.

    Default: None. Note: This argument must be passed as a keyword argument.

  • bias_attr (ParamAttr|bool|None, optional) –

    The parameter attribute for the learnable bias \(b\). This setting only takes effect when both elementwise_affine and bias are True.

    • If set to False, no bias parameter will be created.

    • If set to None or True, a default ParamAttr will be used, and the parameter will be initialized to 0.

    • If set to a custom ParamAttr object, it will be used to configure the parameter.

    Default: None. Note: This argument must be passed as a keyword argument.

  • name (str|None, optional) – The name of this LayerNorm layer. Default: None. For more information, refer to api_guide_Name. Note: This argument must be passed as a keyword argument.
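
The interaction between elementwise_affine, bias, weight_attr and bias_attr described above can be summarized as a small decision function. This is only a sketch of the documented rules; created_parameters is a hypothetical helper, not part of Paddle, and it does not construct any real parameters.

```python
def created_parameters(elementwise_affine=True, bias=True,
                       weight_attr=None, bias_attr=None):
    """Hypothetical helper: report which learnable parameters the documented rules imply."""
    # elementwise_affine is the master switch for both scale and bias.
    # weight_attr=False disables the scale; None, True, or a ParamAttr keeps it.
    has_weight = elementwise_affine and weight_attr is not False
    # The bias additionally requires bias=True and bias_attr not set to False.
    has_bias = elementwise_affine and bias and bias_attr is not False
    return {"weight": has_weight, "bias": has_bias}


print(created_parameters())                          # both parameters exist
print(created_parameters(elementwise_affine=False))  # neither exists
print(created_parameters(bias=False))                # scale only
```

Under these rules, passing a custom bias_attr has no effect when bias is False, and neither attribute has any effect when elementwise_affine is False.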

Shape:
  • x: 2-D, 3-D, 4-D or 5-D tensor.

  • output: same shape as input x.
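
To make the relationship between normalized_shape and the input shape concrete, the sketch below computes which axes of the input would be normalized together. The helper normalized_axes is hypothetical and only mirrors the documented convention that normalized_shape must match the trailing dimensions of the input; it is not how Paddle validates shapes internally.

```python
def normalized_axes(input_shape, normalized_shape):
    """Hypothetical helper: return the input axes covered by normalized_shape."""
    if isinstance(normalized_shape, int):
        normalized_shape = [normalized_shape]  # a single int means the last dimension
    normalized_shape = list(normalized_shape)
    n = len(normalized_shape)
    ndim = len(input_shape)
    # normalized_shape must equal the trailing n dimensions of the input.
    if n == 0 or n > ndim or list(input_shape[ndim - n:]) != normalized_shape:
        raise ValueError(
            f"normalized_shape {normalized_shape} does not match the trailing "
            f"dimensions of input shape {list(input_shape)}"
        )
    return list(range(ndim - n, ndim))


# Matches the example on this page: x has shape (2, 2, 2, 3), normalized_shape is x.shape[1:].
axes = normalized_axes([2, 2, 2, 3], [2, 2, 3])
```

In this case every axis except the leading batch axis is normalized, and the output keeps the shape of the input.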

Returns

Tensor with the same shape as x, whose values have been normalized by LayerNorm.

Examples

>>> import paddle
>>> paddle.seed(100)
>>> x = paddle.rand((2, 2, 2, 3))
>>> layer_norm = paddle.nn.LayerNorm(x.shape[1:])
>>> layer_norm_out = layer_norm(x)

>>> print(layer_norm_out)
Tensor(shape=[2, 2, 2, 3], dtype=float32, place=Place(cpu), stop_gradient=False,
[[[[ 0.60520101, -0.67670590, -1.40020895],
   [ 0.46540466, -0.09736638, -0.47771254]],
  [[-0.74365306,  0.63718957, -1.41333175],
   [ 1.44764745, -0.25489068,  1.90842617]]],
 [[[ 1.09773350,  1.49568415, -0.45503747],
   [-1.01755989,  1.08368254, -0.38671425]],
  [[-0.62252408,  0.60490781,  0.13109133],
   [-0.81222653,  0.84285998, -1.96189952]]]])

forward(input: Tensor) → Tensor

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • *inputs (tuple) – unpacked tuple arguments

  • **kwargs (dict) – unpacked dict arguments

extra_repr() → str

Extra representation of this layer. You can provide a custom implementation for your own layer.