max

paddle.compat.max(input: Tensor, *args: Any, out: Tensor | tuple[Tensor, Tensor] | list[Tensor] = None, **kwargs: Any) → Tensor | MinMaxRetType [source]

Computes the maximum of tensor elements. There are three supported cases (call patterns):

  1. paddle.compat.max(input: Tensor): reduces max over all dims and returns a single-value Tensor.

  2. paddle.compat.max(input: Tensor, dim: int (cannot be None), keepdim=False): reduces max over the given dim and returns a named tuple MinMaxRetType(values: Tensor, indices: Tensor).

  3. paddle.compat.max(input: Tensor, other: Tensor): see paddle.maximum.

Special warning: the gradient behavior is NOT well documented by PyTorch; the actual behavior is:

  1. Case 1: the same as max

  2. Case 2: the gradient is NOT evenly distributed among equal maximum elements! PyTorch only propagates the gradient to the elements selected by indices. For example, for Tensor([1, 1, 1]) -> max(..., dim=0) -> values=Tensor(1), indices=Tensor(0), the gradient for the input tensor won't be Tensor([1/3, 1/3, 1/3]) as stated in their documentation, but will be Tensor([1, 0, 0]). This API implements a similar backward kernel, as sketched after this list.

  3. Case 3: the same as maximum
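
A minimal sketch of the case-2 gradient behavior described above (the printed outputs are illustrative and assume a GPU place, matching the Examples below):

>>> import paddle
>>> t = paddle.to_tensor([1., 1., 1.], stop_gradient=False)
>>> v, idx = paddle.compat.max(t, dim=0)
>>> v.backward()
>>> t.grad  # the whole gradient goes to the first maximal element
Tensor(shape=[3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
    [1., 0., 0.])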

Parameters
  • input (Tensor) – The input tensor. On GPU, the supported data types are bfloat16, float16, float32, float64, int32 and int64; on CPU, uint8, int32, int64, float32 and float64 are allowed.

  • dim (int, optional) – The dim along which the maximum is computed. If not specified, the maximum is computed over all elements of input and a Tensor with a single element is returned (see case 1); note that None cannot be passed explicitly (a TypeError will be thrown). Otherwise, dim must be in the range \([-input.ndim, input.ndim)\); if \(dim < 0\), the axis to reduce is \(input.ndim + dim\). Warning: if dim is specified, static graph execution will throw an exception when not on a GPU device, since max_with_index is not implemented for non-GPU devices.

  • keepdim (bool, optional) – Whether to keep the reduced dimension in the output Tensor. The result tensor has one fewer dimension than the input unless keepdim is True. Default is False. Note that if dim appears in neither (*args) nor (**kwargs), this parameter cannot be passed alone (see the keepdim sketch at the end of the Examples).

  • other (Tensor, optional) – The other tensor to compute paddle.maximum with. It should have the same or a broadcastable shape relative to input. Note that (dim & keepdim) and other are mutually exclusive: passing both results in a TypeError.

  • out (Tensor|tuple[Tensor, Tensor], optional) – The output Tensor, or tuple of (Tensor, int64 Tensor), that can optionally be given to be used as the output buffer(s). For cases 1 and 3, out is a single Tensor; for case 2, a tuple is expected (see the sketch after this list).
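
A minimal sketch of passing out buffers for case 2 (hypothetical usage: it assumes the buffers are pre-allocated with the result shapes and dtypes, and the printed output is illustrative):

>>> import paddle
>>> x = paddle.to_tensor([[0.2, 0.9], [0.6, 0.1]])
>>> values = paddle.empty([2], dtype='float32')
>>> indices = paddle.empty([2], dtype='int64')
>>> _ = paddle.compat.max(x, dim=1, out=(values, indices))
>>> values
Tensor(shape=[2], dtype=float32, place=Place(gpu:0), stop_gradient=True,
    [0.90000000, 0.60000000])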

Returns

  • For case 1: a single-value (0-dim) Tensor.

  • For case 2: a named tuple MinMaxRetType(values: Tensor, indices: Tensor). values has the same data type as the input, while indices is always an int64 Tensor with exactly the same shape as values. MinMaxRetType can be used (indexed, packed, unpacked) in the same way as a regular tuple.

  • For case 3: see paddle.maximum.

Examples

>>> import paddle

>>> # x is a Tensor with shape [2, 4]
>>> x = paddle.to_tensor([[0.2, 0.3, 0.5, 0.9],
...                       [0.1, 0.2, 0.6, 0.7]],
...                       dtype='float64', stop_gradient=False)
>>> # Case 1: reduce over all dims
>>> result1 = paddle.compat.max(x)
>>> result1
Tensor(shape=[], dtype=float64, place=Place(gpu:0), stop_gradient=False,
0.90000000)

>>> # Case 2: reduce over specified dim
>>> x.clear_grad()
>>> result2 = paddle.compat.max(x, dim=1)
>>> result2
MinMaxRetType(values=Tensor(shape=[2], dtype=float64, place=Place(gpu:0), stop_gradient=False,
    [0.90000000, 0.70000000]), indices=Tensor(shape=[2], dtype=int64, place=Place(gpu:0), stop_gradient=True,
    [3, 3]))
>>> result2[0].backward()
>>> x.grad
Tensor(shape=[2, 4], dtype=float64, place=Place(gpu:0), stop_gradient=False,
    [[0., 0., 0., 1.],
     [0., 0., 0., 1.]])

>>> # Case 3: equivalent to `paddle.maximum`
>>> x.clear_grad()
>>> y = paddle.to_tensor([[0.5, 0.4, 0.1, 0.2],
...                       [0.3, 0.1, 0.6, 0.7]],
...                       dtype='float64', stop_gradient=False)
>>> result3 = paddle.compat.max(x, y)
>>> result3
Tensor(shape=[2, 4], dtype=float64, place=Place(gpu:0), stop_gradient=False,
    [[0.50000000, 0.40000000, 0.50000000, 0.90000000],
     [0.30000000, 0.20000000, 0.60000000, 0.70000000]])
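
The following sketch shows keepdim=True preserving the reduced dimension (the printed output is illustrative and assumes the same GPU place as above):

>>> # Case 2 with keepdim=True: the reduced dim is kept with size 1
>>> result4 = paddle.compat.max(x, dim=1, keepdim=True)
>>> result4.values
Tensor(shape=[2, 1], dtype=float64, place=Place(gpu:0), stop_gradient=False,
    [[0.90000000],
     [0.70000000]])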