Layers
- class ivy.data_classes.array.layers._ArrayWithLayers
Bases: ABC
- _abc_impl = <_abc._abc_data object>
- conv1d(filters, strides, padding, /, *, data_format='NWC', filter_format='channel_last', x_dilations=1, dilations=1, bias=None, out=None)
ivy.Array instance method variant of ivy.conv1d. This method simply wraps the function, and so the docstring for ivy.conv1d also applies to this method with minimal changes.
- Parameters:
self (Array) – Input image [batch_size,w,d_in] or [batch_size,d_in,w].
filters (Union[Array, NativeArray]) – Convolution filters [fw,d_in,d_out].
strides (Union[int, Tuple[int]]) – The stride of the sliding window for each dimension of input.
padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.
data_format (str, default: 'NWC') – “NWC” or “NCW”. Defaults to “NWC”.
filter_format (str, default: 'channel_last') – Either “channel_first” or “channel_last”. Defaults to “channel_last”.
x_dilations (default: 1) – The dilation factor for each dimension of input. (Default value = 1)
dilations (Union[int, Tuple[int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)
bias (Optional[Array], default: None) – Bias array of shape [d_out].
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – The result of the convolution operation.
Examples
>>> x = ivy.array([[[1., 2.], [3., 4.], [6., 7.], [9., 11.]]])  # NWC
>>> filters = ivy.array([[[0., 1.], [1., 1.]]])  # WIO (I == C)
>>> result = x.conv1d(filters, (1,), 'VALID')
>>> print(result)
ivy.array([[[ 2.,  3.],
            [ 4.,  7.],
            [ 7., 13.],
            [11., 20.]]])
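The arithmetic behind the example above can be traced with a small reference implementation. This is a minimal sketch in plain Python (nested lists, no ivy dependency); the helper name `conv1d_valid_nwc` is hypothetical, and it only covers the “NWC” layout with “VALID” padding and no dilation.

```python
def conv1d_valid_nwc(x, filters, stride=1):
    """Reference 1-D convolution (hypothetical helper, not part of ivy).

    x is [batch_size, w, d_in]; filters is [fw, d_in, d_out].
    """
    fw, d_in, d_out = len(filters), len(filters[0]), len(filters[0][0])
    result = []
    for sample in x:
        # "VALID" means no padding, so the window must fit entirely inside the input.
        out_w = (len(sample) - fw) // stride + 1
        rows = []
        for i in range(out_w):
            start = i * stride
            rows.append([
                sum(
                    sample[start + k][c] * filters[k][c][o]
                    for k in range(fw)
                    for c in range(d_in)
                )
                for o in range(d_out)
            ])
        result.append(rows)
    return result


x = [[[1., 2.], [3., 4.], [6., 7.], [9., 11.]]]   # NWC
filters = [[[0., 1.], [1., 1.]]]                  # WIO
print(conv1d_valid_nwc(x, filters))
# [[[2.0, 3.0], [4.0, 7.0], [7.0, 13.0], [11.0, 20.0]]]
```

With a filter width of 1, each output row is just the input row multiplied by the 2x2 channel-mixing matrix, which is why the output keeps the input width of 4.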
- conv1d_transpose(filters, strides, padding, /, *, output_shape=None, filter_format='channel_last', data_format='NWC', dilations=1, bias=None, out=None)
ivy.Array instance method variant of ivy.conv1d_transpose. This method simply wraps the function, and so the docstring for ivy.conv1d_transpose also applies to this method with minimal changes.
- Parameters:
self (Array) – Input image [batch_size,w,d_in] or [batch_size,d_in,w].
filters (Union[Array, NativeArray]) – Convolution filters [fw,d_out,d_in].
strides (Union[int, Tuple[int]]) – The stride of the sliding window for each dimension of input.
padding (str) – Either the string ‘SAME’ (padding with zeros evenly), the string ‘VALID’ (no padding), or a sequence of n (low, high) integer pairs that give the padding to apply before and after each spatial dimension.
output_shape (Optional[Union[Shape, NativeShape]], default: None) – Shape of the output (Default value = None)
filter_format (str, default: 'channel_last') – Either “channel_first” or “channel_last”. “channel_first” corresponds to the “IOW” filter layout, while “channel_last” corresponds to “WOI”.
data_format (str, default: 'NWC') – The ordering of the dimensions in the input, one of “NWC” or “NCW”. “NWC” corresponds to input with shape (batch_size, width, channels), while “NCW” corresponds to input with shape (batch_size, channels, width).
dilations (Union[int, Tuple[int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)
bias (Optional[Array], default: None) – Bias array of shape [d_out].
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – The result of the transpose convolution operation.
Examples
>>> x = ivy.array([[[1., 2.], [3., 4.], [6., 7.], [9., 11.]]])  # NWC
>>> filters = ivy.array([[[0., 1.], [1., 1.]]])  # WOI (I == C)
>>> result = x.conv1d_transpose(filters, (1,), 'VALID')
>>> print(result)
ivy.array([[[ 2.,  3.],
            [ 4.,  7.],
            [ 7., 13.],
            [11., 20.]]])
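A transposed convolution can be read as a scatter: every input position adds a scaled copy of the filter into the output. The sketch below is plain Python with a hypothetical helper name, `conv1d_transpose_valid_nwc`. It assumes the “NWC” layout, “WOI” filters, “VALID” padding, no dilation, and a TensorFlow-style output width of `w * stride + max(fw - stride, 0)`; that width rule is an assumption, not something this page states.

```python
def conv1d_transpose_valid_nwc(x, filters, stride=1):
    """Reference 1-D transposed convolution (hypothetical helper, not part of ivy).

    x is [batch_size, w, d_in]; filters is [fw, d_out, d_in] ("WOI").
    """
    fw, d_out, d_in = len(filters), len(filters[0]), len(filters[0][0])
    result = []
    for sample in x:
        w = len(sample)
        # Assumed TensorFlow-style "VALID" output width.
        out_w = w * stride + max(fw - stride, 0)
        rows = [[0.0] * d_out for _ in range(out_w)]
        for i in range(w):
            for k in range(fw):
                for o in range(d_out):
                    # Input position i contributes to output position i * stride + k.
                    rows[i * stride + k][o] += sum(
                        sample[i][c] * filters[k][o][c] for c in range(d_in)
                    )
        result.append(rows)
    return result


x = [[[1., 2.], [3., 4.], [6., 7.], [9., 11.]]]   # NWC
filters = [[[0., 1.], [1., 1.]]]                  # WOI
print(conv1d_transpose_valid_nwc(x, filters))
# [[[2.0, 3.0], [4.0, 7.0], [7.0, 13.0], [11.0, 20.0]]]
```

The result matches the forward `conv1d` example only because the 2x2 channel matrix used here is symmetric and the filter width is 1; with a wider filter or a larger stride the two operations differ.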
- conv2d(filters, strides, padding, /, *, data_format='NHWC', filter_format='channel_last', x_dilations=1, dilations=1, bias=None, out=None)
ivy.Array instance method variant of ivy.conv2d. This method simply wraps the function, and so the docstring for ivy.conv2d also applies to this method with minimal changes.
- Parameters:
self (Array) – Input image [batch_size,h,w,d_in] or [batch_size,d_in,h,w].
filters (Union[Array, NativeArray]) – Convolution filters [fh,fw,d_in,d_out].
strides (Union[int, Tuple[int, int]]) – The stride of the sliding window for each dimension of input.
padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.
data_format (str, default: 'NHWC') – “NHWC” or “NCHW”. Defaults to “NHWC”.
dilations (Union[int, Tuple[int, int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)
filter_format (str, default: 'channel_last') – Either “channel_first” or “channel_last”. Defaults to “channel_last”.
x_dilations (Union[int, Tuple[int, int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)
bias (Optional[Container], default: None) – Bias array of shape [d_out].
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – The result of the convolution operation.
Examples
>>> x = ivy.array([[[[1.], [2.0],[3.]],
...                 [[1.], [2.0],[3.]],
...                 [[1.], [2.0],[3.]]]])  # NHWC
>>> filters = ivy.array([[[[0.]], [[1.]], [[0.]]],
...                      [[[0.]], [[1.]], [[0.]]],
...                      [[[0.]], [[1.]], [[0.]]]])  # HWIO
>>> result = x.conv2d(filters, 1, 'SAME', data_format='NHWC',
...                   dilations=1)
>>> print(result)
ivy.array([[
          [[2.],[4.],[6.]],
          [[3.],[6.],[9.]],
          [[2.],[4.],[6.]]
          ]])
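To see where the 2/3/2 pattern in the rows of that output comes from, here is a minimal plain-Python sketch of a “SAME”-padded 2-D convolution. The helper name `conv2d_same_nhwc` is hypothetical, and the padding split (total padding divided as evenly as possible, extra cell at the end) follows the TensorFlow convention, which is assumed here rather than confirmed by this page.

```python
import math


def conv2d_same_nhwc(x, filters, stride=1):
    """Reference 2-D convolution with zero "SAME" padding (hypothetical helper).

    x is [batch_size, h, w, d_in]; filters is [fh, fw, d_in, d_out].
    """
    h, w, d_in = len(x[0]), len(x[0][0]), len(x[0][0][0])
    fh, fw, d_out = len(filters), len(filters[0]), len(filters[0][0][0])
    out_h, out_w = math.ceil(h / stride), math.ceil(w / stride)
    # Assumed TensorFlow-style padding: only as much as the last window needs.
    top = max((out_h - 1) * stride + fh - h, 0) // 2
    left = max((out_w - 1) * stride + fw - w, 0) // 2
    result = []
    for image in x:
        plane = [[[0.0] * d_out for _ in range(out_w)] for _ in range(out_h)]
        for i in range(out_h):
            for j in range(out_w):
                for di in range(fh):
                    for dj in range(fw):
                        r = i * stride + di - top
                        c = j * stride + dj - left
                        if not (0 <= r < h and 0 <= c < w):
                            continue  # tap falls in the zero padding
                        for ci in range(d_in):
                            for co in range(d_out):
                                plane[i][j][co] += image[r][c][ci] * filters[di][dj][ci][co]
        result.append(plane)
    return result


row = [[1.], [2.], [3.]]
x = [[row, row, row]]                                 # NHWC, shape (1, 3, 3, 1)
filters = [[[[0.]], [[1.]], [[0.]]] for _ in range(3)]  # HWIO, shape (3, 3, 1, 1)
print(conv2d_same_nhwc(x, filters))
# [[[[2.0], [4.0], [6.0]], [[3.0], [6.0], [9.0]], [[2.0], [4.0], [6.0]]]]
```

The filter sums a vertical strip of three pixels. In the first and last rows one of those three taps lands in the zero padding, so the sums are doubled rather than tripled.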
- conv2d_transpose(filters, strides, padding, /, *, output_shape=None, filter_format='channel_last', data_format='NHWC', dilations=1, out=None, bias=None)
ivy.Array instance method variant of ivy.conv2d_transpose. This method simply wraps the function, and so the docstring for ivy.conv2d_transpose also applies to this method with minimal changes.
- Parameters:
self (Array) – Input image [batch_size,h,w,d_in] or [batch_size,d_in,h,w].
filters (Union[Array, NativeArray]) – Convolution filters [fh,fw,d_out,d_in].
strides (Union[int, Tuple[int, int]]) – The stride of the sliding window for each dimension of input.
padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.
output_shape (Optional[Union[Shape, NativeShape]], default: None) – Shape of the output (Default value = None)
filter_format (str, default: 'channel_last') – Either “channel_first” or “channel_last”. “channel_first” corresponds to the “IOHW” filter layout, while “channel_last” corresponds to “HWOI”.
data_format (str, default: 'NHWC') – The ordering of the dimensions in the input, one of “NHWC” or “NCHW”. “NHWC” corresponds to inputs with shape (batch_size, height, width, channels), while “NCHW” corresponds to input with shape (batch_size, channels, height, width). Default is "NHWC".
dilations (Union[int, Tuple[int, int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)
bias (Optional[Array], default: None) – Bias array of shape [d_out].
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – The result of the transpose convolution operation.
Examples
>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 28, 28, 3])
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 6, 3])
>>> y = x.conv2d_transpose(filters, 2, 'SAME')
>>> print(y.shape)
(1, 56, 56, 6)
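The output shape printed above can be predicted without running the convolution. The following sketch uses a hypothetical helper, `transpose_output_shape`, and assumes channel-last inputs, channel-last filters (`[..., d_out, d_in]`), and the TensorFlow-style size rule for transposed convolutions. The rule reproduces the shape shown in the example, but it is an assumption about the backend behaviour rather than something documented on this page.

```python
def transpose_output_shape(input_shape, filter_shape, strides, padding, dilations=1):
    """Predict the output shape of a channel-last transposed convolution.

    Hypothetical helper; assumes the TensorFlow-style size rule.
    input_shape is (batch, *spatial, d_in); filter_shape is (*kernel, d_out, d_in).
    """
    spatial = input_shape[1:-1]
    n = len(spatial)
    strides = (strides,) * n if isinstance(strides, int) else tuple(strides)
    dilations = (dilations,) * n if isinstance(dilations, int) else tuple(dilations)
    out = []
    for size, k, s, d in zip(spatial, filter_shape[:n], strides, dilations):
        k_eff = k + (k - 1) * (d - 1)  # kernel extent after dilation
        if padding == "SAME":
            out.append(size * s)
        else:  # "VALID"
            out.append(size * s + max(k_eff - s, 0))
    return (input_shape[0], *out, filter_shape[-2])


print(transpose_output_shape((1, 28, 28, 3), (3, 3, 6, 3), 2, "SAME"))
# (1, 56, 56, 6)
```

Because the helper loops over however many spatial dimensions the input has, the same rule can be applied to 1-D and 3-D transposed convolutions.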
- conv3d(filters, strides, padding, /, *, data_format='NDHWC', filter_format='channel_last', x_dilations=1, dilations=1, bias=None, out=None)
ivy.Array instance method variant of ivy.conv3d. This method simply wraps the function, and so the docstring for ivy.conv3d also applies to this method with minimal changes.
- Parameters:
self (Array) – Input volume [batch_size,d,h,w,d_in].
filters (Union[Array, NativeArray]) – Convolution filters [fd,fh,fw,d_in,d_out].
strides (Union[int, Tuple[int, int, int]]) – The stride of the sliding window for each dimension of input.
padding (str) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.
data_format (str, default: 'NDHWC') – “NDHWC” or “NCDHW”. Defaults to “NDHWC”.
filter_format (str, default: 'channel_last') – Either “channel_first” or “channel_last”. Defaults to “channel_last”.
x_dilations (Union[int, Tuple[int, int, int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)
dilations (Union[int, Tuple[int, int, int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)
bias (Optional[Array], default: None) – Bias array of shape [d_out].
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – The result of the convolution operation.
Examples
>>> x = ivy.ones((1, 3, 3, 3, 1)).astype(ivy.float32)
>>> filters = ivy.ones((1, 3, 3, 1, 1)).astype(ivy.float32)
>>> result = x.conv3d(filters, 2, 'SAME')
>>> print(result)
ivy.array([[[[[4.],[4.]],[[4.],[4.]]],[[[4.],[4.]],[[4.],[4.]]]]])
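Every value in that result is 4 because, with all-ones input and filters, each output equals the number of filter taps that land inside the input rather than in the zero padding. The sketch below counts those taps per axis. The helper names are hypothetical, and the “SAME” padding arithmetic is the TensorFlow-style rule, assumed here because it reproduces the printed values.

```python
import math


def same_padding(size, kernel, stride):
    """Return (output_size, pad_before, pad_after) for assumed TF-style "SAME" padding."""
    out = math.ceil(size / stride)
    total = max((out - 1) * stride + kernel - size, 0)
    before = total // 2
    return out, before, total - before


def taps_in_bounds(size, kernel, stride):
    """For each output position, count the filter taps that hit real input cells."""
    out, before, _ = same_padding(size, kernel, stride)
    counts = []
    for i in range(out):
        start = i * stride - before
        counts.append(sum(0 <= start + k < size for k in range(kernel)))
    return counts


# Input volume 3x3x3, filter 1x3x3, stride 2 on every axis.
depth = taps_in_bounds(3, 1, 2)    # [1, 1]
height = taps_in_bounds(3, 3, 2)   # [2, 2]
width = taps_in_bounds(3, 3, 2)    # [2, 2]
values = [[[d * h * w for w in width] for h in height] for d in depth]
print(values)
# [[[4, 4], [4, 4]], [[4, 4], [4, 4]]]
```

Each axis produces two output positions, giving the (1, 2, 2, 2, 1) result shape, and each position sees 1 x 2 x 2 = 4 in-bounds taps.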
- conv3d_transpose(filters, strides, padding, /, *, output_shape=None, filter_format='channel_last', data_format='NDHWC', dilations=1, bias=None, out=None)
ivy.Array instance method variant of ivy.conv3d_transpose. This method simply wraps the function, and so the docstring for ivy.conv3d_transpose also applies to this method with minimal changes.
- Parameters:
self (Array) – Input volume [batch_size,d,h,w,d_in] or [batch_size,d_in,d,h,w].
filters (Union[Array, NativeArray]) – Convolution filters [fd,fh,fw,d_out,d_in].
strides (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]]) – The stride of the sliding window for each dimension of input.
padding (Union[str, List[int]]) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.
output_shape (Optional[Union[Shape, NativeShape]], default: None) – Shape of the output (Default value = None)
filter_format (str, default: 'channel_last') – Either “channel_first” or “channel_last”. “channel_first” corresponds to the “IODHW” filter layout, while “channel_last” corresponds to “DHWOI”.
data_format (str, default: 'NDHWC') – The ordering of the dimensions in the input, one of “NDHWC” or “NCDHW”. “NDHWC” corresponds to inputs with shape (batch_size, depth, height, width, channels), while “NCDHW” corresponds to input with shape (batch_size, channels, depth, height, width).
dilations (Union[int, Tuple[int], Tuple[int, int], Tuple[int, int, int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)
bias (Optional[Array], default: None) – Bias array of shape [d_out].
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – The result of the transpose convolution operation.
Examples
>>> x = ivy.random_normal(mean=0, std=1, shape=[1, 3, 28, 28, 3])
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3, 6, 3])
>>> y = x.conv3d_transpose(filters, 2, 'SAME')
>>> print(y.shape)
(1, 6, 56, 56, 6)
- depthwise_conv2d(filters, strides, padding, /, *, data_format='NHWC', dilations=1, out=None)
ivy.Array instance method variant of ivy.depthwise_conv2d. This method simply wraps the function, and so the docstring for ivy.depthwise_conv2d also applies to this method with minimal changes.
- Parameters:
self (Array) – Input image [batch_size,h,w,d].
filters (Union[Array, NativeArray]) – Convolution filters [fh,fw,d_in]. (d_in must be the same as d from self)
strides (Union[int, Tuple[int], Tuple[int, int]]) – The stride of the sliding window for each dimension of input.
padding (Union[str, List[int]]) – “SAME” or “VALID” indicating the algorithm, or list indicating the per-dimension paddings.
data_format (str, default: 'NHWC') – “NHWC” or “NCHW”. Defaults to “NHWC”.
dilations (Union[int, Tuple[int], Tuple[int, int]], default: 1) – The dilation factor for each dimension of input. (Default value = 1)
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – The result of the convolution operation.
Examples
>>> x = ivy.randint(0, 255, shape=(1, 128, 128, 3)).astype(ivy.float32) / 255.0
>>> filters = ivy.random_normal(mean=0, std=1, shape=[3, 3, 3])
>>> y = x.depthwise_conv2d(filters, 2, 'SAME')
>>> print(y.shape)
(1, 64, 64, 3)
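Unlike a regular convolution, a depthwise convolution never mixes channels: channel c of the output depends only on channel c of the input and channel c of the filter, which is why the filters have shape [fh,fw,d_in] and the channel count is unchanged. The sketch below illustrates this in plain Python; the helper name `depthwise_conv2d_valid` is hypothetical, and it handles only “NHWC” input with “VALID” padding and no dilation.

```python
def depthwise_conv2d_valid(x, filters, stride=1):
    """Reference depthwise 2-D convolution (hypothetical helper, not part of ivy).

    x is [batch_size, h, w, d]; filters is [fh, fw, d].
    """
    fh, fw, d = len(filters), len(filters[0]), len(filters[0][0])
    result = []
    for image in x:
        h, w = len(image), len(image[0])
        out_h = (h - fh) // stride + 1
        out_w = (w - fw) // stride + 1
        plane = []
        for i in range(out_h):
            row = []
            for j in range(out_w):
                # One independent spatial sum per channel; no sum over channels.
                row.append([
                    sum(
                        image[i * stride + di][j * stride + dj][c] * filters[di][dj][c]
                        for di in range(fh)
                        for dj in range(fw)
                    )
                    for c in range(d)
                ])
            plane.append(row)
        result.append(plane)
    return result


x = [[[[1., 10.], [2., 20.]],
      [[3., 30.], [4., 40.]]]]            # shape (1, 2, 2, 2)
filters = [[[1., 1.], [1., 1.]],
           [[1., 1.], [1., 1.]]]          # shape (2, 2, 2)
print(depthwise_conv2d_valid(x, filters))
# [[[[10.0, 100.0]]]]
```

The two channels stay separate: 1 + 2 + 3 + 4 = 10 for the first and 10 + 20 + 30 + 40 = 100 for the second.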
- dropout(prob, /, *, scale=True, dtype=None, training=True, seed=None, noise_shape=None, out=None)
ivy.Array instance method variant of ivy.dropout. This method simply wraps the function, and so the docstring for ivy.dropout also applies to this method with minimal changes.
- Parameters:
self (Array) – The input array x to perform dropout on.
prob (float) – The probability of zeroing out each array element, float between 0 and 1.
scale (bool, default: True) – Whether to scale the output by 1/(1-prob), default is True.
dtype (Optional[Union[Dtype, NativeDtype]], default: None) – Output array data type. If dtype is None, the output array data type must be inferred from x. Default: None.
training (bool, default: True) – Turn on dropout if training, turn off otherwise. Default is True.
seed (Optional[int], default: None) – Set a default seed for random number generating (for reproducibility). Default is None.
noise_shape (Optional[Sequence[int]], default: None) – A sequence representing the shape of the binary dropout mask that will be multiplied with the input.
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – Result array of the output after dropout is performed.
Examples
With ivy.Array instances:

>>> x = ivy.array([[1., 2., 3.],
...                [4., 5., 6.],
...                [7., 8., 9.],
...                [10., 11., 12.]])
>>> y = x.dropout(0.3)
>>> print(y)
ivy.array([[ 1.42857146,  2.85714293,  4.28571415],
           [ 5.71428585,  7.14285755,  8.5714283 ],
           [ 0.        , 11.4285717 , 12.8571434 ],
           [14.2857151 ,  0.        ,  0.        ]])

>>> x = ivy.array([[1., 2., 3.],
...                [4., 5., 6.],
...                [7., 8., 9.],
...                [10., 11., 12.]])
>>> y = x.dropout(0.3, scale=False)
>>> print(y)
ivy.array([[ 1.,  2.,  3.],
           [ 4.,  5.,  0.],
           [ 7.,  0.,  9.],
           [10., 11.,  0.]])
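The surviving values in the first example are the inputs multiplied by 1/(1-prob) = 1/0.7 ≈ 1.4286, while the second example leaves survivors untouched. A minimal plain-Python sketch of this element-wise dropout follows. The function name and the use of `random.Random` are illustrative choices, not ivy's implementation, so the positions of the zeros will not match the outputs above, and the sketch assumes prob < 1.

```python
import random


def dropout(x, prob, scale=True, seed=None):
    """Element-wise dropout on a 2-D nested list (illustrative sketch, assumes prob < 1)."""
    rng = random.Random(seed)
    factor = 1.0 / (1.0 - prob) if scale else 1.0
    # Each element is kept with probability 1 - prob and zeroed otherwise.
    return [[v * factor if rng.random() >= prob else 0.0 for v in row] for row in x]


x = [[1., 2., 3.], [4., 5., 6.], [7., 8., 9.], [10., 11., 12.]]
y = dropout(x, 0.3, seed=0)
for row in y:
    print(row)
```

Scaling the survivors keeps the expected value of each element equal to its input, which is why the same array can be used at inference time with dropout switched off.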
- dropout1d(prob, /, *, training=True, data_format='NWC', out=None)
ivy.Array instance method variant of ivy.dropout1d. This method simply wraps the function, and so the docstring for ivy.dropout1d also applies to this method with minimal changes.
- Parameters:
self (Array) – The input array x to perform dropout on.
prob (float) – The probability of zeroing out each array element, float between 0 and 1.
training (bool, default: True) – Turn on dropout if training, turn off otherwise. Default is True.
data_format (str, default: 'NWC') – “NWC” or “NCW”. Default is "NWC".
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – Result array of the output after dropout is performed.
Examples
>>> x = ivy.array([1, 1, 1]).reshape([1, 1, 3])
>>> y = x.dropout1d(0.5)
>>> print(y)
ivy.array([[[2., 0., 2.]]])
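In the example above the input has shape (1, 1, 3) in “NWC” layout, so the three values are three channels and the zero is a dropped channel. The sketch below assumes channel-wise semantics (a whole channel is either kept and scaled by 1/(1-prob) or zeroed along its full width), as in other frameworks' 1-D dropout layers. That reading is an assumption consistent with the example, not a statement of ivy's implementation, and the helper name `dropout1d_nwc` is hypothetical.

```python
import random


def dropout1d_nwc(x, prob, seed=None):
    """Channel-wise dropout for [batch_size, w, channels] nested lists (assumes prob < 1)."""
    rng = random.Random(seed)
    factor = 1.0 / (1.0 - prob)
    out = []
    for sample in x:
        channels = len(sample[0])
        # One keep/drop decision per channel, shared by every position along w.
        keep = [rng.random() >= prob for _ in range(channels)]
        out.append([
            [v * factor if keep[c] else 0.0 for c, v in enumerate(step)]
            for step in sample
        ])
    return out


x = [[[1., 1., 1.], [1., 1., 1.]]]   # shape (1, 2, 3): width 2, 3 channels
y = dropout1d_nwc(x, 0.5, seed=0)
print(y)
```

Whatever the random draw, both width positions of a given channel agree: each channel column is either all 0.0 or all 2.0.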
- dropout2d(prob, /, *, training=True, data_format='NHWC', out=None)
ivy.Array instance method variant of ivy.dropout2d. This method simply wraps the function, and so the docstring for ivy.dropout2d also applies to this method with minimal changes.
- Parameters:
self (Array) – The input array x to perform dropout on.
prob (float) – The probability of zeroing out each array element, float between 0 and 1.
training (bool, default: True) – Turn on dropout if training, turn off otherwise. Default is True.
data_format (str, default: 'NHWC') – “NHWC” or “NCHW”. Default is "NHWC".
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – Result array of the output after dropout is performed.
Examples
>>> x = ivy.array([[1, 1, 1], [2, 2, 2]])
>>> y = x.dropout2d(0.5)
>>> print(y)
ivy.array([[0., 0., 2.],
           [4., 4., 4.]])
- dropout3d(prob, /, *, training=True, data_format='NDHWC', out=None)
ivy.Array instance method variant of ivy.dropout3d. This method simply wraps the function, and so the docstring for ivy.dropout3d also applies to this method with minimal changes.
- Parameters:
self (Array) – The input array x to perform dropout on.
prob (float) – The probability of zeroing out each array element, float between 0 and 1.
training (bool, default: True) – Turn on dropout if training, turn off otherwise. Default is True.
data_format (str, default: 'NDHWC') – “NDHWC” or “NCDHW”. Default is "NDHWC".
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – Result array of the output after dropout is performed.
- linear(weight, /, *, bias=None, out=None)
ivy.Array instance method variant of ivy.linear. This method simply wraps the function, and so the docstring for ivy.linear also applies to this method with minimal changes.
- Parameters:
self (Array) – The input array to compute linear transformation on. [outer_batch_shape,inner_batch_shape,in_features]
weight (Union[Array, NativeArray]) – The weight matrix. [outer_batch_shape,out_features,in_features]
bias (Optional[Union[Array, NativeArray]], default: None) – The bias vector, default is None. [outer_batch_shape,out_features]
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – Result array of the linear transformation. [outer_batch_shape,inner_batch_shape,out_features]
Examples
>>> x = ivy.array([[1.1, 2.2, 3.3],
...                [4.4, 5.5, 6.6],
...                [7.7, 8.8, 9.9]])
>>> w = ivy.array([[1., 2., 3.],
...                [4., 5., 6.],
...                [7., 8., 9.]])
>>> b = ivy.array([1., 0., -1.])
>>> y = x.linear(w, bias=b)
>>> print(y)
ivy.array([[ 16.4,  35.2,  54. ],
           [ 36.2,  84.7, 133. ],
           [ 56. , 134. , 212. ]])
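Because the weight matrix is laid out as [out_features,in_features], the transformation is x times the transpose of the weight, plus the bias. The plain-Python sketch below (function name illustrative, 2-D inputs only, no batch dimensions) recomputes the example to one decimal place.

```python
def linear(x, weight, bias=None):
    """Compute x @ weight^T + bias for 2-D nested lists (illustrative sketch)."""
    out = []
    for row in x:
        # Each output feature is the dot product of the input row with one weight row.
        y = [sum(a * b for a, b in zip(row, w_row)) for w_row in weight]
        if bias is not None:
            y = [v + b for v, b in zip(y, bias)]
        out.append(y)
    return out


x = [[1.1, 2.2, 3.3], [4.4, 5.5, 6.6], [7.7, 8.8, 9.9]]
w = [[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]]
b = [1., 0., -1.]
y = linear(x, w, bias=b)
print([[round(v, 1) for v in row] for row in y])
# [[16.4, 35.2, 54.0], [36.2, 84.7, 133.2], [56.0, 134.2, 212.4]]
```

The larger entries in the printed ivy output (133., 134., 212.) appear to be these same values shown with fewer significant digits.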
- lstm_update(init_h, init_c, kernel, recurrent_kernel, /, *, bias=None, recurrent_bias=None)
ivy.Array instance method variant of ivy.lstm_update. This method simply wraps the function, and so the docstring for ivy.lstm_update also applies to this method with minimal changes.
- Parameters:
self (Array) – input tensor of the LSTM layer [batch_shape, t, in].
init_h (Union[Array, NativeArray]) – initial state tensor for the cell output [batch_shape, out].
init_c (Union[Array, NativeArray]) – initial state tensor for the cell hidden state [batch_shape, out].
kernel (Union[Array, NativeArray]) – weights for cell kernel [in, 4 x out].
recurrent_kernel (Union[Array, NativeArray]) – weights for cell recurrent kernel [out, 4 x out].
bias (Optional[Union[Array, NativeArray]], default: None) – bias for cell kernel [4 x out]. (Default value = None)
recurrent_bias (Optional[Union[Array, NativeArray]], default: None) – bias for cell recurrent kernel [4 x out]. (Default value = None)
- Return type:
Tuple[Array, Array]
- Returns:
ret – hidden state for all timesteps [batch_shape,t,out] and cell state for last timestep [batch_shape,out]
Examples
>>> x = ivy.randint(0, 20, shape=(6, 20, 3))
>>> h_i = ivy.random_normal(shape=(6, 5))
>>> c_i = ivy.random_normal(shape=(6, 5))
>>> kernel = ivy.random_normal(shape=(3, 4 * 5))
>>> rc = ivy.random_normal(shape=(5, 4 * 5))
>>> result = x.lstm_update(h_i, c_i, kernel, rc)
>>> result[0].shape
(6, 20, 5)
>>> result[1].shape
(6, 5)
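The shapes in this example follow from the standard LSTM recurrence: the [in, 4 x out] kernel and the [out, 4 x out] recurrent kernel each produce four gate pre-activations per output unit. The sketch below runs that recurrence for a single sample in plain Python, without biases. The function name is illustrative, and the gate ordering (input, forget, candidate, output) along the 4 x out axis is an assumption, since this page does not specify it.

```python
import math


def _sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def _matvec(vec, mat):
    """Multiply a length-n vector by an [n, m] matrix."""
    return [sum(vec[i] * mat[i][j] for i in range(len(vec))) for j in range(len(mat[0]))]


def lstm_update(x, init_h, init_c, kernel, recurrent_kernel):
    """Run an LSTM over x ([t, in]) for one sample, without biases.

    Returns (hidden states [t, out], final cell state [out]).
    Assumed gate order along the 4 x out axis: input, forget, candidate, output.
    """
    h, c = list(init_h), list(init_c)
    out = len(h)
    hidden_states = []
    for x_t in x:
        gates = [a + b for a, b in zip(_matvec(x_t, kernel), _matvec(h, recurrent_kernel))]
        i = [_sigmoid(v) for v in gates[0:out]]
        f = [_sigmoid(v) for v in gates[out:2 * out]]
        g = [math.tanh(v) for v in gates[2 * out:3 * out]]
        o = [_sigmoid(v) for v in gates[3 * out:4 * out]]
        c = [f_j * c_j + i_j * g_j for f_j, c_j, i_j, g_j in zip(f, c, i, g)]
        h = [o_j * math.tanh(c_j) for o_j, c_j in zip(o, c)]
        hidden_states.append(h)
    return hidden_states, c


t, n_in, n_out = 4, 3, 2
x = [[1.0] * n_in for _ in range(t)]
kernel = [[0.0] * (4 * n_out) for _ in range(n_in)]
recurrent_kernel = [[0.0] * (4 * n_out) for _ in range(n_out)]
hs, c_last = lstm_update(x, [0.0] * n_out, [1.0] * n_out, kernel, recurrent_kernel)
print(len(hs), len(hs[0]), c_last)
# 4 2 [0.0625, 0.0625]
```

With all-zero weights every sigmoid gate equals 0.5 and the candidate is 0, so the cell state simply halves at each of the four timesteps, from 1.0 down to 0.0625.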
- multi_head_attention(*, key=None, value=None, num_heads=8, scale=None, attention_mask=None, in_proj_weights=None, q_proj_weights=None, k_proj_weights=None, v_proj_weights=None, out_proj_weights=None, in_proj_bias=None, out_proj_bias=None, is_causal=False, key_padding_mask=None, bias_k=None, bias_v=None, static_k=None, static_v=None, add_zero_attn=False, return_attention_weights=False, average_attention_weights=True, dropout=0.0, training=False, out=None)
- Return type:
Array
- scaled_dot_product_attention(key, value, /, *, scale=None, mask=None, dropout_p=0.0, is_causal=False, training=False, out=None)
ivy.Array instance method variant of ivy.scaled_dot_product_attention. This method simply wraps the function, and so the docstring for ivy.scaled_dot_product_attention also applies to this method with minimal changes.
- Parameters:
self (Array) – The queries input array. The shape of queries input array should be in [batch_shape,num_queries,feat_dim]. The queries input array should have the same size as keys and values.
key (Union[Array, NativeArray]) – The keys input array. The shape of keys input array should be in [batch_shape,num_keys,feat_dim]. The keys input array should have the same size as queries and values.
value (Union[Array, NativeArray]) – The values input array. The shape of values input should be in [batch_shape,num_keys,feat_dim]. The values input array should have the same size as queries and keys.
scale (Optional[float], default: None) – The scale float value. The scale float value is used to scale the query-key pairs before softmax.
mask (Optional[Union[Array, NativeArray]], default: None) – The mask input array. The mask to apply to the query-key values. Default is None. The shape of mask input should be in [batch_shape,num_queries,num_keys].
dropout_p (Optional[float], default: 0.0) – Specifies the dropout probability. If greater than 0.0, dropout is applied.
is_causal (Optional[bool], default: False) – If True, assumes causal attention masking and errors if both mask and is_causal are set.
training (Optional[bool], default: False) – If True, dropout is used, otherwise dropout is not activated.
out (Optional[Array], default: None) – Optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type:
Array
- Returns:
ret – The output following application of scaled dot-product attention. The output array is the weighted sum produced by the attention score and value. The shape of output array is [batch_shape,num_queries,feat_dim] .
Examples
With ivy.Array input:

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, scale=1, dropout_p=0.1,
...                                           is_causal=True, training=True)
>>> print(result)
ivy.array([[[0.40000001, 1.29999995],
            [2.19994521, 3.09994531],
            [4.30000019, 5.30000019]]])

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.],[4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3],[4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1],[4.3, 5.3]]])
>>> mask = ivy.array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0],[0.0, 0.0, 0.0]]])
>>> result = ivy.scaled_dot_product_attention(q, k, v, scale=1, mask=mask)
>>> print(result)
ivy.array([[[0.40000001, 1.29999995],
            [2.19994521, 3.09994531],
            [4.30000019, 5.30000019]]])

>>> q = ivy.array([[[0.2, 1.], [2.2, 3.], [4.4, 5.6]]])
>>> k = ivy.array([[[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]])
>>> v = ivy.array([[[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]])
>>> out = ivy.zeros(shape=(1, 3, 2))
>>> ivy.scaled_dot_product_attention(q, k, v, scale=1, dropout_p=0.1,
...                                  is_causal=True, training=True, out=out)
>>> print(out)
ivy.array([[[0.40000001, 1.29999995],
            [2.19994521, 3.09994531],
            [4.30000019, 5.30000019]]])
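The weighted sum described under Returns can be written out directly: scores are scaled query-key dot products, a softmax over the keys turns them into weights, and the output is the weighted average of the values. The sketch below does this in plain Python for a single batch element, without dropout or an explicit mask. The function name is illustrative, the default scale of 1/sqrt(feat_dim) is an assumption (the usual convention), and the code is not intended to reproduce the exact numbers printed above.

```python
import math


def scaled_dot_product_attention(q, k, v, scale=None, is_causal=False):
    """Attention for one batch element: q [num_queries, d], k [num_keys, d], v [num_keys, d_v]."""
    if scale is None:
        scale = 1.0 / math.sqrt(len(q[0]))  # assumed default, the usual convention
    out = []
    for qi, query in enumerate(q):
        scores = [scale * sum(a * b for a, b in zip(query, key)) for key in k]
        if is_causal:
            # Query i may only attend to keys 0..i.
            scores = [s if ki <= qi else -math.inf for ki, s in enumerate(scores)]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * row[j] for w, row in zip(weights, v)) / total
            for j in range(len(v[0]))
        ])
    return out


q = [[0.2, 1.], [2.2, 3.], [4.4, 5.6]]
k = [[0.6, 1.5], [2.4, 3.3], [4.2, 5.1]]
v = [[0.4, 1.3], [2.2, 3.1], [4.3, 5.3]]
result = scaled_dot_product_attention(q, k, v, scale=1, is_causal=True)
print([round(val, 4) for val in result[0]])
# [0.4, 1.3]
```

Under causal masking the first query can only see the first key, so its output is exactly the first value row; later rows blend more of the values.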
This should have given you an overview of the layers submodule. If you have any questions, please feel free to reach out on our Discord!