Gradients#
- class ivy.data_classes.array.gradients._ArrayWithGradients[source]#
Bases: ABC
- adam_step(mw, vw, step, /, *, beta1=0.9, beta2=0.999, epsilon=1e-07, out=None)[source]#
ivy.Array instance method variant of ivy.adam_step. This method simply wraps the function, and so the docstring for ivy.adam_step also applies to this method with minimal changes.
- Parameters:
  - self (Array) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - mw (Union[Array, NativeArray]) – running average of the gradients.
  - vw (Union[Array, NativeArray]) – running average of second moments of the gradients.
  - step (Union[int, float]) – training step.
  - beta1 (float, default: 0.9) – gradient forgetting factor.
  - beta2 (float, default: 0.999) – second moment of gradient forgetting factor.
  - epsilon (float, default: 1e-07) – divisor during the adam update, preventing division by zero.
  - out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type: Array
- Returns: ret – The adam step delta.
Examples
With ivy.Array inputs:

>>> dcdw = ivy.array([1, 2, 3])
>>> mw = ivy.ones(3)
>>> vw = ivy.ones(1)
>>> step = ivy.array(3)
>>> adam_step_delta = dcdw.adam_step(mw, vw, step)
>>> print(adam_step_delta)
(ivy.array([0.2020105,0.22187898,0.24144873]), ivy.array([1.,1.10000002,1.20000005]), ivy.array([1.,1.00300002,1.00800002]))
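For readers who want to connect the printed numbers to the underlying arithmetic, the sketch below recomputes the first entry of the example by hand for a single scalar weight. It assumes the usual bias-corrected Adam moment formulas; it is an illustration of the math, not the library's implementation.

import math

# Hedged sketch of the Adam step arithmetic for the first weight of the
# example above (dcdw = 1, mw = 1, vw = 1, step = 3), assuming the standard
# bias-corrected first and second moments.
beta1, beta2, eps = 0.9, 0.999, 1e-7
dcdw, mw, vw, step = 1.0, 1.0, 1.0, 3

mw_new = beta1 * mw + (1 - beta1) * dcdw         # running average of gradients -> 1.0
vw_new = beta2 * vw + (1 - beta2) * dcdw ** 2    # running average of squared gradients -> 1.0
mw_hat = mw_new / (1 - beta1 ** step)            # bias-corrected first moment
vw_hat = vw_new / (1 - beta2 ** step)            # bias-corrected second moment
delta = mw_hat / (math.sqrt(vw_hat) + eps)       # ~0.2020, the first delta entry above

print(delta, mw_new, vw_new)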
- adam_update(dcdw, lr, mw_tm1, vw_tm1, step, /, *, beta1=0.9, beta2=0.999, epsilon=1e-07, stop_gradients=True, out=None)[source]#
ivy.Array instance method variant of ivy.adam_update. This method simply wraps the function, and so the docstring for ivy.adam_update also applies to this method with minimal changes.
- Parameters:
  - self (Array) – Weights of the function to be updated.
  - dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.
  - mw_tm1 (Union[Array, NativeArray]) – running average of the gradients, from the previous time-step.
  - vw_tm1 (Union[Array, NativeArray]) – running average of second moments of the gradients, from the previous time-step.
  - step (int) – training step.
  - beta1 (float, default: 0.9) – gradient forgetting factor.
  - beta2 (float, default: 0.999) – second moment of gradient forgetting factor.
  - epsilon (float, default: 1e-07) – divisor during the adam update, preventing division by zero.
  - stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step.
  - out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type: Array
- Returns: ret – The new function weights ws_new, and also new mw and vw, following the adam updates.
Examples
With ivy.Array inputs:

>>> w = ivy.array([1., 2, 3.])
>>> dcdw = ivy.array([0.2,0.1,0.3])
>>> lr = ivy.array(0.1)
>>> vw_tm1 = ivy.zeros(1)
>>> mw_tm1 = ivy.zeros(3)
>>> step = 2
>>> updated_weights = w.adam_update(dcdw, lr, mw_tm1, vw_tm1, step)
>>> print(updated_weights)
(ivy.array([0.92558753, 1.92558873, 2.92558718]), ivy.array([0.02, 0.01, 0.03]), ivy.array([4.00000063e-05, 1.00000016e-05, 9.00000086e-05]))
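To see how the first updated weight in the example arises, the hedged, scalar-only sketch below combines an Adam step with a plain descent update. It assumes the same moment formulas shown for adam_step above and is only illustrative, not the library source.

# Hedged sketch of adam_update for the first weight of the example above
# (w = 1.0, dcdw = 0.2, lr = 0.1, zero previous moments, step = 2).
beta1, beta2, eps = 0.9, 0.999, 1e-7
w, dcdw, lr, step = 1.0, 0.2, 0.1, 2
mw_tm1, vw_tm1 = 0.0, 0.0

mw = beta1 * mw_tm1 + (1 - beta1) * dcdw           # -> 0.02
vw = beta2 * vw_tm1 + (1 - beta2) * dcdw ** 2      # -> 4e-05
mw_hat = mw / (1 - beta1 ** step)                  # bias correction
vw_hat = vw / (1 - beta2 ** step)
w_new = w - lr * mw_hat / (vw_hat ** 0.5 + eps)    # ~0.9256, matching the example

print(w_new, mw, vw)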
- gradient_descent_update(dcdw, lr, /, *, stop_gradients=True, out=None)[source]#
ivy.Array instance method variant of ivy.gradient_descent_update. This method simply wraps the function, and so the docstring for ivy.gradient_descent_update also applies to this method with minimal changes.
- Parameters:
  - self (Array) – Weights of the function to be updated.
  - dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.
  - stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step.
  - out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type: Array
- Returns: ret – The new weights, following the gradient descent updates.
Examples
With ivy.Array inputs:

>>> w = ivy.array([[1., 2, 3],
...                [4, 6, 1],
...                [1, 0, 7]])
>>> dcdw = ivy.array([[0.5, 0.2, 0.1],
...                   [0.3, 0.6, 0.4],
...                   [0.4, 0.7, 0.2]])
>>> lr = ivy.array(0.1)
>>> new_weights = w.gradient_descent_update(dcdw, lr, stop_gradients=True)
>>> print(new_weights)
ivy.array([[ 0.95,  1.98,  2.99],
           [ 3.97,  5.94,  0.96],
           [ 0.96, -0.07,  6.98]])
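The rule behind this example is the plain descent step w_new = w - lr * dcdw. The short sketch below reproduces the first row of the printed result and is purely illustrative.

# Hedged sketch of a vanilla gradient descent step for the first row of the
# example above: each weight moves against its gradient, scaled by lr.
w = [1.0, 2.0, 3.0]
dcdw = [0.5, 0.2, 0.1]
lr = 0.1
w_new = [wi - lr * gi for wi, gi in zip(w, dcdw)]   # [0.95, 1.98, 2.99]
print(w_new)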
- lamb_update(dcdw, lr, mw_tm1, vw_tm1, step, /, *, beta1=0.9, beta2=0.999, epsilon=1e-07, max_trust_ratio=10, decay_lambda=0, stop_gradients=True, out=None)[source]#
ivy.Array instance method variant of ivy.lamb_update. This method simply wraps the function, and so the docstring for ivy.lamb_update also applies to this method with minimal changes.
- Parameters:
  - self (Array) – Weights of the function to be updated.
  - dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.
  - mw_tm1 (Union[Array, NativeArray]) – running average of the gradients, from the previous time-step.
  - vw_tm1 (Union[Array, NativeArray]) – running average of second moments of the gradients, from the previous time-step.
  - step (int) – training step.
  - beta1 (float, default: 0.9) – gradient forgetting factor.
  - beta2 (float, default: 0.999) – second moment of gradient forgetting factor.
  - epsilon (float, default: 1e-07) – divisor during the adam update, preventing division by zero.
  - max_trust_ratio (Union[int, float], default: 10) – The maximum value for the trust ratio.
  - decay_lambda (float, default: 0) – The factor used for weight decay.
  - stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step.
  - out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type: Array
- Returns: ret – The new function weights ws_new, following the LAMB updates.
Examples
With ivy.Array inputs:

>>> w = ivy.array([1., 2, 3])
>>> dcdw = ivy.array([0.5,0.2,0.1])
>>> lr = ivy.array(0.1)
>>> vw_tm1 = ivy.zeros(1)
>>> mw_tm1 = ivy.zeros(3)
>>> step = ivy.array(1)
>>> new_weights = w.lamb_update(dcdw, lr, mw_tm1, vw_tm1, step)
>>> print(new_weights)
(ivy.array([0.784, 1.78 , 2.78 ]), ivy.array([0.05, 0.02, 0.01]), ivy.array([2.5e-04, 4.0e-05, 1.0e-05]))
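LAMB rescales the Adam step by a layer-wise trust ratio. As a rough check on the example above (where the bias-corrected Adam delta works out to all ones), the hedged sketch below applies the ratio ||w|| / ||delta||, clipped to max_trust_ratio; the exact norm and weight-decay handling inside the library may differ.

# Hedged sketch of the LAMB trust-ratio scaling for the example above,
# with decay_lambda = 0 and the Adam delta taken as [1, 1, 1].
w = [1.0, 2.0, 3.0]
delta = [1.0, 1.0, 1.0]
lr, max_trust_ratio = 0.1, 10

w_norm = sum(x * x for x in w) ** 0.5            # ~3.742
d_norm = sum(d * d for d in delta) ** 0.5        # ~1.732
ratio = min(w_norm / d_norm, max_trust_ratio)    # ~2.16

w_new = [wi - ratio * lr * di for wi, di in zip(w, delta)]   # ~[0.784, 1.784, 2.784]
print(w_new)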
- lars_update(dcdw, lr, /, *, decay_lambda=0, stop_gradients=True, out=None)[source]#
ivy.Array instance method variant of ivy.lars_update. This method simply wraps the function, and so the docstring for ivy.lars_update also applies to this method with minimal changes.
- Parameters:
  - self (Array) – Weights of the function to be updated.
  - dcdw (Union[Array, NativeArray]) – Derivatives of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray]) – Learning rate, the rate at which the weights should be updated relative to the gradient.
  - decay_lambda (float, default: 0) – The factor used for weight decay.
  - stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step.
  - out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type: Array
- Returns: ret – The new function weights ws_new, following the LARS updates.
Examples
With ivy.Array inputs:

>>> w = ivy.array([[3., 1, 5],
...                [7, 2, 9]])
>>> dcdw = ivy.array([[0.3, 0.1, 0.2],
...                   [0.1, 0.2, 0.4]])
>>> lr = ivy.array(0.1)
>>> new_weights = w.lars_update(dcdw, lr, stop_gradients=True)
>>> print(new_weights)
ivy.array([[2.34077978, 0.78025991, 4.56051969],
           [6.78026009, 1.56051981, 8.12103939]])
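LARS scales the learning rate by the ratio of the weight norm to the gradient norm before taking a descent step. The hedged sketch below reproduces the first entries of the example with decay_lambda = 0; it illustrates the scaling and is not the library source.

# Hedged sketch of the LARS local learning rate for the example above,
# using the flattened weights and gradients and decay_lambda = 0.
w = [3.0, 1.0, 5.0, 7.0, 2.0, 9.0]
dcdw = [0.3, 0.1, 0.2, 0.1, 0.2, 0.4]
lr = 0.1

w_norm = sum(x * x for x in w) ** 0.5        # 13.0
g_norm = sum(g * g for g in dcdw) ** 0.5     # ~0.5916
local_lr = lr * w_norm / g_norm              # ~2.197

w_new = [wi - local_lr * gi for wi, gi in zip(w, dcdw)]   # first entry ~2.3408
print(w_new)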
- optimizer_update(effective_grad, lr, /, *, stop_gradients=True, out=None)[source]#
ivy.Array instance method variant of ivy.optimizer_update. This method simply wraps the function, and so the docstring for ivy.optimizer_update also applies to this method with minimal changes.
- Parameters:
  - self (Array) – Weights of the function to be updated.
  - effective_grad (Union[Array, NativeArray]) – Effective gradients of the cost c with respect to the weights ws, [dc/dw for w in ws].
  - lr (Union[float, Array, NativeArray]) – Learning rate(s), the rate(s) at which the weights should be updated relative to the gradient.
  - stop_gradients (bool, default: True) – Whether to stop the gradients of the variables after each gradient step.
  - out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type: Array
- Returns: ret – The new function weights ws_new, following the optimizer updates.
Examples
>>> w = ivy.array([1., 2., 3.])
>>> effective_grad = ivy.zeros(3)
>>> lr = 3e-4
>>> ws_new = w.optimizer_update(effective_grad, lr)
>>> print(ws_new)
ivy.array([1., 2., 3.])
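optimizer_update applies an effective gradient that some optimizer has already computed, so with a zero gradient the weights come back unchanged, as in the example. A minimal, purely illustrative sketch of that arithmetic:

# Hedged sketch: w_new = w - lr * effective_grad, shown with the zero
# effective gradient from the example above, which leaves w unchanged.
w = [1.0, 2.0, 3.0]
effective_grad = [0.0, 0.0, 0.0]
lr = 3e-4
w_new = [wi - lr * gi for wi, gi in zip(w, effective_grad)]   # [1.0, 2.0, 3.0]
print(w_new)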
- stop_gradient(*, preserve_type=True, out=None)[source]#
ivy.Array instance method variant of ivy.stop_gradient. This method simply wraps the function, and so the docstring for ivy.stop_gradient also applies to this method with minimal changes.
- Parameters:
  - self (Array) – Array for which to stop the gradient.
  - preserve_type (bool, default: True) – Whether to preserve gradient computation on ivy.Array instances.
  - out (Optional[Array], default: None) – optional output array, for writing the result to. It must have a shape that the inputs broadcast to.
- Return type: Array
- Returns: ret – The same array x, but with no gradient information.
Examples
>>> x = ivy.array([1., 2., 3.])
>>> y = x.stop_gradient(preserve_type=True)
>>> print(y)
ivy.array([1., 2., 3.])
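In practice stop_gradient is useful when an intermediate result should be treated as a constant during differentiation: the values pass through unchanged, only the gradient information is dropped. The snippet below is a hedged usage sketch; the target array and the subtraction are illustrative additions, not taken from the original example.

import ivy

# Hedged usage sketch (not from the original docs): stop_gradient leaves the
# values untouched and only detaches gradient tracking, so "frozen" behaves
# like a constant in any downstream differentiation.
x = ivy.array([1., 2., 3.])
frozen = x.stop_gradient()
target = ivy.array([0., 0., 0.])   # illustrative target, not in the original example
residual = frozen - target
print(residual)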
This should hopefully have given you an overview of the gradients submodule. If you have any questions, please feel free to reach out on our Discord!