A gradient measures how much a function's output changes in response to a change in its inputs. It generalizes the derivative (the slope) to functions that have more than one input variable.

  • The combination of the derivatives with respect to all the parameters is the loss gradient.
  • A gradient measures how much the output of a function changes if you change the inputs a little bit.
  • Getting backpropagation to behave well requires gradients that are smooth, that is, the slope doesn’t change very quickly as you make small steps in any direction.
  • It also helps if the gradient is well conditioned: it’s not radically larger in one direction than another.
  • In machine learning, a gradient is a derivative of a function that has more than one input variable. Known as the slope of a function in mathematical terms, the gradient measures how much the error changes in response to a change in each weight.
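To make the point about conditioning concrete, here is a minimal sketch. The two functions, the starting point, and the step sizes are made up for illustration, and `descend` is a hypothetical helper. It compares a function with the same steepness in both directions against one that is 100 times steeper along one axis.

```python
def grad_round(w):
    # Gradient of f(w1, w2) = w1**2 + w2**2 (same curvature in both directions).
    return [2.0 * w[0], 2.0 * w[1]]


def grad_stretched(w):
    # Gradient of f(w1, w2) = w1**2 + 100 * w2**2 (100x steeper along w2).
    return [2.0 * w[0], 200.0 * w[1]]


def descend(grad, w, learning_rate, steps):
    # Hypothetical helper: repeatedly step against the gradient.
    w = list(w)
    for _ in range(steps):
        g = grad(w)
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w


start = [1.0, 1.0]

# Well-conditioned case: a fairly large step size converges quickly.
round_result = descend(grad_round, start, learning_rate=0.4, steps=20)

# Same step size on the stretched function overshoots along w2 and diverges.
stretched_large_lr = descend(grad_stretched, start, learning_rate=0.4, steps=20)

# A step size small enough for w2 leaves progress along w1 slow.
stretched_small_lr = descend(grad_stretched, start, learning_rate=0.009, steps=20)

print(round_result)
print(stretched_large_lr)
print(stretched_small_lr)
```

In this toy setup the steep direction limits the usable step size, and the flat direction then converges slowly. That is one reason a badly conditioned gradient is hard to optimize.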

Gradient Descent

Gradient descent is an optimization technique that can be used, for example, to tune the coefficients and bias of a linear equation. It relies on the gradient, which describes how the output changes when the input changes.

Gradient descent iteratively adjusts parameters to minimize a given function, moving toward a local minimum.
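As a rough illustration of "toward a local minimum", the sketch below runs the basic update rule on a one-variable function with two valleys. The function, the learning rate, and the step count are arbitrary choices for this example and do not come from any particular library.

```python
def f(x):
    # Toy function with two minima (chosen only for illustration).
    return x ** 4 - 3.0 * x ** 2 + x


def df(x):
    # Derivative of f, worked out by hand.
    return 4.0 * x ** 3 - 6.0 * x + 1.0


def gradient_descent(df, x0, learning_rate=0.01, steps=1000):
    # Repeatedly move a small step in the direction opposite to the slope.
    x = x0
    for _ in range(steps):
        x -= learning_rate * df(x)
    return x


x_right = gradient_descent(df, x0=2.0)   # starts in the right-hand valley
x_left = gradient_descent(df, x0=-2.0)   # starts in the left-hand valley

print(x_right, f(x_right))
print(x_left, f(x_left))
```

Both runs stop where the slope is essentially zero, but at different points: the starting value decides which minimum is reached, and only one of them is the lowest.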

In linear regression, it finds the weights and biases; in deep learning, it is paired with backpropagation, which computes the gradients used for each update.

The algorithm's objective is to identify model parameters, such as weights and biases, that reduce the model's error on the training data.
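A minimal sketch of that objective for a single-feature linear model, assuming mean squared error as the cost. The function name `fit_line`, the data, and the hyperparameters are illustrative assumptions.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    # Fit y = w * x + b by gradient descent on the mean squared error.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every training example.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy training data generated from y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(w, b)
```

On this noise-free data the learned weight and bias should approach 2 and 1; with real data they would settle wherever the training error is smallest.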

For our model to fit the data as well as possible, we would have to find the global minimum of the cost function. However, finding that global minimum directly over all the parameters is usually very costly and time-consuming. That is why we use iterative optimization techniques like gradient descent.

The gradient groups all the partial derivatives: it is the vector containing every partial derivative of the function. In essence, it generalizes the derivative to scalar functions of several variables.
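One way to see the gradient as a vector of partial derivatives is to approximate each partial derivative numerically, nudging one input at a time. The sketch below uses central differences; `numerical_gradient` and `surface` are hypothetical names chosen for this example.

```python
def numerical_gradient(func, point, h=1e-6):
    # Approximate each partial derivative by changing one input at a time.
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((func(forward) - func(backward)) / (2.0 * h))
    return grad


def surface(p):
    # Example scalar function of two variables: x**2 + 3*x*y.
    x, y = p
    return x ** 2 + 3.0 * x * y


g = numerical_gradient(surface, [1.0, 2.0])
print(g)
```

For this function the exact partial derivatives are 2x + 3y and 3x, so at the point (1, 2) the result should be close to [8, 3].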

Types of Gradient Descent:

  • Batch gradient descent
  • Stochastic gradient descent (SGD)
  • Mini-batch gradient descent
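
The three variants differ mainly in how many training examples are used to compute each update: all of them (batch), one at a time (stochastic), or a small subset (mini-batch). The following is a sketch under simple assumptions: a one-feature linear model, mean squared error, and noise-free toy data. The names `gradient_step` and `train` are hypothetical.

```python
import random


def gradient_step(w, b, batch, learning_rate):
    # One update of y = w * x + b using only the examples in `batch`.
    n = len(batch)
    grad_w = (2.0 / n) * sum(((w * x + b) - y) * x for x, y in batch)
    grad_b = (2.0 / n) * sum((w * x + b) - y for x, y in batch)
    return w - learning_rate * grad_w, b - learning_rate * grad_b


def train(data, batch_size, learning_rate=0.1, epochs=200, seed=0):
    # batch_size == len(data): batch gradient descent
    # batch_size == 1:         stochastic gradient descent (SGD)
    # anything in between:     mini-batch gradient descent
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # visit the examples in a new order each epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            w, b = gradient_step(w, b, batch, learning_rate)
    return w, b


# Toy data generated from y = 2x + 1 (no noise).
inputs = (-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75)
data = [(x, 2.0 * x + 1.0) for x in inputs]

batch_result = train(data, batch_size=len(data))
sgd_result = train(data, batch_size=1)
mini_result = train(data, batch_size=4)

print(batch_result, sgd_result, mini_result)
```

Because this toy data has no noise, all three runs should end near w = 2 and b = 1. On noisy data the stochastic and mini-batch variants typically fluctuate around the minimum unless the learning rate is reduced over time.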