Optimization Algorithms solve an Optimization Problem for a given function: they find the input parameters (arguments) that yield the estimated minimum or maximum output of the Objective Function.

Optimization Algorithms improve the learning efficiency of Machine Learning Models and help them converge to optimal solutions. This is typically done by minimizing a cost function through iterative updates of the model's coefficients (for Regression) or weights (for Artificial Neural Networks (ANN)).
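The idea of minimizing a cost function by iterative updates can be sketched in a few lines. This is a minimal, hypothetical example: the objective f(x) = (x - 3)^2 and all names are illustrative, not from any particular library.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient to minimize an objective."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # move opposite to the slope
    return x

# Hypothetical objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))  # converges toward the minimum at x = 3
```

The same update rule generalizes to vectors of model coefficients or network weights, with the gradient supplied by backpropagation.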

Algorithms:

- First-order Optimization Algorithms: methods that use first-order derivatives (gradients) of the function to perform optimization.
  - Gradient Descent Algorithms
  - RProp
  - RMSprop
  - Adam
  - Adagrad
  - Adadelta
  - Adamax
  - Momentum
  - Nesterov accelerated gradient
  - Nadam
  - AMSGrad
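As a concrete example of a first-order method from the list above, here is a sketch of the Adam update, assuming its standard form (exponential moving averages of the gradient and squared gradient, with bias correction). The objective and hyperparameter values are hypothetical choices for illustration.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update on a scalar parameter theta at step t (t starts at 1)."""
    m = b1 * m + (1 - b1) * grad          # first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - b1 ** t)             # bias correction
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize the hypothetical objective f(x) = (x - 3)^2 with Adam.
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    g = 2 * (theta - 3)                   # gradient of f
    theta, m, v = adam_step(theta, g, m, v, t, lr=0.01)
print(round(theta, 2))                    # approaches the minimum at 3
```

The other first-order methods listed differ mainly in how they accumulate past gradients (Momentum, Nesterov) or scale the step per parameter (Adagrad, RMSprop, Adadelta).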

- Second-order Optimization Algorithms: methods that use (an estimate of) the Hessian Matrix, the matrix of second derivatives of the loss function with respect to its parameters.
  - Newton method
  - Conjugate gradient
  - Quasi-Newton method
  - Levenberg-Marquardt algorithm
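In one dimension the Hessian reduces to the second derivative, which makes Newton's method easy to sketch: scale the gradient step by the inverse curvature. This is a minimal illustration with a hypothetical quadratic objective, for which Newton's method reaches the minimum in a single step.

```python
def newton_minimize(grad, hess, x0, steps=20):
    """1-D Newton's method: x <- x - f'(x) / f''(x)."""
    x = x0
    for _ in range(steps):
        x -= grad(x) / hess(x)  # curvature-scaled step
    return x

# Hypothetical objective f(x) = (x - 3)^2: gradient 2*(x - 3), constant Hessian 2.
x_min = newton_minimize(lambda x: 2 * (x - 3), lambda x: 2.0, x0=0.0)
print(x_min)  # 3.0 — exact after a single step for a quadratic
```

Quasi-Newton methods (e.g. BFGS) follow the same scheme but build up an approximation of the Hessian from successive gradients instead of computing it directly.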

Note:

- For a function to be optimizable by these methods, it must be differentiable (either univariate or multivariate).
- **Gradient-based optimization** is the core of most optimization methods.
- Among Gradient Descent Algorithms, Batch Gradient Descent computes the most stable gradient per update, while Stochastic Gradient Descent (SGD) is cheaper per update and more robust. Mini-batch Gradient Descent offers a good balance between the two and is therefore the most commonly used.
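The batch/stochastic/mini-batch trade-off comes down to how many samples contribute to each gradient estimate. Below is a minimal sketch of Mini-batch Gradient Descent for 1-D linear regression on synthetic data; the data-generating line y = 2x + 1 and all hyperparameters are hypothetical choices for illustration.

```python
import random

def minibatch_gd(data, lr=0.01, batch_size=4, epochs=200, seed=0):
    """Mini-batch gradient descent fitting y ~= w*x + b by mean squared error."""
    rng = random.Random(seed)
    data = list(data)  # copy so shuffling does not mutate the caller's list
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            # MSE gradients averaged over the mini-batch only.
            gw = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            gb = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            w -= lr * gw
            b -= lr * gb
    return w, b

# Synthetic samples of the hypothetical line y = 2x + 1.
data = [(x / 10, 2 * (x / 10) + 1) for x in range(-20, 21)]
w, b = minibatch_gd(data)
print(round(w, 2), round(b, 2))  # close to 2 and 1
```

Setting `batch_size` to the full dataset recovers Batch Gradient Descent; setting it to 1 recovers SGD, which is why mini-batch sits between the two in both cost per update and gradient noise.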

Learning Material:

References:
