GRADIENT DESCENT
Gradient Descent is an algorithm that minimizes a function by iteratively optimizing its parameters. It plays an important role in Machine Learning and Deep Learning. In Gradient Descent we start with a random guess and then slowly move towards the best answer.
New value = Old value – step size
where the step size is the product of the learning rate and the slope (the gradient) at the current point:
Step size = learning rate × slope
For example, if we plot the graph of the function f(x) = x², the slope at any point tells us which way to step: repeatedly moving against the slope from any starting guess takes us towards the minimum at x = 0.
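The update rule above can be sketched for f(x) = x² in a few lines of Python (the starting guess and learning rate here are arbitrary illustrative values):

```python
# New value = Old value - learning_rate * slope, applied to f(x) = x**2
def step(x, learning_rate=0.1):
    slope = 2 * x              # derivative of x**2
    return x - learning_rate * slope

x = 3.0                        # arbitrary starting guess
for _ in range(50):
    x = step(x)
# after 50 steps, x is very close to the minimizer x = 0
```

Each step shrinks x by a constant factor (1 − 2 × learning_rate), so the iterates approach 0 geometrically.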
It is also widely used in robotics, mechanical engineering and computer games. The method was proposed before the era of modern computers, and intensive development since then has produced numerous improved versions, but in this article we will use basic/vanilla gradient descent implemented in Python.
Gradient Descent is used with functions that are differentiable and convex.
Let's take a quadratic function:
F(x) = x² – x + 3
Therefore, its first- and second-order derivatives are:
F'(x) = 2x – 1
F''(x) = 2
Because the second derivative F''(x) = 2 is greater than 0 for every x, the function is convex everywhere, so gradient descent will reach its global minimum from any starting point. (For a function whose second derivative changes sign, the graph is convex on some intervals and concave on others, and gradient descent may then get stuck in a local minimum.)
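As a quick numerical sanity check (an illustrative sketch, not part of the derivation), we can approximate F''(x) with a central finite difference and confirm it is positive at several points, consistent with convexity:

```python
def F(x):
    return x**2 - x + 3

def second_derivative(f, x, h=1e-4):
    # central finite-difference approximation of f''(x)
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

for x in [-2.0, 0.0, 0.5, 3.0]:
    assert second_derivative(F, x) > 0   # F''(x) = 2 > 0 everywhere
```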
Algorithm:
1. choose a starting point (initialization)
2. calculate the gradient at this point
3. make a scaled step in the opposite direction to the gradient (objective: minimize)
4. repeat steps 2 and 3 until one of the stopping criteria is met:
   - maximum number of iterations reached
   - step size is smaller than the tolerance
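The steps above can be sketched as a basic/vanilla implementation in Python (the function name and the hyperparameter values are illustrative choices, not prescribed by the text):

```python
def gradient_descent(grad, start, learning_rate=0.1,
                     max_iterations=1000, tolerance=1e-6):
    x = start                                  # 1. starting point
    for _ in range(max_iterations):            # stop: max iterations reached
        step = learning_rate * grad(x)         # scaled step = learning rate * slope
        x = x - step                           # 3. move opposite to the gradient
        if abs(step) < tolerance:              # stop: step smaller than tolerance
            break
    return x

# F(x) = x**2 - x + 3 has F'(x) = 2*x - 1, so the minimum lies at x = 0.5
minimum = gradient_descent(grad=lambda x: 2 * x - 1, start=5.0)
```

Starting from x = 5.0, the iterates converge to x ≈ 0.5, matching the point where F'(x) = 0.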
Anuj Singh
Student, Data Science
NMIMS, Indore