Can anyone give me some reasons why you would use least squares instead of gradient descent? Whenever I've come across one being used instead of the other, it's primarily for compute reasons. But is there any advantage or disadvantage one would have over the other?
Least squares is the loss function: it measures the goodness of fit of your model against the actual output values. Gradient descent is a technique for adjusting the model's parameters in order to minimize that squared error. So they are different things: one defines what "best fit" means, the other is one way of searching for it.
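To make the distinction concrete, here is a minimal sketch in Python with NumPy on made-up toy data (the data, learning rate, and iteration count are just illustrative choices): the squared error is simply a number computed from the current parameters, and gradient descent is one procedure for driving that number down.

```python
import numpy as np

# Toy data for a one-dimensional linear model y ~ w * x + b
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 0.5 + rng.normal(scale=0.1, size=100)

def squared_error(w, b):
    """Least-squares loss: measures how well (w, b) fit the observed y."""
    residuals = (w * x + b) - y
    return np.mean(residuals ** 2)

# Gradient descent: iteratively adjust the parameters to reduce the loss.
w, b = 0.0, 0.0
lr = 0.1
for _ in range(500):
    residuals = (w * x + b) - y
    grad_w = 2 * np.mean(residuals * x)   # d(loss)/dw
    grad_b = 2 * np.mean(residuals)       # d(loss)/db
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b, squared_error(w, b))
```

If what you have in mind is the closed-form least-squares solution the question alludes to, note that for a linear model this same loss can also be minimized directly, e.g. with np.linalg.lstsq(np.column_stack([x, np.ones_like(x)]), y, rcond=None), which returns the same minimizer without iterating.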
For other loss functions you can use, see http://slipguru.disi.unige.it/Downloads/publications/loss.pdf. For the gradient descent algorithm, see the Wikipedia article or the presentation Andrew Ng gives in the Machine Learning course on coursera.org. The answer at http://metaoptimize.com/qa/questions/11116/what-are-the-best-resourses-to-learn-optimization-for-machine-learning might also be helpful.