Unconstrained Optimization

We seek a local minimizer of a real-valued function, f(x), where x is a vector of real variables.

In other words, we seek a vector, x*, such that f(x*) <= f(x) for all x close to x*.
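Stated more precisely (a standard formulation; the radius ε is introduced here only to make "close to" exact):

    f(x^*) \le f(x) \quad \text{for all } x \text{ with } \|x - x^*\| < \varepsilon, \ \text{for some } \varepsilon > 0.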

Note: Global optimization algorithms try to find an x* that minimizes f over all possible vectors x. This is a much harder problem to solve. At present, no efficient algorithm is known for performing this task.

For many applications, local minima are good enough, particularly when the user can draw on his/her own experience and provide a good starting point for the algorithm.

Gradient Methods

The idea here is to compute gradients (derivatives) of f(x) to monitor how its value changes as we alter x.

Suppose the goal is to minimize f(x). At each iteration we select the next value of x, call it x' (the iterate produced by one step of the algorithm), by stepping in a direction along which the gradient tells us f is decreasing (for example, the negative gradient direction), so that f(x') < f(x).
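As a minimal sketch of this idea (assuming f is differentiable, a user-supplied gradient grad_f, and a fixed step size alpha, all chosen here only for illustration):

    import numpy as np

    def gradient_descent(f, grad_f, x0, alpha=0.1, max_iters=1000, tol=1e-6):
        """Iterate x' = x - alpha * grad_f(x) until the gradient is (nearly) zero."""
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iters):
            g = grad_f(x)                  # gradient of f at the current x
            if np.linalg.norm(g) < tol:    # gradient ~ 0: stop iterating
                break
            x = x - alpha * g              # step in the negative gradient direction
        return x

    # Example: f(x) = x1^2 + 2*x2^2 has its minimizer at (0, 0).
    f = lambda x: x[0]**2 + 2 * x[1]**2
    grad_f = lambda x: np.array([2 * x[0], 4 * x[1]])
    print(gradient_descent(f, grad_f, x0=[3.0, -2.0]))   # close to [0, 0]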

 

Variations of gradient methods differ in (see the sketch after this list):

  • how we calculate the gradient
  • how we measure the best/"good" iterative step to take for x'
  • how we determine when we are done iterating (i.e., when we have found a local minimum)
  • whether we are able to converge to a local minimum at all
  • how to select the starting point for x
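To make a few of these choices concrete, here is a sketch (not from the original notes; names and constants are illustrative) that calculates the gradient numerically by finite differences, picks the step size with a simple backtracking rule, and stops when the gradient is small:

    import numpy as np

    def numerical_gradient(f, x, h=1e-6):
        """One way to calculate the gradient: central finite differences."""
        g = np.zeros_like(x)
        for i in range(x.size):
            e = np.zeros_like(x); e[i] = h
            g[i] = (f(x + e) - f(x - e)) / (2 * h)
        return g

    def backtracking_step(f, x, g, alpha=1.0, shrink=0.5):
        """One way to pick a 'good' step: shrink alpha until f actually decreases."""
        while f(x - alpha * g) >= f(x) and alpha > 1e-12:
            alpha *= shrink
        return alpha

    def minimize(f, x0, tol=1e-5, max_iters=500):
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iters):
            g = numerical_gradient(f, x)
            if np.linalg.norm(g) < tol:        # stopping test: gradient nearly zero
                break
            x = x - backtracking_step(f, x, g) * g
        return x

    f = lambda x: (x[0] - 1)**2 + (x[1] + 2)**2
    print(minimize(f, x0=[5.0, 5.0]))          # approaches [1, -2]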



Convergence

  • the ability to eventually (or within some reasonable time) get to a local minimum (our answer)
  • often dependent on the starting point being close enough to a minimum (see the example after this list)
  • often dependent on the number of directions considered at each iteration as well as on the size of the steps taken toward x' (i.e., for one step of the optimization algorithm, which derivatives, and hence which candidate x', you consider)
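A small illustration of the starting-point dependence (an assumed example function, not from the notes): f(x) = x^4 - 3x^2 + x has two local minima, and the same gradient descent lands in different ones depending on where it starts.

    import numpy as np

    # Assumed example: f has local minima near x = -1.30 and x = 1.13.
    f      = lambda x: x**4 - 3 * x**2 + x
    grad_f = lambda x: 4 * x**3 - 6 * x + 1

    def descend(x, alpha=0.01, iters=2000):
        for _ in range(iters):
            x = x - alpha * grad_f(x)
        return x

    print(descend(-2.0))   # converges to the minimum near -1.30
    print(descend( 2.0))   # converges to the minimum near  1.13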
© Lynne Grewe