How to Achieve Good Parameter Optimization Using The New Methodology in UTMOST III Optimizer

1. Introduction

What is the goal of optimization?

Optimization is the task of finding the best admissible set of conditions to achieve an objective formulated in mathematical terms. Given a system, the goal of optimization is to find the setting of its parameters that yields the optimal performance, where the performance of the system is measured by an evaluation function. Optimization problems arise in a wide range of fields and are of central concern to many of them.

The basic approach is usually the same in all cases: the user selects or designs a "merit function" that measures the agreement between the data and the model for a particular choice of parameters. The merit function is generally designed so that small values represent close agreement. The parameters of the model are then adjusted to achieve a minimum in the merit function, yielding the best-fit parameter set. The adjustment process is thus a problem of minimization in many dimensions. The computational wish is always the same: do it quickly and cheaply. Often the computational effort is dominated by the cost of evaluating the merit function (and perhaps its partial derivatives). In such cases, the wish reduces to a simple goal: evaluate the merit function as few times as possible.
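To make the idea concrete, here is a minimal sketch of such a merit function in Python. The model form, the parameter names, and the data arrays are illustrative assumptions, not UTMOST internals:

```python
import numpy as np

def model(x, params):
    # Illustrative model y = y(x; a) to be fitted (hypothetical form).
    a0, a1 = params
    return a0 * np.exp(a1 * x)

def merit_function(params, x_data, y_data):
    # Sum-of-squares merit function: small values mean close
    # agreement between the measured data and the model.
    residuals = y_data - model(x_data, params)
    return np.sum(residuals ** 2)
```

An optimizer then searches parameter space for the `params` vector that drives `merit_function` to a minimum, calling it as few times as possible.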

Finding a global extremum is, in general, a very difficult problem.

Two standard methods are widely used:

(1) find local extrema starting from widely varying starting values of the independent variables, and then pick the most extreme of these;

(2) perturb a local extremum by taking a finite-amplitude step away from it, and then see whether the method returns a better point or always the same one (Simulated Annealing)[1]. A sketch of the first strategy is given below.
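The first strategy can be sketched as follows, using SciPy's Nelder-Mead routine as a stand-in local optimizer; the function and bounds are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize

def multi_start(f, bounds, n_starts=20, seed=0):
    # Run a local optimizer from widely varying starting points
    # and keep the most extreme (lowest) of the local minima found.
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    best = None
    for _ in range(n_starts):
        x0 = rng.uniform(lo, hi)          # widely varying starting value
        res = minimize(f, x0, method="Nelder-Mead")
        if best is None or res.fun < best.fun:
            best = res
    return best
```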

Unfortunately, there is no perfect optimization algorithm, and the choice of optimization method rests on the following consideration: a selection must be made between methods that need only evaluations of the merit function to be minimized and those that also require evaluations of the derivatives of that function. Algorithms using the derivatives are somewhat more powerful than those using only the function values, but not always enough so as to compensate for the additional derivative calculations.

Figure 1 describes the two basic families of optimization methods: one is based on a direction set, the other is not. The well-known Levenberg-Marquardt method[2] belongs to the direction-set family, while the Downhill Simplex method and Simulated Annealing come from the non-direction-set family.



Figure 1. The basic optimization methods.

 

It should be kept in mind that optimization is always a question of compromise. The main problem of optimization with a SPICE model is that the cost of one evaluation of the merit function is very high (i.e., each evaluation takes a long time). The Levenberg-Marquardt method, implemented for many years in the UTMOST optimizer, limits the number of merit-function evaluations but also computes the derivatives. The newly available Downhill Simplex method does not compute the derivatives. Ultimately, our goal is to make available the Simulated Annealing method, which has demonstrated important successes on a variety of global optimization problems[3].

The following pages explain the two optimization methods now available in UTMOST. The goal of these explanations is to give the reader the basics of optimization theory in order to perform better optimizations using the two methods.

 

2. The Levenberg-Marquardt Method

2.1 Basic Theory
The Levenberg-Marquardt method is a standard of non-linear least-squares optimization routines[4]. It defines a merit function and determines the best-fit parameters by its minimization. Given trial values for the parameters, the procedure improves the trial solution, and this is repeated until the merit function stops decreasing. Basically, the model to be fitted is

y = y(x; \mathbf{a})    (2.1)

where a is the set of parameters and the merit function is:

\chi^2(\mathbf{a}) = \sum_{i=1}^{N} \left[ \frac{y_i - y(x_i; \mathbf{a})}{\sigma_i} \right]^2    (2.2)

The basic idea is to take a step down the gradient (steepest descent method), which we can write as:

\mathbf{a}_{\mathrm{next}} = \mathbf{a}_{\mathrm{cur}} - \mathrm{constant} \times \nabla \chi^2(\mathbf{a}_{\mathrm{cur}})    (2.3)

where the constant defines the size of the step.

It is conventional to define

\beta_k \equiv -\frac{1}{2} \frac{\partial \chi^2}{\partial a_k}    (2.4)

and

\alpha_{kl} \equiv \frac{1}{2} \frac{\partial^2 \chi^2}{\partial a_k \, \partial a_l}    (2.5)

Supposing that the merit function χ²(a) is well approximated by its quadratic form (sufficiently close to the minimum), and blending this with the gradient step of equation 2.3 through the parameter λ, we can write the linear set

\sum_{l=1}^{M} \alpha'_{kl} \, \delta a_l = \beta_k, \qquad \alpha'_{jj} \equiv \alpha_{jj}(1 + \lambda), \quad \alpha'_{jk} = \alpha_{jk} \ (j \neq k)    (2.6)

where M is the number of parameters. This set is solved for the increment δa_l that, added to the current approximation, gives the next approximation.

Given an initial guess for the set of parameters a, the Levenberg-Marquardt method is as follows:

 

(1) Compute χ²(a).

(2) Pick a modest value for λ, say λ = 0.001.

(3) Solve the linear system (2.6) for δa and evaluate χ²(a + δa).

(4) If χ²(a + δa) ≥ χ²(a), increase λ by a factor f and go back to (3).

(5) If χ²(a + δa) < χ²(a), decrease λ by a factor f, update the trial solution a ← a + δa, and go back to (3).
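As a minimal sketch of this loop for a least-squares problem (the function names and the fixed iteration count are illustrative assumptions, not the UTMOST implementation):

```python
import numpy as np

def levenberg_marquardt(residuals, jacobian, a, lam=1e-3, f=10.0, n_iter=50):
    # residuals(a) returns r_i = (y_i - y(x_i; a)) / sigma_i, so that
    # chi2(a) = sum(r_i^2); jacobian(a) returns J_ik = d r_i / d a_k.
    chi2 = np.sum(residuals(a) ** 2)                     # step (1)
    for _ in range(n_iter):                              # step (2): lam starts modest
        r, J = residuals(a), jacobian(a)
        alpha = J.T @ J                                  # curvature matrix, approx. (2.5)
        beta = -J.T @ r                                  # gradient term (2.4)
        alpha_p = alpha + lam * np.diag(np.diag(alpha))  # alpha'_jj = alpha_jj (1 + lam)
        delta = np.linalg.solve(alpha_p, beta)           # step (3): solve system (2.6)
        chi2_new = np.sum(residuals(a + delta) ** 2)
        if chi2_new >= chi2:
            lam *= f                                     # step (4): iter-fail
        else:
            lam /= f                                     # step (5): iter-pass, accept step
            a, chi2 = a + delta, chi2_new
    return a
```

A real implementation would also apply the stop criteria discussed below rather than a fixed iteration count.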

 

A condition for stopping is also necessary; this is the subject of the following section, which explains the start and stop criteria available in UTMOST.

2.2. Application
As you can see in Figure 2, there are many parameters that you can change in the UTMOST Optimizer Setup.


Figure 2. Optimizer Setup / Status screen (Levenberg-Marquardt)

 

All the optimizer parameters are interrelated. The stop criteria that can be specified prevent the optimizer from performing unnecessary calculations when convergence is not possible or when the number of calculations needed to reach convergence is excessive. The termination code indicates the reason why the optimizer stopped. If you are not satisfied with an optimization result, it indicates which optimization parameters you may change.

Some of these parameters are explained below.

Marquardt parameter, Marquardt scaling: These are the parameter λ (Marquardt parameter) and the factor f (Marquardt scaling) used in the previous section. They are the most important parameters of the method. When an optimization is performed, the Marquardt parameter decreases or increases depending on the result of χ²(a + δa).

If the Marquardt parameter decreases, the optimizer is converging; this is an iter-pass. On the other hand, if it increases, the optimizer is not on the right track; this is an iter-fail. To prevent the optimizer from performing unnecessary calculations, you can limit the maximum number of iter-fails (last row of the stop-criteria column) or the Marquardt parameter (first row of the stop-criteria column). The number of iterations depends on the number of SPICE parameters the user wants to optimize: if n is the number of parameters, n iterations is a minimum.

In most cases, when the results of the optimization are not satisfactory, one solution is to increase the Marquardt parameter value (this value cannot be higher than 1) and at the same time to decrease the Marquardt scaling (which can never be lower than 1.2) before starting a new optimization.

 

3. The Downhill Simplex Method

3.1 Theory
The simplex optimization method is easy to understand (easier than Levenberg-Marquardt) and easy to use. Trials are performed successively in a direction of improvement until the optimum solution is reached.

The simplex method can handle many variables with only a few trials, and it does not require any calculation of derivatives.

The simplex method is based on an initial design of k+1 trials, where k is the number of variables. A geometric figure with k+1 corners in a k-dimensional space is called a simplex, and the corners of this figure are called the vertices. With two variables the first simplex design is based on three trials, with three variables it is four trials, and so on. This number of trials is also the minimum needed to define a direction of improvement, which makes the simplex a time-saving and economical way to start an optimization project. After the initial trials, the simplex process is sequential, with the addition and evaluation of one new trial at a time. The simplex searches systematically for the best levels of the control variables. The optimization process ends when the optimization objective is reached or when the responses cannot be improved further[5].

The algorithm makes its own way downhill through the complexity of an N-dimensional topography until it encounters a (local, at least) minimum.

The downhill simplex method must be started not just with a single point, but with N+1 points defining the initial simplex. Taking P_0 as the starting point, you can take the other N points to be

P_i = P_0 + \lambda e_i

where the e_i are N unit vectors and λ is a constant reflecting the problem's characteristic length scale. In the UTMOST implementation this value is not a constant but depends on the initial, maximum, and minimum values of each parameter. Thus, this purely mathematical method preserves the physical meaning of the SPICE model parameters; a sketch of such a construction is given below.
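Here is how such a range-scaled initial simplex could be built; the 10% scaling rule is an illustrative assumption, not the exact UTMOST formula:

```python
import numpy as np

def initial_simplex(p0, p_min, p_max, scale=0.1):
    # Build the N+1 starting points P_i = P_0 + lambda_i * e_i, where
    # each lambda_i is derived from that parameter's allowed range so
    # that every vertex keeps a physically meaningful value.
    p0 = np.asarray(p0, dtype=float)
    simplex = [p0]
    for i in range(p0.size):
        vertex = p0.copy()
        step = scale * (p_max[i] - p_min[i])             # range-dependent scale
        vertex[i] = np.clip(vertex[i] + step, p_min[i], p_max[i])
        simplex.append(vertex)
    return np.array(simplex)                             # shape (N + 1, N)
```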

The method takes a series of steps, moving the point of the simplex where the function is largest through the opposite face of the simplex to a lower point. These steps are called reflections, and they are constructed to conserve the volume of the simplex (maintaining its non-degeneracy). When it can do so, the method expands the simplex to take larger steps; when it finds a minimum, it contracts itself in all directions, pulling itself in around its lowest (best) point.

The basic moves are shown in Figure 3, which illustrates the possible moves for a step of the downhill simplex method when the simplex at the beginning of the step is a tetrahedron. At each step, the move can be a reflection, an expansion, or a contraction away from the high point. An appropriate sequence of such steps will always converge to a minimum of the function.


Figure 3. The basic moves of the simplex.

 

The basic simplex algorithm consists of a few rules. The first rule is to reject the trial with the least favorable response value in the current simplex. A new set of control-variable levels is calculated by reflection into the control-variable space opposite the undesirable result. This new trial replaces the least favorable trial in the simplex. This leads to a new least favorable response in the simplex which, in turn, leads to another new trial, and so on. At each step you move away from the least favorable conditions.

In this way the simplex moves steadily towards more favorable conditions. The second rule is never to return to control-variable levels that have just been rejected: the calculated reflection in the control variables can itself produce a least favorable result, and without this second rule the simplex would simply oscillate between two sets of control-variable levels. A condensed sketch of one such step is given below.
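The following sketch implements one such step with the conventional reflection, expansion, and contraction coefficients; the full algorithm also includes a shrink move, omitted here for brevity:

```python
import numpy as np

def simplex_step(simplex, f_vals, func):
    # Reject the least favorable vertex and reflect it through the
    # centroid of the remaining vertices; expand if the reflection is
    # the new best, contract towards the centroid if it is still worst.
    worst = np.argmax(f_vals)
    centroid = (simplex.sum(axis=0) - simplex[worst]) / (len(simplex) - 1)
    reflected = centroid + (centroid - simplex[worst])           # reflection
    fr = func(reflected)
    if fr < f_vals.min():
        expanded = centroid + 2.0 * (centroid - simplex[worst])  # expansion
        fe = func(expanded)
        new, fn = (expanded, fe) if fe < fr else (reflected, fr)
    elif fr < f_vals[worst]:
        new, fn = reflected, fr                                  # accept reflection
    else:
        new = centroid + 0.5 * (simplex[worst] - centroid)       # contraction
        fn = func(new)
    simplex[worst], f_vals[worst] = new, fn                      # replace worst trial
    return simplex, f_vals
```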

3.2 Application
Contrary to the Levenberg-Marquardt method, as you can see in Figure 4, there are only a few parameters to control with the downhill simplex method.


Figure 4. Optimizer Setup / Status screen (downhill Simplex).

 

You find the classical error stop criteria (rms, average, and max error) and the maximum number of function evaluations. These stop criteria allow you to limit the optimization time. The algorithm then makes its own way through the complexity of an N-dimensional topography.

The "merit-function" used in our algorithm is the rms (root mean square) error, which is the most commonly used.

With this method there is no need to define iteration fail or iteration pass; the control parameter is the number of function evaluations shown in the status column. To reduce the optimization time, the best way is to decrease the maximum function evaluations in the stop-criteria column. To perform a better optimization, the best way is to increase the maximum allowed function evaluations, which will also increase the optimization time.
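The effect of such an evaluation budget can be illustrated with SciPy's Nelder-Mead implementation as an analogy; this is not the UTMOST interface, and the test function is the standard Rosenbrock problem:

```python
import numpy as np
from scipy.optimize import minimize

def merit(a):
    # Rosenbrock function, a common optimization test problem.
    return (1 - a[0]) ** 2 + 100 * (a[1] - a[0] ** 2) ** 2

x0 = np.array([-1.2, 1.0])
# 'maxfev' caps the number of merit-function evaluations (the main
# cost driver), while 'fatol' acts as an error stop criterion.
result = minimize(merit, x0, method="Nelder-Mead",
                  options={"maxfev": 500, "fatol": 1e-8})
print(result.nfev, result.fun)   # evaluations used, final merit value
```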

4. Conclusion

This article, which should allow users to perform better optimizations, also presents the new optimization method now available in UTMOST, and it compares the two optimization methods, which have more advantages than drawbacks. The Levenberg-Marquardt method is very powerful but is not very helpful when the optimizer does not converge; having given the basics of the theory, this article also explains which optimizer parameters to change in order to help it converge.

On the other hand, the Downhill Simplex method is very easy to use and preserves the physical meaning of the SPICE parameters. As it is based on a very simple theory, it is easy to understand, and it has no parameters as complicated to tune as those of the Levenberg-Marquardt method.

One efficient solution is to use both methods when the first one chosen does not give acceptable results; this new feature will allow users to perform better and faster optimizations. The Simulated Annealing method, which has demonstrated important successes on a variety of global optimization problems, will be the next step for the UTMOST optimizer. It is based on a modified Downhill Simplex method. Simulated annealing is a global optimization method that distinguishes between different local optima: starting from an initial point, the algorithm takes a step and the function is evaluated; when minimizing a function, any downhill step is accepted, and the process repeats from the new point.
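A sketch of the acceptance rule typically used by such a method (the Metropolis criterion, assumed here for illustration): downhill steps are always accepted, while uphill steps are accepted with a temperature-dependent probability so the search can escape local minima.

```python
import numpy as np

def anneal_step(func, x, fx, temperature, rng, step_size=0.1):
    # One simulated-annealing move: take a random step, always accept
    # a downhill step, and accept an uphill step with probability
    # exp(-delta / T), which decreases as the temperature is lowered.
    candidate = x + rng.normal(scale=step_size, size=x.shape)
    f_cand = func(candidate)
    delta = f_cand - fx
    if delta < 0 or rng.random() < np.exp(-delta / temperature):
        return candidate, f_cand      # step accepted
    return x, fx                      # step rejected: stay at current point
```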

References

[1] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by Simulated Annealing," Science, vol. 220, pp. 671-680, 1983.

[2] J. J. Moré, "The Levenberg-Marquardt Algorithm: Implementation and Theory," in Numerical Analysis, Lecture Notes in Mathematics, vol. 630, pp. 105-116, 1978.

[3] R. Desai, "Combining Simulated Annealing and Local Optimization for Efficient Global Optimization," Proceedings of the 9th Florida AI Research Symposium, pp. 233-237, June 1996.

[4] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes: The Art of Scientific Computing, 2nd ed., Cambridge University Press, 1992.

[5] www.multisimplex.com