### Approximation and Complexity in Numerical Optimization: Continuous and Discrete Problems

They mean only that the solution found was better than the solutions found by competing methods. This is especially true for high-dimensional global optimization problems, where the global solutions are unknown. That is why the possibility of comparing the solutions found with the known global optimum, offered by the generator of classes of test functions [21], is so valuable. Let us now describe the two groups of methods, used in different communities, that are studied here.

Metaheuristic algorithms, widely used to solve real-life global optimization problems in the sense of statement P2 discussed above, have a number of attractive properties that have ensured their success among engineers and practitioners. First, they have clear nature-inspired interpretations that explain how these algorithms simulate the behavior of populations of individuals.

The algorithms of this type studied here are: Particle Swarm Optimization (PSO), simulating fish schools [11]; the Firefly Algorithm (FA), simulating the flashing behavior of fireflies [13]; Artificial Bee Colony (ABC), representing a colony of bees searching for food sources [14]; and Differential Evolution (DE) and Genetic Algorithms (GA), simulating evolution at the phenotype and genotype level, respectively [4, 6]. Other reasons for the wide spread of metaheuristics are the following: a high level of mathematical preparation is not required to understand them; their implementation is usually simple and many codes are available for free; finally, they do not need a lot of memory, since at each moment they work with only a limited population of points in the search domain.

On the other hand, the populations used by these methods can degenerate prematurely, returning only a locally optimal solution instead of the global one, or even a point that is not locally optimal if it was obtained at one of the last evaluations of f(x) and the budget of M evaluations did not allow the solution to be improved further.
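To make the role of the evaluation budget M concrete, here is a minimal particle swarm sketch that stops as soon as the budget is spent. It is an illustration only, not the PSO implementation tested in the paper: the function name `pso_minimize`, the parameter values (`swarm_size`, `inertia`, `c1`, `c2`) and the box-clipping rule are assumptions chosen for brevity.

```python
import random


def pso_minimize(f, bounds, budget, swarm_size=10,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO sketch that never exceeds `budget` evaluations of f.

    All parameter defaults are illustrative assumptions.
    """
    if budget < swarm_size:
        raise ValueError("budget must allow one evaluation per particle")
    rng = random.Random(seed)
    dim = len(bounds)
    evals = 0

    def evaluate(x):
        nonlocal evals
        evals += 1
        return f(x)

    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_val = [evaluate(p) for p in pos]
    g = min(range(swarm_size), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    while evals < budget:
        for i in range(swarm_size):
            if evals >= budget:
                break  # budget exhausted: return the best point seen so far
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = evaluate(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val, evals
```

Nothing in the returned point certifies local or global optimality: it is simply the best point seen when the budget ran out, which is exactly the situation described above.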

Deterministic algorithms, the second group of methods studied here, are based on the knowledge that the objective function f(x) satisfies the Lipschitz condition (2). Lipschitz global optimization algorithms are a well-studied class of deterministic methods [1–3, 5, 7, …]. These methods are usually technically more sophisticated than metaheuristics: their implementation is not as easy, they require more memory, and a higher level of mathematical preparation is necessary to understand and use them. Commonly, they have a strong theory ensuring convergence to the global solution and a small number of control parameters, which allows their users to configure the search easily.
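To illustrate how a Lipschitz bound can drive a deterministic search, the sketch below implements the classical one-dimensional Piyavskii-Shubert scheme. It is not one of the multidimensional algorithms compared in the paper, and it assumes that a valid overestimate `lipschitz` of the constant L in |f(x) − f(y)| ≤ L|x − y| is supplied by the user; the function name and defaults are hypothetical.

```python
def piyavskii_shubert(f, a, b, lipschitz, tol=1e-3, max_evals=1000):
    """1-D Lipschitz global minimization sketch (Piyavskii-Shubert scheme).

    Assumes `lipschitz` overestimates the true Lipschitz constant of f on [a, b].
    Returns (best_x, best_value, number_of_evaluations).
    """
    if lipschitz <= 0:
        raise ValueError("lipschitz must be positive")
    xs = [a, b]            # evaluated points, kept sorted
    zs = [f(a), f(b)]      # corresponding function values
    while len(xs) < max_evals:
        # Lower bound of f on each interval from the two Lipschitz cones.
        best_bound, best_i = None, None
        for i in range(1, len(xs)):
            bound = 0.5 * (zs[i - 1] + zs[i]) - 0.5 * lipschitz * (xs[i] - xs[i - 1])
            if best_bound is None or bound < best_bound:
                best_bound, best_i = bound, i
        # Stop when the record value is within tol of the global lower bound.
        if min(zs) - best_bound <= tol:
            break
        i = best_i
        # Point where the two cones of the chosen interval intersect.
        x_new = 0.5 * (xs[i - 1] + xs[i]) - (zs[i] - zs[i - 1]) / (2.0 * lipschitz)
        xs.insert(i, x_new)
        zs.insert(i, f(x_new))
    k = min(range(len(zs)), key=zs.__getitem__)
    return xs[k], zs[k], len(xs)
```

When the loop stops through the `tol` test, the returned value is guaranteed to be within `tol` of the global minimum, which is the kind of convergence guarantee that metaheuristics do not provide. The price is the need for a reliable value or estimate of L.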

Even though the Lipschitz constant L may be unknown, there exist several strategies for its estimation [2, 3, 5, 7, 18], and one of the most frequently used techniques [16] works with all possible values of L from zero to infinity simultaneously. All deterministic algorithms considered here use it.

How can one compare these two groups of methods? On the one hand, there exist several approaches for a visual comparison of deterministic algorithms. However, they do not allow one to compare stochastic methods. On the other hand, metaheuristics are often compared on different collections of single benchmark problems [15, 20, …]. As a result, the difficulty of the test problems in these collections can vary significantly, sometimes leading to non-homogeneous and, as a consequence, unreliable results.

An additional difficulty is that, owing to the stochastic nature of metaheuristics, the results obtained cannot be repeated exactly and have the character of averages.

Thus, the difficulties in performing a reliable comparison of these two groups of methods constitute a serious gap between the respective communities. The goal of this paper is to start a dialog between them by proposing a methodology that allows one to compare deterministic algorithms and stochastic metaheuristics numerically, using the problem statement P1.


Instead of the traditional comparisons executed on just several dozen tests [1, 2, 15, 16, 19, 20], in this contribution a much larger number of runs on randomly generated test problems [21] has been performed for a systematic comparison of the methods. To make this comparison more reliable, the parameters of all tested algorithms were fixed following the recommendations of their authors and were then used in all the experiments. One known and two novel methodologies for comparing global optimization algorithms are applied here: Operational Characteristics [25] for comparing deterministic algorithms, and the new Operational Zones and Aggregated Operational Zones, which generalize the ideas of operational characteristics to collate multidimensional stochastic algorithms.

An operational characteristic [25], constructed on a class of randomly generated test functions, is a graph showing the number of solved problems as a function of the number of executed evaluations of the objective function f(x). To construct the classes of test functions required to build operational characteristics, the popular GKLS generator [21] of multidimensional, multiextremal test functions was used. This generator allows one to randomly generate classes of test problems having the same dimension, number of local minima, and difficulty.
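The definition above translates directly into a few lines of code. The sketch below assumes that, for one method, the number of evaluations needed to solve each problem of the class has already been recorded (with `None` marking problems not solved within the budget); the function name and data layout are hypothetical.

```python
import bisect


def operational_characteristic(trials_to_solve, budgets):
    """Number of solved problems for each budget of function evaluations.

    trials_to_solve[i] is the number of evaluations after which problem i
    was solved, or None if it was never solved.
    """
    solved = sorted(t for t in trials_to_solve if t is not None)
    # bisect_right counts how many problems needed at most k evaluations.
    return [bisect.bisect_right(solved, k) for k in budgets]


# Five problems, one of them unsolved.
curve = operational_characteristic([120, None, 40, 300, 40], [0, 40, 100, 120, 1000])
# curve == [0, 2, 2, 3, 4]
```

Plotting `curve` against `budgets` gives the non-decreasing step graph used for comparing deterministic methods.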

The property that makes this generator especially attractive is that, for each function, complete information on the coordinates and values of all local minima, including the global one, is provided. Here, 8 different classes from [18] were used (see the supplementary materials for their description and for the definition of what it means for a problem to be solved).

These classes and the respective search accuracies were chosen because they represent a well-established tool frequently used to compare deterministic global optimization algorithms [18, 28–…]. The higher a method's characteristic lies relative to those of its competitors, the better the method behaves. Operational characteristics also allow us to see the best performers depending on the available budget of evaluations of f(x).

Figure: Construction of operational characteristics for deterministic methods and of the operational zone for the metaheuristic Firefly Algorithm (FA), built on the hard 5-dimensional class of GKLS test functions. The upper and the lower boundaries of the zone are shown as dark blue curves.

Since operational characteristics cannot be used to compare stochastic methods, we propose in this paper a new methodology, called operational zones, that can be used for collating stochastic algorithms. Each run of a tested metaheuristic is considered as a particular method, and its operational characteristic is constructed.

The totality of all these operational characteristics forms the respective operational zone. The upper and the lower boundaries of the zone can then be drawn, and the graph of the average performance within the zone can also be depicted. The joint representation of operational zones together with characteristics offers a lot of visual information.
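A minimal sketch of this construction follows. It assumes that the boundaries of the zone are the pointwise minimum and maximum over the per-run characteristics, which is a natural reading of the description but is not spelled out in the surviving text; all names are hypothetical.

```python
import bisect


def characteristic(trials_to_solve, budgets):
    """Operational characteristic of a single run (None = problem unsolved)."""
    solved = sorted(t for t in trials_to_solve if t is not None)
    return [bisect.bisect_right(solved, k) for k in budgets]


def operational_zone(runs, budgets):
    """Lower boundary, upper boundary and average over all runs.

    runs[r][i] is the number of evaluations run r needed on problem i.
    Boundaries are taken pointwise over the runs (an assumption).
    """
    curves = [characteristic(run, budgets) for run in runs]
    columns = list(zip(*curves))  # one tuple of counts per budget
    lower = [min(col) for col in columns]
    upper = [max(col) for col in columns]
    average = [sum(col) / len(col) for col in columns]
    return lower, upper, average
```

A deterministic method whose characteristic lies above `upper` at a given budget beats every recorded run of the metaheuristic at that budget; one lying below `lower` loses to every run.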

The figures show, for example, how the ranking of the methods depends on the budget. If the budget is higher than 40, trials, then ADC behaves better, since its characteristic lies above the upper boundary of the FA zone. One can also see that in many runs the metaheuristics got trapped in local minima and were unable to leave their attraction regions, which produces operational zones with long horizontal parts.

This means that increasing the number of trials does not improve the results in this case, and it is necessary to restart the metaheuristics. The aggregated operational zones proposed in this paper show what happens in this case. They are constructed as follows: the algorithm is first run with $n_{\max}$ allowed trials; then, for the problems left unsolved, it is launched again with the same number $n_{\max}$ of allowed trials.

In this way, T runs are executed to complete the aggregated characteristic. The lower and upper boundaries are defined analogously.

Table: Results of the experiments. For each test class, the average number of trials required to solve all problems is presented for each deterministic algorithm. The maximal number of trials, set to $10^6$, was used to calculate the average number of trials m.

It should be stressed that the aggregated operational zones bring out the potential of nature-inspired metaheuristics better.

In contrast, the aggregated zone of ABC lies above the characteristics of both deterministic methods. Notice that, owing to the stochastic nature of metaheuristics, different averages should be used for the two groups: for metaheuristics, the results on 10, runs for each class are used, whereas for the deterministic algorithms the results come from one run for each of the functions. This makes the comparison difficult. To see the detailed results, larger tables with hundreds of rows and columns would be needed, which complicates the visual analysis of the results.

In contrast, operational zones present the performance of the tested methods visually and give the entire panorama of their behavior for different budgets. The average, the best, and the worst cases for each metaheuristic can easily be read from the graphs for any chosen number of trials. Let us now look at another way of comparing the two groups of algorithms statistically using the same data. Let $X_A^C$ be a random variable describing the percentage of the computational budget $N_{\max}$ consumed by an algorithm A in solving a problem from the test class C.

Then, after constructing the cumulative distribution function $F_{X_A^C}(x)$, one can obtain the sample quantiles of $X_A^C$.

Table: For each algorithm, the quantiles $Q_{25}$, $Q_{50}$, $Q_{75}$, and $Q_{90}$ of the number of trials on the simple test classes are presented.

Table: For each algorithm, the quantiles $Q_{25}$, $Q_{50}$, $Q_{75}$, and $Q_{90}$ of the number of trials on the hard test classes are presented.
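The quantile computation can be sketched as follows. The code converts recorded trial counts into percentages of the budget and reads a quantile off the empirical distribution function as the smallest sample value x with F(x) ≥ q. Counting an unsolved problem as 100% of the budget, and this particular quantile definition, are assumptions made for the illustration; the function names are hypothetical.

```python
import math


def budget_percentages(trials_to_solve, n_max):
    """Percentage of the budget n_max consumed on each problem.

    Unsolved problems (None) are counted as 100% (an assumption).
    """
    return [100.0 if t is None else 100.0 * t / n_max for t in trials_to_solve]


def empirical_quantile(samples, q):
    """Smallest sample value x whose empirical CDF satisfies F(x) >= q."""
    if not samples or not 0.0 < q <= 1.0:
        raise ValueError("need a non-empty sample and 0 < q <= 1")
    ordered = sorted(samples)
    k = max(1, math.ceil(q * len(ordered)))
    return ordered[k - 1]


x = budget_percentages([100, 200, 300, 400], n_max=1000)
quartiles = [empirical_quantile(x, q) for q in (0.25, 0.50, 0.75, 0.90)]
# quartiles == [10.0, 20.0, 30.0, 40.0]
```

Read this way, a value $Q_{50} = 20$ would mean that half of the problems of the class were solved using at most 20% of the budget.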

In conclusion, the proposed operational zones and aggregated operational zones allow one to compare deterministic and stochastic global optimization algorithms of different nature effectively, and they give a handy visual representation of this comparison for different computational budgets. Nature-inspired metaheuristics and deterministic Lipschitz algorithms have been compared on a large number of tests, giving a new understanding of both classes of methods and opening a dialog between the two communities.

It can be seen that both classes of algorithms are competitive and that each can surpass the other depending on the available budget of function evaluations.

Sci Rep. Published online Jan. Authors: Sergeyev, Kvasov, and Mukhametzhanov. Received Jul 14; accepted Dec.

Abstract. Global optimization problems where evaluation of the objective function is an expensive operation arise frequently in engineering, decision making, optimal control, etc.

Introduction. Continuous global optimization problems arise frequently in many real-life applications [1–7]: in engineering, statistics, decision making, optimal control, machine learning, etc.

To find a feasible point from which to start, one can relax the feasibility conditions using a slack variable and then minimize that slack variable until the slack is zero or negative.

The extreme value theorem of Karl Weierstrass states that a continuous real-valued function on a compact set attains its maximum and minimum value. More generally, a lower semi-continuous function on a compact set attains its minimum, and an upper semi-continuous function on a compact set attains its maximum. One of Fermat's theorems states that optima of unconstrained problems are found at stationary points, where the first derivative or the gradient of the objective function is zero (see first derivative test).

More generally, they may be found at critical points, where the first derivative or gradient of the objective function is zero or is undefined, or on the boundary of the choice set. An equation (or set of equations) stating that the first derivative(s) equal(s) zero at an interior optimum is called a 'first-order condition' or a set of first-order conditions. Optima of equality-constrained problems can be found by the Lagrange multiplier method. While the first derivative test identifies points that might be extrema, it does not distinguish a point that is a minimum from one that is a maximum or one that is neither. When the objective function is twice differentiable, these cases can be distinguished by checking the second derivative or the matrix of second derivatives (called the Hessian matrix) in unconstrained problems, or the matrix of second derivatives of the objective function and the constraints (called the bordered Hessian) in constrained problems.
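As a small worked illustration of the Lagrange multiplier method mentioned above (the problem itself is chosen here only as an example), consider maximizing $f(x, y) = xy$ subject to $x + y = 2$. The first-order conditions of the Lagrangian give

$$
\begin{aligned}
\mathcal{L}(x, y, \lambda) &= xy - \lambda\,(x + y - 2),\\
\frac{\partial \mathcal{L}}{\partial x} &= y - \lambda = 0,\qquad
\frac{\partial \mathcal{L}}{\partial y} = x - \lambda = 0,\qquad
\frac{\partial \mathcal{L}}{\partial \lambda} = -(x + y - 2) = 0,
\end{aligned}
$$

so that $x = y = \lambda = 1$ and $f(1, 1) = 1$. Substituting the constraint directly, $f = x(2 - x)$, confirms that this stationary point is a maximum.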

The conditions that distinguish maxima, or minima, from other stationary points are called 'second-order conditions' (see 'Second derivative test'). If a candidate solution satisfies the first-order conditions, then satisfaction of the second-order conditions as well is sufficient to establish at least local optimality.

The envelope theorem describes how the value of an optimal solution changes when an underlying parameter changes. The process of computing this change is called comparative statics. The maximum theorem of Claude Berge describes the continuity of an optimal solution as a function of underlying parameters. For unconstrained problems with twice-differentiable functions, some critical points can be found by finding the points where the gradient of the objective function is zero (that is, the stationary points). More generally, a zero subgradient certifies that a local minimum has been found for minimization problems with convex functions and other locally Lipschitz functions.

Further, critical points can be classified using the definiteness of the Hessian matrix: if the Hessian is positive definite at a critical point, then the point is a local minimum; if the Hessian matrix is negative definite, then the point is a local maximum; finally, if it is indefinite, then the point is some kind of saddle point. Constrained problems can often be transformed into unconstrained problems with the help of Lagrange multipliers. Lagrangian relaxation can also provide approximate solutions to difficult constrained problems.
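The Hessian-based classification of critical points can be sketched for the two-variable case, where definiteness follows from the determinant and the leading entry of the symmetric matrix. The function name and the tolerance `eps` are assumptions made for the illustration.

```python
def classify_critical_point_2x2(hessian, eps=1e-12):
    """Classify a critical point from a symmetric 2x2 Hessian [[a, b], [b, d]]."""
    (a, b), (_, d) = hessian
    det = a * d - b * b
    if det > eps:
        # Both eigenvalues share the sign of the leading entry a.
        return "local minimum" if a > 0 else "local maximum"
    if det < -eps:
        # Eigenvalues of opposite sign: the Hessian is indefinite.
        return "saddle point"
    # Singular Hessian: the second-order test does not decide.
    return "inconclusive"


# f(x, y) = x^2 - y^2 has Hessian diag(2, -2) at the origin.
kind = classify_critical_point_2x2([[2.0, 0.0], [0.0, -2.0]])
# kind == "saddle point"
```

For more than two variables the same test is usually carried out through the eigenvalues of the Hessian or a Cholesky factorization attempt rather than through determinants.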


When the objective function is convex, any local minimum is also a global minimum. There exist efficient numerical techniques for minimizing convex functions, such as interior-point methods. To solve problems, researchers may use algorithms that terminate in a finite number of steps, iterative methods that converge to a solution (on some specified class of problems), or heuristics that may provide approximate solutions to some problems (although their iterates need not converge).

An optimization algorithm is a procedure that is executed iteratively, comparing various solutions until an optimum or a satisfactory solution is found. Optimization algorithms help us to minimize or maximize an objective function E(x) with respect to the internal parameters of a model that maps a set of predictors X to target values Y. There are three widely used types of optimization algorithms: zero-order, first-order, and second-order algorithms. Zero-order (or derivative-free) algorithms use only the criterion value at some positions.

First-order optimization algorithms minimize or maximize a loss function E(x) using its gradient values with respect to the parameters. The first-order derivative tells us whether the function is decreasing or increasing at a particular point; it gives a line tangent to the error surface at that point.

Gradient descent is the most popular algorithm for optimizing a neural network. It is used to update the weights of a neural network model, i.e. to adjust the model's parameters in the direction that reduces the loss. A neural network is trained via a technique called back-propagation: the forward pass computes the dot product of the input signals and their corresponding weights and then applies an activation function to the sum of products. The activation function transforms the input signal into an output signal and introduces non-linearity into the model, which enables it to learn almost any arbitrary functional mapping.
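The update rule behind gradient descent fits in a few lines. The sketch below uses a fixed learning rate and a user-supplied gradient on a small quadratic rather than a neural network; the function name and the default values are assumptions.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Plain gradient descent: repeatedly step against the gradient."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        x = [xi - learning_rate * gi for xi, gi in zip(x, g)]
    return x


# E(x, y) = (x - 3)^2 + 2 (y + 1)^2 has its minimum at (3, -1).
grad_e = lambda p: [2.0 * (p[0] - 3.0), 4.0 * (p[1] + 1.0)]
minimum = gradient_descent(grad_e, [0.0, 0.0])
```

With this learning rate each step shrinks the error in x by a factor 0.8 and the error in y by a factor 0.6, so the iterates converge to (3, −1); a learning rate that is too large would make them diverge instead.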

Second-order methods use the second-order derivatives, collected in the Hessian matrix, to minimize or maximize the loss function. Because the second derivative is costly to compute, second-order methods are used less often. The second-order derivative tells us whether the first derivative is increasing or decreasing, which hints at the function's curvature. It also provides a quadratic surface that touches the curvature of the error surface. The iterative methods used to solve problems of nonlinear programming differ according to whether they evaluate Hessians, gradients, or only function values.

While evaluating Hessians H and gradients G improves the rate of convergence, for functions for which these quantities exist and vary sufficiently smoothly, such evaluations increase the computational complexity or computational cost of each iteration. In some cases, the computational complexity may be excessively high. One major criterion for optimizers is just the number of required function evaluations as this often is already a large computational effort, usually much more effort than within the optimizer itself, which mainly has to operate over the N variables.


The derivatives provide detailed information for such optimizers but are even harder to calculate. Gradient-based optimizers usually need more iterations than Newton's algorithm, although each of their iterations is cheaper. Which one is best with respect to the number of function calls depends on the problem itself. More generally, if the objective function is not a quadratic function, then many optimization methods use other methods to ensure that some subsequence of iterations converges to an optimal solution.
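The trade-off in iteration counts can be seen on a one-dimensional example. The sketch below minimizes E(x) = exp(x) − 2x, whose minimizer is ln 2, once with Newton's method and once with fixed-step gradient descent, and returns the number of iterations each needs; the test function, tolerances, and step size are assumptions chosen for the illustration.

```python
import math


def newton_1d(grad, hess, x0, tol=1e-10, max_iter=100):
    """Newton's method: divide the gradient by the second derivative."""
    x = x0
    for k in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            return x, k
        x -= g / hess(x)
    return x, max_iter


def gradient_descent_1d(grad, x0, learning_rate=0.1, tol=1e-10, max_iter=10000):
    """Fixed-step gradient descent with the same stopping rule."""
    x = x0
    for k in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            return x, k
        x -= learning_rate * g
    return x, max_iter


grad_e = lambda x: math.exp(x) - 2.0   # E'(x) for E(x) = exp(x) - 2x
hess_e = lambda x: math.exp(x)         # E''(x)
x_newton, iters_newton = newton_1d(grad_e, hess_e, 0.0)
x_gd, iters_gd = gradient_descent_1d(grad_e, 0.0)
```

Newton's method reaches the tolerance in far fewer iterations here, but each iteration also needs the second derivative, which in N dimensions means forming and solving with an N × N Hessian.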

The first and still popular method for ensuring convergence relies on line searches , which optimize a function along one dimension. A second and increasingly popular method for ensuring convergence uses trust regions. Both line searches and trust regions are used in modern methods of non-differentiable optimization.
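A line search can be sketched with the common backtracking (Armijo) rule: start from a trial step and shrink it until the function decreases sufficiently along the search direction. The function name, the shrink factor, and the constant `c` are conventional choices assumed here, not taken from the text.

```python
def backtracking_line_search(f, x, grad_x, direction,
                             alpha0=1.0, shrink=0.5, c=1e-4, max_halvings=50):
    """Return a step length satisfying the Armijo sufficient-decrease rule."""
    fx = f(x)
    # Directional derivative of f at x along `direction`.
    slope = sum(g * d for g, d in zip(grad_x, direction))
    if slope >= 0:
        raise ValueError("direction must be a descent direction")
    alpha = alpha0
    for _ in range(max_halvings):
        trial = [xi + alpha * di for xi, di in zip(x, direction)]
        if f(trial) <= fx + c * alpha * slope:
            return alpha
        alpha *= shrink
    return alpha


# f(x) = x^2 at x = 1, searching along the negative gradient.
step = backtracking_line_search(lambda v: v[0] ** 2, [1.0], [2.0], [-2.0])
# step == 0.5: the full step overshoots to x = -1, the halved step lands on x = 0.
```

A trust-region method takes the opposite approach: it fixes a maximum step length first and then chooses the best direction and length within that region.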

Usually a global optimizer is much slower than advanced local optimizers such as BFGS , so often an efficient global optimizer can be constructed by starting the local optimizer from different starting points. Besides finitely terminating algorithms and convergent iterative methods , there are heuristics. A heuristic is any algorithm which is not guaranteed mathematically to find the solution, but which is nevertheless useful in certain practical situations.
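The multistart idea can be sketched as follows. A simple derivative-free compass search stands in for an advanced local optimizer such as BFGS, and the starting points are taken on an evenly spaced grid; both choices, and all names, are assumptions made to keep the example self-contained.

```python
import itertools


def compass_search(f, x0, bounds, step=0.5, tol=1e-6):
    """Local search: try +/- step along each coordinate, halve step on failure."""
    x = list(x0)
    fx = f(x)
    while step > tol:
        improved = False
        for d in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                lo, hi = bounds[d]
                trial[d] = min(max(trial[d] + delta, lo), hi)
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
                    break
        if not improved:
            step *= 0.5
    return x, fx


def multistart(f, bounds, starts_per_dim=5):
    """Run the local search from a grid of starting points, keep the best."""
    axes = [[lo + (hi - lo) * k / (starts_per_dim - 1) for k in range(starts_per_dim)]
            for lo, hi in bounds]
    results = [compass_search(f, list(start), bounds)
               for start in itertools.product(*axes)]
    return min(results, key=lambda r: r[1])


# Two minima on [-2, 2]: a global one near x = -1.30 and a local one near x = 1.13.
f = lambda v: v[0] ** 4 - 3.0 * v[0] ** 2 + v[0]
local_x, local_f = compass_search(f, [2.0], [(-2.0, 2.0)])
best_x, best_f = multistart(f, [(-2.0, 2.0)])
```

Started from x = 2, the local search alone stops in the shallower minimum, while the multistart wrapper finds the deeper one. The number of starts needed grows quickly with the dimension, so this is a heuristic rather than a guaranteed global method.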

Problems in rigid body dynamics (in particular articulated rigid body dynamics) often require mathematical programming techniques, since rigid body dynamics can be viewed as attempting to solve an ordinary differential equation on a constraint manifold; the constraints are various nonlinear geometric constraints such as "these two points must always coincide", "this surface must not penetrate any other", or "this point must always lie somewhere on this curve".

Also, the problem of computing contact forces can be solved as a linear complementarity problem, which can also be viewed as a QP (quadratic programming) problem. Many design problems can also be expressed as optimization programs. This application is called design optimization. One subset is engineering optimization, and another recent and growing subset of this field is multidisciplinary design optimization, which, while useful in many problems, has in particular been applied to aerospace engineering problems.

This approach may be applied in cosmology and astrophysics. Economics is closely enough linked to optimization of agents that an influential definition relatedly describes economics qua science as the "study of human behavior as a relationship between ends and scarce means" with alternative uses. In microeconomics, the utility maximization problem and its dual problem , the expenditure minimization problem , are economic optimization problems. Insofar as they behave consistently, consumers are assumed to maximize their utility , while firms are usually assumed to maximize their profit.

Also, agents are often modeled as being risk-averse , thereby preferring to avoid risk. Asset prices are also modeled using optimization theory, though the underlying mathematics relies on optimizing stochastic processes rather than on static optimization. International trade theory also uses optimization to explain trade patterns between nations. The optimization of portfolios is an example of multi-objective optimization in economics.

Economists also model dynamic decisions over time using control theory. Some common applications of optimization techniques in electrical engineering include active filter design, stray field reduction in superconducting magnetic energy storage systems, space mapping design of microwave structures, handset antennas, and electromagnetics-based design.

Electromagnetically validated design optimization of microwave components and antennas has made extensive use of an appropriate physics-based or empirical surrogate model and of space mapping methodologies since the discovery of space mapping.

Optimization has been widely used in civil engineering.

The most common civil engineering problems that are solved by optimization are cut and fill of roads, life-cycle analysis of structures and infrastructures, resource leveling, water resource allocation, and schedule optimization. Another field that uses optimization techniques extensively is operations research. Increasingly, operations research uses stochastic programming to model dynamic decisions that adapt to events; such problems can be solved with large-scale optimization and stochastic optimization methods.

Mathematical optimization is used in much modern controller design. High-level controllers such as model predictive control (MPC) or real-time optimization (RTO) employ mathematical optimization. These algorithms run online and repeatedly determine values for decision variables, such as choke openings in a process plant, by iteratively solving a mathematical optimization problem that includes constraints and a model of the system to be controlled.

Optimization techniques are regularly used in geophysical parameter estimation problems, where model parameters are inferred from a set of geophysical measurements. Nonlinear optimization methods are widely used in conformational analysis. Optimization techniques are used in many facets of computational systems biology, such as model building, optimal experimental design, metabolic engineering, and synthetic biology.


Some well-known heuristics:

• Memetic algorithm
• Differential evolution
• Evolutionary algorithms
• Dynamic relaxation
• Genetic algorithms
• Hill climbing with random restart
• Nelder-Mead simplicial heuristic: a popular heuristic for approximate minimization without calling gradients
• Particle swarm optimization
• Cuckoo search
• Gravitational search algorithm
• Artificial bee colony optimization
• Simulated annealing
• Stochastic tunneling
• Tabu search
• Reactive Search Optimization (RSO), implemented in LIONsolver
