Preconditioner

In mathematics, preconditioning is the application of a transformation, called the preconditioner, that conditions a given problem into a form more suitable for numerical solution methods. Preconditioning is typically related to reducing a condition number of the problem; the preconditioned problem is then usually solved by an iterative method. In linear algebra and numerical analysis, a preconditioner P of a matrix A is a matrix such that P⁻¹A has a smaller condition number than A. It is also common to call T = P⁻¹ the preconditioner, rather than P, since P itself is rarely explicitly available. In modern preconditioning, the application of T = P⁻¹, i.e. multiplication of a column vector, or a block of column vectors, by T = P⁻¹, is commonly performed by sophisticated computer software packages in a matrix-free fashion, i.e. where neither P nor T = P⁻¹ is explicitly available in matrix form. Preconditioners are useful in iterative methods to solve a linear system Ax = b for x, since the rate of convergence of most iterative linear solvers improves as the condition number of the matrix decreases as a result of preconditioning.

Preconditioned iterative solvers typically outperform direct solvers, e.g. Gaussian elimination, for large, and especially for sparse, matrices. Iterative solvers can be used as matrix-free methods, i.e. they become the only choice if the coefficient matrix A is not stored explicitly but is accessed only by evaluating matrix-vector products. Instead of solving the original linear system above, one may solve the right-preconditioned system AP⁻¹Px = b by solving AP⁻¹y = b for y and Px = y for x. Alternatively, one may solve the left-preconditioned system P⁻¹(Ax − b) = 0. Both systems give the same solution as the original system as long as the preconditioner matrix P is nonsingular; left preconditioning is the more common. The goal of preconditioning is to reduce the condition number of the left- or right-preconditioned system matrix P⁻¹A or AP⁻¹, respectively; the preconditioned matrix P⁻¹A or AP⁻¹ is almost never explicitly formed. Only the action of applying the preconditioner solve operation P⁻¹ to a given vector needs to be computed in iterative methods.

There is a trade-off in the choice of P. Since the operator P⁻¹ must be applied at each step of the iterative linear solver, it should have a small cost of application; the cheapest preconditioner would therefore be P = I, since then P⁻¹ = I. This yields the original linear system, and the preconditioner does nothing. At the other extreme, the choice P = A gives P⁻¹A = AP⁻¹ = I, which has the optimal condition number of 1 and requires a single iteration for convergence; however, in this case applying the preconditioner is as difficult as solving the original system. One therefore chooses P somewhere between these two extremes, in an attempt to achieve a minimal number of iterations while keeping the operator P⁻¹ as simple as possible; some examples of typical preconditioning approaches are detailed below. Preconditioned iterative methods for Ax − b = 0 are, in most cases, mathematically equivalent to standard iterative methods applied to the preconditioned system P⁻¹(Ax − b) = 0.
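As a concrete illustration of this trade-off, the sketch below applies the conjugate gradient method with a Jacobi (diagonal) preconditioner P = diag(A), which sits between the extremes P = I and P = A: it is trivial to apply, yet it can sharply reduce the condition number of a matrix whose diagonal entries vary over many orders of magnitude. The test matrix here is an illustrative assumption, not taken from the text.

```python
import numpy as np

def pcg(A, b, apply_P_inv, tol=1e-8, max_iter=2000):
    """Preconditioned conjugate gradient for a symmetric positive definite A.

    apply_P_inv(v) performs the preconditioner solve P^{-1} v; the
    preconditioned matrix P^{-1} A is never formed explicitly."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_P_inv(r)
    p = z.copy()
    rz = r @ z
    for k in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            return x, k + 1          # solution and iteration count
        z = apply_P_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# An SPD matrix with a strongly varying diagonal (condition number ~1e4).
rng = np.random.default_rng(0)
n = 50
A = np.diag(np.logspace(0, 4, n))
R = 0.01 * rng.standard_normal((n, n))
A = A + R + R.T                       # small symmetric perturbation, still SPD
b = rng.standard_normal(n)

d = np.diag(A)
x_pre, it_pre = pcg(A, b, lambda v: v / d)   # Jacobi preconditioner
x_id, it_id = pcg(A, b, lambda v: v)         # P = I, i.e. no preconditioning
```

With the diagonal rescaled away, the preconditioned system is well conditioned and converges in far fewer iterations than the unpreconditioned one.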

Ellipsoid method

In mathematical optimization, the ellipsoid method is an iterative method for minimizing convex functions. When specialized to solving feasible linear optimization problems with rational data, the ellipsoid method is an algorithm which finds an optimal solution in a finite number of steps; the ellipsoid method generates a sequence of ellipsoids whose volume uniformly decreases at every step, thus enclosing a minimizer of a convex function. The ellipsoid method has a long history; as an iterative method, a preliminary version was introduced by Naum Z. Shor. In 1972, an approximation algorithm for real convex minimization was studied by Arkadi Nemirovski and David B. Yudin; as an algorithm for solving linear programming problems with rational data, the ellipsoid algorithm was studied by Leonid Khachiyan: Khachiyan's achievement was to prove the polynomial-time solvability of linear programs. Following Khachiyan's work, the ellipsoid method was the only algorithm for solving linear programs whose runtime had been proved to be polynomial until Karmarkar's algorithm.

However, Karmarkar's interior-point method and variants of the simplex algorithm are much faster than the ellipsoid method in practice; Karmarkar's algorithm is also faster in the worst case. The ellipsoid algorithm, however, allows complexity theorists to achieve bounds that depend on the dimension of the problem and on the size of the data, but not on the number of rows, so it remained important in combinatorial optimization theory for many years. Only in the 21st century have interior-point algorithms with similar complexity properties appeared. A convex minimization problem consists of a convex function f₀ : Rⁿ → R to be minimized over the variable x, convex inequality constraints of the form fᵢ(x) ≤ 0, where the functions fᵢ are convex, and linear equality constraints of the form hᵢ(x) = 0. We are given an initial ellipsoid E ⊂ Rⁿ defined as E = {z ∈ Rⁿ : (z − x₀)ᵀP⁻¹(z − x₀) ≤ 1} containing a minimizer x*, where P ≻ 0 and x₀ is the center of E. Finally, we require the existence of a cutting-plane oracle for the function f. One example of a cutting-plane is given by a subgradient g of f.

At the k-th iteration of the algorithm, we have a point x_k at the center of an ellipsoid E_k = {z ∈ Rⁿ : (z − x_k)ᵀP_k⁻¹(z − x_k) ≤ 1}. We query the cutting-plane oracle to obtain a vector g ∈ Rⁿ such that gᵀ(x* − x_k) ≤ 0; we therefore conclude that x* ∈ E_k ∩ {z : gᵀ(z − x_k) ≤ 0}. We set E_{k+1} to be the ellipsoid of minimal volume containing this half-ellipsoid.
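The minimum-volume enclosing ellipsoid of the half-ellipsoid has a closed-form update for its center x_k and shape matrix P_k. The sketch below implements the standard central-cut update; the test function, the subgradient oracle, and the starting ellipsoid are illustrative assumptions, and the formulas require dimension n ≥ 2.

```python
import numpy as np

def ellipsoid_minimize(f, subgrad, x0, P0, iters=100):
    """Central-cut ellipsoid method: cut E_k = {z : (z-x)^T P^{-1} (z-x) <= 1}
    with the half-space {z : g^T (z - x) <= 0}, then replace E_k with the
    minimum-volume ellipsoid containing the half-ellipsoid."""
    n = len(x0)
    x, P = np.asarray(x0, float).copy(), np.asarray(P0, float).copy()
    best_x, best_f = x.copy(), f(x)
    for _ in range(iters):
        g = subgrad(x)
        gt = g / np.sqrt(g @ P @ g)                     # metric-normalized cut
        x = x - (P @ gt) / (n + 1)                      # new center
        P = (n * n / (n * n - 1.0)) * (
            P - (2.0 / (n + 1)) * np.outer(P @ gt, gt @ P)
        )                                               # new shape matrix
        if f(x) < best_f:
            best_x, best_f = x.copy(), f(x)
    return best_x, best_f

# Minimize a smooth convex quadratic; its gradient is a valid subgradient.
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2
subgrad = lambda x: np.array([2.0 * (x[0] - 1.0), 2.0 * (x[1] + 0.5)])

# Start from a ball of radius 5 around the origin, which contains the minimizer.
x_best, f_best = ellipsoid_minimize(f, subgrad, np.zeros(2), 25.0 * np.eye(2))
```

The ellipsoid volume shrinks by a fixed factor per step, so the best function value found converges to the minimum.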

Golden-section search

The golden-section search is a technique for finding the extremum of a unimodal function by successively narrowing the range of values inside which the extremum is known to exist. The technique derives its name from the fact that the algorithm maintains the function values for triples of points whose distances form a golden ratio; the algorithm is the limit of Fibonacci search for a large number of function evaluations. Fibonacci search and golden-section search were discovered by Kiefer; the discussion here is posed in terms of searching for a minimum of a unimodal function. Unlike finding a zero, where two function evaluations with opposite sign are sufficient to bracket a root, when searching for a minimum, three values are necessary; the golden-section search is an efficient way to progressively reduce the interval locating the minimum. The key is to observe that regardless of how many points have been evaluated, the minimum lies within the interval defined by the two points adjacent to the point with the least value so far evaluated.

The diagram above illustrates a single step in the technique for finding a minimum. The function values of f are on the vertical axis, and the horizontal axis is the x parameter. The value of f has been evaluated at three points, x1, x2, and x3. Since f2 is smaller than either f1 or f3, it is clear that a minimum lies inside the interval from x1 to x3. The next step in the minimization process is to "probe" the function by evaluating it at a new value of x, namely x4. It is most efficient to choose x4 somewhere inside the largest interval, i.e. between x2 and x3. From the diagram, it is clear that if the function yields f4a, then a minimum lies between x1 and x4, and the new triplet of points will be x1, x2, x4. However, if the function yields the value f4b, then a minimum lies between x2 and x3, and the new triplet of points will be x2, x4, x3. Thus, in either case, we can construct a new, narrower search interval that is guaranteed to contain the function's minimum. From the diagram above, it is seen that the new search interval will be either between x1 and x4 with a length of a + c, or between x2 and x3 with a length of b.

The golden-section search requires that these intervals be equal. If they are not, a run of "bad luck" could lead to the wider interval being used many times, thus slowing down the rate of convergence. To ensure that b = a + c, the algorithm should choose x4 = x1 + (x3 − x2). However, there still remains the question of where x2 should be placed in relation to x1 and x3; the golden-section search chooses the spacing between these points in such a way that they have the same proportion of spacing as the subsequent triple x1, x2, x4 or x2, x4, x3. By maintaining the same proportion of spacing throughout the algorithm, we avoid a situation in which x2 is very close to x1 or x3, and we guarantee that the interval width shrinks by the same constant proportion in each step. Mathematically, to ensure that the spacing after evaluating f(x4) is proportional to the spacing prior to that evaluation, if f(x4) is f4a and our new triplet of points is x1, x2, x4, then we want c/a = a/b; combined with b = a + c, this gives b/a = φ, the golden ratio.
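The bracket-shrinking scheme described above can be written in a few lines; the test function and interval below are illustrative assumptions.

```python
import math

def golden_section_search(f, a, b, tol=1e-8):
    """Locate the minimizer of a unimodal f on [a, b], shrinking the bracket
    by the constant factor 1/phi at every step and reusing one evaluation."""
    invphi = (math.sqrt(5) - 1) / 2           # 1/phi, about 0.618
    x1 = b - invphi * (b - a)                 # lower interior probe
    x2 = a + invphi * (b - a)                 # upper interior probe
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 < f2:                           # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = b - invphi * (b - a)
            f1 = f(x1)
        else:                                 # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + invphi * (b - a)
            f2 = f(x2)
    return (a + b) / 2

x_min = golden_section_search(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
```

Because the probes keep the golden-ratio proportion, one of the two interior points from the previous step is always reused, so each iteration costs only a single new function evaluation.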

Linear programming

Linear programming is a method to achieve the best outcome in a mathematical model whose requirements are represented by linear relationships. Linear programming is a special case of mathematical programming. More formally, linear programming is a technique for the optimization of a linear objective function, subject to linear equality and linear inequality constraints. Its feasible region is a convex polytope, a set defined as the intersection of finitely many half-spaces, each of which is defined by a linear inequality. Its objective function is a real-valued affine function defined on this polyhedron. A linear programming algorithm finds a point in the polyhedron where this function has the largest (or smallest) value, if such a point exists. Linear programs are problems that can be expressed in canonical form as: maximize cᵀx subject to Ax ≤ b and x ≥ 0, where x represents the vector of variables, c and b are vectors of coefficients, A is a matrix of coefficients, and (·)ᵀ denotes the matrix transpose; the expression to be maximized or minimized is called the objective function.
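A canonical-form linear program can be solved with an off-the-shelf solver. The sketch below assumes SciPy is available and uses its `linprog` routine; since `linprog` minimizes, the objective vector is negated. The particular c, A, and b are an illustrative example, not taken from the text.

```python
import numpy as np
from scipy.optimize import linprog   # assumes SciPy is installed

# Canonical form: maximize c^T x subject to A x <= b, x >= 0.
c = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0],
              [1.0, 3.0]])
b = np.array([4.0, 6.0])

# linprog minimizes, so pass -c and negate the optimal value afterwards.
res = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None), (0, None)],
              method="highs")
optimum = -res.fun                   # maximum of the original objective
```

For this instance the feasible region is a polygon with vertices (0, 0), (4, 0), (3, 1), and (0, 2), and the maximum of 3x + 2y is attained at the vertex (4, 0).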

The inequalities Ax ≤ b and x ≥ 0 are the constraints, which specify a convex polytope over which the objective function is to be optimized. In this context, two vectors are comparable: if every entry in the first is less than or equal to the corresponding entry in the second, then the first vector is said to be less than or equal to the second vector. Linear programming can be applied to various fields of study. It is widely used in mathematics and, to a lesser extent, in business and for some engineering problems. Industries that use linear programming models include transportation, telecommunications, and manufacturing; it has proven useful in modeling diverse types of problems in planning, scheduling, and design. The problem of solving a system of linear inequalities dates back at least as far as Fourier, who in 1827 published a method for solving them, and after whom the method of Fourier–Motzkin elimination is named. In 1939 a linear programming formulation of a problem equivalent to the general linear programming problem was given by the Soviet economist Leonid Kantorovich, who also proposed a method for solving it.

Kantorovich developed it during World War II as a way to plan expenditures and returns in order to reduce costs of the army and to increase losses imposed on the enemy. Kantorovich's work was initially neglected in the USSR. About the same time as Kantorovich, the Dutch-American economist T. C. Koopmans formulated classical economic problems as linear programs. Kantorovich and Koopmans shared the 1975 Nobel prize in economics. In 1941, Frank Lauren Hitchcock had also formulated transportation problems as linear programs and gave a solution similar to the simplex method; Hitchcock, however, had died in 1957, and the Nobel prize is not awarded posthumously. During 1946–1947, George B. Dantzig independently developed a general linear programming formulation to use for planning problems in the US Air Force. In 1947, Dantzig invented the simplex method, which for the first time efficiently tackled the linear programming problem in most cases. When Dantzig arranged a meeting with John von Neumann to discuss his simplex method, von Neumann conjectured the theory of duality by realizing that the problem he had been working on in game theory was equivalent.

Dantzig provided formal proof in an unpublished report, "A Theorem on Linear Inequalities", on January 5, 1948. In the post-war years, many industries applied it in their daily planning. Dantzig's original example was to find the best assignment of 70 people to 70 jobs; the computing power required to test all the permutations in order to select the best assignment is vast. However, it takes only a moment to find the optimum solution by posing the problem as a linear program and applying the simplex algorithm; the theory behind linear programming drastically reduces the number of possible solutions that must be checked. The linear programming problem was first shown to be solvable in polynomial time by Leonid Khachiyan in 1979, but a larger theoretical and practical breakthrough in the field came in 1984, when Narendra Karmarkar introduced a new interior-point method for solving linear programming problems. Linear programming is a widely used field of optimization for several reasons. Many practical problems in operations research can be expressed as linear programming problems.

Certain special cases of linear programming, such as network flow problems and multicommodity flow problems, are considered important enough to have generated much research on specialized algorithms for their solution. A number of algorithms for other types of optimization problems work by solving LP problems as sub-problems. Ideas from linear programming have inspired many of the central concepts of optimization theory, such as duality and the importance of convexity and its generalizations. Linear programming was used in the early formation of microeconomics.

Frank–Wolfe algorithm

The Frank–Wolfe algorithm is an iterative first-order optimization algorithm for constrained convex optimization. Also known as the conditional gradient method, the reduced gradient algorithm, and the convex combination algorithm, the method was proposed by Marguerite Frank and Philip Wolfe in 1956. In each iteration, the Frank–Wolfe algorithm considers a linear approximation of the objective function and moves towards a minimizer of this linear function over the same domain. Suppose D is a compact convex set in a vector space and f : D → R is a convex, differentiable real-valued function. The Frank–Wolfe algorithm solves the optimization problem: minimize f(x) subject to x ∈ D.
Initialization: Let k ← 0, and let x0 be any point in D.
Step 1. Direction-finding subproblem: Find s_k solving: minimize sᵀ∇f(x_k) subject to s ∈ D.
Step 2. Step size determination: Set γ ← 2/(k + 2), or alternatively find the γ that minimizes f(x_k + γ(s_k − x_k)) subject to 0 ≤ γ ≤ 1.
Step 3. Update: Let x_{k+1} ← x_k + γ(s_k − x_k), let k ← k + 1, and go to Step 1.
While competing methods such as gradient descent for constrained optimization require a projection step back to the feasible set in each iteration, the Frank–Wolfe algorithm only needs the solution of a linear problem over the same set in each iteration, and automatically stays in the feasible set.
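The steps above can be sketched in a short script. The problem instance below (a quadratic over the probability simplex) and the helper names are illustrative assumptions; the simplex is a convenient domain because its linear minimization oracle simply picks the vertex with the smallest gradient coordinate.

```python
import numpy as np

def frank_wolfe(grad, x0, lmo, iters=2000):
    """Frank-Wolfe: at each step, minimize the linearization s^T grad_f(x_k)
    over the feasible set via the linear minimization oracle `lmo`, then move
    toward the minimizer with the standard step size gamma = 2/(k + 2)."""
    x = x0.copy()
    for k in range(iters):
        s = lmo(grad(x))             # Step 1: direction-finding subproblem
        gamma = 2.0 / (k + 2.0)      # Step 2: diminishing step size
        x = x + gamma * (s - x)      # Step 3: convex combination stays in D
    return x

# Minimize ||x - p||^2 over the probability simplex {x : x >= 0, sum x = 1}.
p = np.array([0.2, 0.5, 0.3])
grad = lambda x: 2.0 * (x - p)

def simplex_lmo(g):
    """LMO over the simplex: the minimizing vertex is a standard basis
    vector, so every iterate is a sparse convex combination of vertices."""
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s

x0 = np.array([1.0, 0.0, 0.0])       # start at a vertex of the simplex
x_star = frank_wolfe(grad, x0, simplex_lmo)
```

No projection is ever needed: each update is a convex combination of feasible points, so the iterates remain in the simplex by construction.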

The convergence of the Frank–Wolfe algorithm is sublinear in general: the error in the objective function relative to the optimum is O(1/k) after k iterations, so long as the gradient is Lipschitz continuous with respect to some norm. The same convergence rate can be shown if the sub-problems are only solved approximately. The iterates of the algorithm can always be represented as a sparse convex combination of the extreme points of the feasible set, which has contributed to the popularity of the algorithm for sparse greedy optimization in machine learning and signal processing problems, as well as, for example, the optimization of minimum-cost flows in transportation networks. If the feasible set is given by a set of linear constraints, then the subproblem to be solved in each iteration becomes a linear program. While the worst-case convergence rate of O(1/k) cannot be improved in general, faster convergence can be obtained for special problem classes, such as some strongly convex problems. Since f is convex, for any two points x, y ∈ D we have: f(y) ≥ f(x) + (y − x)ᵀ∇f(x). This also holds for the (unknown) optimal solution x*.

That is, f(x*) ≥ f(x_k) + (x* − x_k)ᵀ∇f(x_k). The best lower bound with respect to a given point x is given by f(x*) ≥ min over y ∈ D of f(x) + (y − x)ᵀ∇f(x).

Integer programming

An integer programming problem is a mathematical optimization or feasibility program in which some or all of the variables are restricted to be integers. In many settings the term refers to integer linear programming (ILP), in which the objective function and the constraints (other than the integrality constraints) are linear. Integer programming is NP-complete. In particular, the special case of 0–1 integer linear programming, in which the unknowns are binary and only the restrictions must be satisfied, is one of Karp's 21 NP-complete problems. If only some of the decision variables are required to be discrete, the problem is known as a mixed-integer programming problem. An integer linear program in canonical form is expressed as: maximize cᵀx subject to Ax ≤ b, x ≥ 0, x ∈ Zⁿ, and an ILP in standard form is expressed as: maximize cᵀx subject to Ax + s = b, s ≥ 0, x ≥ 0, x ∈ Zⁿ, where c and b are vectors and A is a matrix, all with integer entries. As with linear programs, ILPs not in standard form can be converted to standard form by eliminating inequalities, introducing slack variables (s), and replacing variables that are not sign-constrained with the difference of two sign-constrained variables. The plot on the right shows the following problem.

Maximize y subject to −x + y ≤ 1, 3x + 2y ≤ 12, 2x + 3y ≤ 12, x, y ≥ 0, x, y ∈ Z.
The feasible integer points are shown in red, and the red dashed lines indicate their convex hull, the smallest convex polyhedron that contains all of these points. The blue lines together with the coordinate axes define the polyhedron of the LP relaxation, which is given by the inequalities without the integrality constraint. The goal of the optimization is to move the black dotted line as far upward as possible while still touching the polyhedron. The optimal solutions of the integer problem are the points (1, 2) and (2, 2), which both have an objective value of 2; the unique optimum of the relaxation is (1.8, 2.8), with an objective value of 2.8. Note that if the solution of the relaxation is rounded to the nearest integers, it is not feasible for the ILP. The following is a reduction from minimum vertex cover to integer programming that will serve as the proof of NP-hardness. Let G = (V, E) be an undirected graph. Define a linear program as follows: minimize the sum over v ∈ V of y_v, subject to y_v + y_u ≥ 1 for all (u, v) ∈ E, y_v ≥ 0 for all v ∈ V, and y_v ∈ Z for all v ∈ V. Given that the constraints limit y_v to either 0 or 1 in any optimal solution, any feasible solution to the integer program is a subset of vertices.
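Because the example's feasible region is tiny, it can be checked by exhaustive enumeration; the sketch below is a brute-force verification of the example, not a practical ILP method.

```python
from itertools import product

# The example ILP: maximize y subject to -x + y <= 1, 3x + 2y <= 12,
# 2x + 3y <= 12, x, y >= 0, x, y integer. The constraints confine all
# feasible points to a small box, so we can simply enumerate them.
feasible = [
    (x, y)
    for x, y in product(range(0, 7), repeat=2)
    if -x + y <= 1 and 3 * x + 2 * y <= 12 and 2 * x + 3 * y <= 12
]

best = max(y for _, y in feasible)                   # optimal objective value
optima = [(x, y) for x, y in feasible if y == best]  # all optimal points

# Rounding the LP-relaxation optimum (1.8, 2.8) to (2, 3) violates
# 2x + 3y <= 12, so the rounded point is not in `feasible`.
```

Enumeration confirms the values stated above: the integer optimum is 2, attained at (1, 2) and (2, 2), while the rounded relaxation optimum is infeasible.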

The first constraint implies that at least one endpoint of every edge is included in this subset; therefore, the solution describes a vertex cover. Additionally, given some vertex cover C, setting y_v = 1 for every v ∈ C and y_v = 0 otherwise yields a feasible solution to the integer program, so the optima of the two problems coincide.