Generalized Quadratic Augmented Lagrangian Methods with Nonmonotone Penalty Parameters

Zhu, Xunzhi; Zhou, Jinchuan; Pan, Lili; Zhao, Wenling

doi:https://doi.org/10.1155/2012/181629

Journal of Applied Mathematics

On this page

Abstract Introduction Conclusions Disclosure Acknowledgments References Copyright Related Articles

Research Article | Open Access

Volume 2012 | Article ID 181629 | https://doi.org/10.1155/2012/181629

Generalized Quadratic Augmented Lagrangian Methods with Nonmonotone Penalty Parameters

Xunzhi Zhu,¹Jinchuan Zhou,¹Lili Pan,¹and Wenling Zhao¹

Academic Editor: Ying U. Hu

Received15 Jan 2012

Accepted21 Mar 2012

Published12 Jun 2012

Abstract

For nonconvex optimization problem with both equality and inequality constraints, we introduce a new augmented Lagrangian function and propose the corresponding multiplier algorithm. New iterative strategy on penalty parameter is presented. Different global convergence properties are established depending on whether the penalty parameter is bounded. Even if the iterative sequence is divergent, we present a necessary and sufficient condition for the convergence of to the optimal value. Finally, preliminary numerical experience is reported.

1. Introduction

This paper is concerned with the following nonlinear programming problem: where for and for are all continuously differentiable functions, and is a nonempty and closed subset in . Denoted by the feasible region and by the solution set.

The classical Lagrangian function for this problem is where . It is well known that the above classic Lagrangian function would fail to find out the optimal solution of primal problem because a nonzero duality gap may be arisen for nonconvex programming. Augmented Lagrangians were proposed to overcome this difficulty. It combines the advantage from Lagrangian duality methods and penalty function methods; for example, it does not need to make the penalty parameter infinitely large, and hence successfully avoid ill-conditioning effects. In fact, if the penalty parameter is very large, the Lagrangian relaxation subproblem would be very difficult in the sense that the stabilized Newton step is not regarded as a candidate to be a decreasing direction.

Global convergence properties of augmented Lagrangian methods have been studied by many researchers [1–8]. It should be noted that the multiplier sequence generated by the algorithm may be unbounded. Several strategies have proposed to deal with these questions, such as adding constraint qualifications [5, 6] or via safeguarding projection [1, 2, 9, 10].

In this paper, for the optimization problem () with both equality and inequality constraints, we introduce a new class of augmented Lagrangian functions, which include the well-known quadratic augmented Lagrangian function as special case [11–13]. It should be noted that this function class is more general since we do not restrict to be convex (see (2.1) below for the definition of ). Convergence properties of the corresponding multiplier algorithm are studied. Specially, a new updated strategy on penalty parameter is proposed, that is, the penalty parameter is improved only when the multipliers are too large, and the iterative point is far away from the feasible region. This strategy guarantees that the penalty parameter remains unchanged, provided that measure of infeasibility of iterative point and bounding of multipliers have enough progress (see (2.4)). Furthermore, we study the performance of the multiplier sequences generated by our algorithm. Finally, compared with [4, 9, 10], we further consider the case when is divergent, in which a necessary and sufficient condition for converging to the optimal value is given as well.

The organization of this paper is as follows. In the next section, we propose the multiplier algorithm and study the global convergence properties. Preliminary numerical results are reported in Section 3. Conclusion is drawn in Section 4.

2. Multiplier Algorithms

A new generalized quadratic augmented Lagrangian function for () is defined as follows: where , and denotes the all positive real scalars, that is, . The function involved in (2.1) satisfies the following properties:continuously differentiable and strictly increasing on with and for .

Particularly, if for , then reduces to the quadratic augmented Lagrangian function. Compared with [9, 14, 15], an important point made above is that is not required to be convex. Hence, it makes the augmented Lagrangian we introduce here more general.

Given , the Lagrangian relaxation subproblem associated with the augmented Lagrangian is defined as: Its solution set is denoted by . We always assume throughout this paper that the set is nonempty, which is ensured if is compact, since , , and are all continuous. In addition, we assume that: is bounded from below, that is, .

It is a rather mild assumption in optimization problem because otherwise the objective function can be replaced by . Recall that a vector is said to be a stationary point of () if it is a feasible point and there exist for all and for such that the following system holds: where denotes the normal cone of at [16, Chapter 6]. Let denote the collection of multipliers satisfying (2.2). The set of is larger than that of R-multiplier defined in [17], which reduces to the well-known Karush-Kuhn-Tucker (KKT) point of () when , since the complementarity condition is required for KKT point. The following is the multiplier algorithm based on the generalized quadratic augmented Lagrangian . One of its main feature is that the Lagrangian multipliers associated with equality and inequality constraints are not restricted to be bounded.

Algorithm 2.1 (multiplier algorithm based on ). Step 1. (initialization)
Let and be an arbitrary positive sequence with zero as limit i.e., . Select an initial point , , , and . Set .
Step 2. (estimating multipliers)
Compute
Step 3. (updating penalty parameter)
Let
Step 4. (solving lagrangian relaxation subproblem)
Find .
Step 5. (cycling)
If and , then stop; otherwise, let and go back to Step 2.

Consider the perturbation of (); precisely, for any fixed , define the perturbation value function as follows: where It is clear that . In practical implementation of the previous algorithm, we always choose the constant large enough to strictly include , the set of subdifferential of at origin, since according to the strong duality theory, the optimal Lagrangian multipliers belong to ; see [18] for the detailed discussion. In [5, 6], the authors require the convergence of to zero. It follows from (2.5) that this nice property holds automatically in our new strategy. In addition, (2.4) and (2.5) indicate that Lagrangian relaxation subproblem needs to place more emphasis on decreasing the constraint violations via increasing the penalty parameter, provided that the current point is far away from feasible region.

Lemma 2.2. Let be the iterative sequence generated by Algorithm 2.1, and let be one of its accumulate points. If is bounded, then is a stationary point of ().

Proof. If the algorithm is terminated finitely, then the result follows immediately according to the stop criterion in Step 5. On the other hand, if the condition (2.5) occurs infinitely, then , and hence must be unbounded. Therefore, by hypothesis we knew that only condition (2.4) occurs when is large enough. Because and by (2.4), taking the limit yields and , since converges to zero. This shows the feasibility of . Meanwhile, we know from (2.4) that and are bounded. Hence, we can assume without loss of generality that and . Since is an optimal solution of minimizing over by Step 4, then according to the well-known optimality conditions we have which together with the formula of (2.3) and (2.4) yields Taking limits gives us where we have used the upper semicontinuity of normal cone [16, Chapter 6]. This completes the proof.

Now, we turn our main concern to the case when is unbounded.

Theorem 2.3. Let be the iterative sequence generated by Algorithm 2.1, and let be one of its accumulate points. If is unbounded, then is an optimal solution of ().

Proof. It is clear that if is unbounded, then the iteration formula (2.5) occurs infinitely. Hence, for simplification, we assume that (2.5) always happens as sufficiently large, and that converges to (not resorting to a subsequence). Taking into account of (2.5), the terms are all converges to zero. This fact will be used in the following analysis.
Let us first show that is a feasible point of (). We argue it by contradiction. Note first that where the third and fourth steps are due to the feasibility of . This implies that
Now, let us consider the following two cases.Case 1. There exists an index such that . Due to the continuity of , we must have for is large enough. Therefore, Noticing that as mentioned above and , we know that as is large enough which further implying This, together with (2.13), yields since approaches to , which contradicts (2.12).
Case 2. There exists an index such that . Similarly, due to the continuity of , we know that as is sufficiently large. Therefore, where the second inequality comes from the fact for all by , and the last inequality follows from the nonnegativity of by (2.3), since for all due to the monotonicity of . Invoking (2.5) and taking limits in the above inequality yields , which contradicts (2.12). So far, we have established the feasibility of .
It remains to show that is an optimal solution. In fact, notice that where the second inequality is due to (2.11). Taking limits in the above yields where the last step comes from the fact that and converge to zero (see (2.5)). This means that must be an optimal solution of (). The proof is complete.

Finally, let us deal with the performance of multiplier sequences and generated by our proposed algorithm.

Theorem 2.4. Let be an iterative sequence generated by Algorithm 2.1, and let be one of its accumulate points. The following statements hold:(i)if both and are bounded, then we have where and are accumulate points of and ;(ii)if either or is unbounded, then we have where and are accumulate points of and with

Proof. The proof is divided into the following two cases.
Case 1. Both and are bounded. In this case, we can obtain that is a stationary point of () by following almost the same argument as in Lemma 2.2.
Case 2. Either or is divergent. According to the updated strategy on penalty parameter , we know that (2.5) must occur as sufficiently large. Hence, is unbounded in this case. Since is a global optimal solution of over by Step 4 in Algorithm 2.1, applying the optimality condition to the augmented Lagrangian relaxation problem yields that is,
Since and are bounded, we can assume without loss of generality that Clearly, and are not all zeros. Dividing by in both sides of (2.23) and using the fact that is cone, we have Taking limits in the above yields This establishes the result as desired.

Now, a natural question arises: how does the algorithm perform if is divergent? The following theorem gives the answer.

Theorem 2.5. Let be an iterative sequence generated by Algorithm 2.1. If is unbounded, then the following statements are equivalent:(a) converges to the optimal value, that is, (b) is lower semicontinuous at from right, that is, ;(c) is continuous at from right, that is, .

Proof. . Suppose on the contrary that is not lower semicontinuous at zero from right, then there must exist and such that For any given , since we can choose a subsequence satisfying which together with the continuity of implies that
In addition, pick with . Then by (2.28).
Therefore, where the last step is due to the fact and since and is nondecreasing by . Taking the limits in both sides of (2.31) and using (2.27) and (2.30) yield which leads to a contradiction.
. Since converges to , it then follows from Theorem 2.3 that is an optimal solution of (). Hence, for any arbitrary , we have and (since and as is sufficiently large). The latter further implies that by definition (2.6), that is, Since is lower semicontinuous at from right by hypothesis, taking the lower limitation in (2.33) yields that is,
. It is clear.
. It follows by invoking the fact that , since for all . Then , that is, is upper semicontinuous at origin from right. This together with the lower semicontinuous of at origin by hypothesis yields the desired result.

3. Preliminary Numerical Reports

To give some insight into the behavior of our proposed algorithm presented in this paper, we solve two problems by letting take the following different functions:(1); (2); (3)(4)

The computer codes were written in Matlab 7.0, and the test was done at a PC of Pentium 4 with 2.8 GHz CPU and 1.99 GB memory. We implement our algorithm to solve the following programming problems, where Example 3.1 is obtained by adding an equality constraint in Example 6.3 in [19], and Example 3.2 comes from [20]. The corresponding numerical results are reported below, where is the number of iterations, is the penalty parameter, is iterative point, and is the objective function value.

Example 3.1. Consider the following nonconvex programming:

Example 3.2.

In the implementation of our algorithm, we use the BFGS quasi-Newton method with a mixed quadratic and cubic line search procedure to solve the Lagrangian subproblem: (i)for Example 3.1, the initial data are , , , , and , ; remains unchanged because is taken large enough to ensure the validity of (2.4); the obtained solution is a stationary point with the corresponding multipliers and , and the inequality is active;(ii)for Example 3.2, the result starts from the nonfeasible point , , , , , and ; the obtained solution is also a stationary point with the corresponding multipliers and , and the first inequality constraint is active.

The above preliminary numerical result as shown in Tables 1 and 2 and Figures 1 and 2 indicates that Algorithm 2.1 may have better numerical performance, when is nonlinear (see and ), and is nonconvex (see ) because the decreasing steep is more “quickly” in the beginning of the iterative, which leads to the requirement of fewer iterative steps. Our next research topic is to further study the performance of our algorithm for solving various practical questions.

4. Conclusions

We introduce a new class of generalized quadratic augmented Lagrangian function with nonconvex functions. Global convergence properties of the corresponding multiplier algorithm are proposed without requiring the boundedness of Lagrangian multiplier sequence. Penalty parameter is improved only if multiplier sequences are too large and the iterative point is far away from feasible region. Necessary and sufficient conditions for the convergence of to the optimal value are established as well.

Disclosure

The research of J. Zhou and X. Zhu was partly supported by National Natural Science Foundation of China (11101248), Shandong Province Natural Science Foundation (ZR2010AQ026), and Young Teacher Support Program of Shandong University of Technology. The research of L. Pan was partly supported by National Natural Science Foundation of China (11171247 and 71101140). The research of W. Zhao was partly supported by National Natural Science Foundation of China (10971118).

Acknowledgments

The authors appreciate very much the constructive and excellent comments from reviewers, which greatly improved the presentation of the paper. In particular, one of reviewers drew our attention to the paper [17], where the detailed relationship between various major constraint qualifications is given.

References

R. Andreani, E. G. Birgin, J. M. Martínez, and M. L. Schuverdt, “On augmented Lagrangian methods with general lower-level constraints,” SIAM Journal on Optimization, vol. 18, no. 4, pp. 1286–1309, 2007.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
R. Andreani, E. G. Birgin, J. M. Martínez, and M. L. Schuverdt, “Augmented Lagrangian methods under the constant positive linear dependence constraint qualification,” Mathematical Programming, vol. 111, no. 1-2, pp. 5–32, 2008.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods, Academic Press, New York, NY, USA, 1982.
E. G. Birgin, C. A. Floudas, and J. M. Martínez, “Global minimization using an augmented Lagrangian method with variable lower-level constraints,” Mathematical Programming, vol. 125, no. 1, pp. 139–162, 2010.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
A. R. Conn, N. I. M. Gould, and P. L. Toint, “A globally convergent augmented Lagrangian algorithm for optimization with general constraints and simple bounds,” SIAM Journal on Numerical Analysis, vol. 28, no. 2, pp. 545–572, 1991.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
A. R. Conn, N. Gould, A. Sartenaer, and Ph. L. Toint, “Convergence properties of an augmented Lagrangian algorithm for optimization with a combination of general equality and linear constraints,” SIAM Journal on Optimization, vol. 6, no. 3, pp. 674–703, 1996.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
R. M. Lewis and V. Torczon, “A globally convergent augmented Lagrangian pattern search algorithm for optimization with general constraints and simple bounds,” SIAM Journal on Optimization, vol. 12, no. 4, pp. 1075–1089, 2002.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
A. Rubinov and X. Yang, Lagrange-Type Functions in Constrained Non-Convex Optimization, vol. 85, Kluwer Academic, Dordrecht, The Netherlands, 2003.
H. Z. Luo, X. L. Sun, and D. Li, “On the convergence of augmented Lagrangian methods for constrained global optimization,” SIAM Journal on Optimization, vol. 18, no. 4, pp. 1209–1230, 2007.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
H. Z. Luo, X. L. Sun, and Y. F. Xu, “Convergence properties of modified and partially-augmented Lagrangian methods for mathematical programs with complementarity constraints,” Journal of Optimization Theory and Applications, vol. 145, no. 3, pp. 489–506, 2010.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
M. R. Hestenes, “Multiplier and gradient methods,” Journal of Optimization Theory and Applications, vol. 4, pp. 303–320, 1969.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
M. J. D. Powell, “A Method for Nonlinear Constraints in Minimization Problems,” in Optimization, R. Fletcher, Ed., pp. 283–298, Academic Press, New York, NY, USA, 1969.
View at: Google Scholar | Zentralblatt MATH
R. T. Rockafellar, “Augmented Lagrange multiplier functions and duality in nonconvex programming,” SIAM Journal on Control and Optimization, vol. 12, pp. 268–285, 1974.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
X. L. Sun, D. Li, and K. I. M. McKinnon, “On saddle points of augmented Lagrangians for constrained nonconvex optimization,” SIAM Journal on Optimization, vol. 15, no. 4, pp. 1128–1146, 2005.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
C.-Y. Wang and D. Li, “Unified theory of augmented Lagrangian methods for constrained global optimization,” Journal of Global Optimization, vol. 44, no. 3, pp. 433–458, 2009.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
R. T. Rockafellar and R. J.-B. Wets, Variational Analysis, vol. 317, Springer, Berlin, Germany, 1998.
View at: Publisher Site
D. P. Bertsekas and A. E. Ozdaglar, “Pseudonormality and a Lagrange multiplier theory for constrained optimization,” Journal of Optimization Theory and Applications, vol. 114, no. 2, pp. 287–343, 2002.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
J. F. Bonnans and A. Shapiro, Perturbation Analysis of Optimization Problems, Springer, New York, NY, USA, 2000.
K. L. Teo, X. Q. Yang, and L. S. Jennings, “Computational discretization algorithms for functional inequality constrained optimization,” Annals of Operations Research, vol. 98, pp. 215–234, 2000.
View at: Publisher Site | Google Scholar | Zentralblatt MATH
R. Y. K. Fung, J. Tang, and D. Wang, “Extension of a hybrid genetic algorithm for nonlinear programming problems with equality and inequality constraints,” Computers & Operations Research, vol. 29, no. 3, pp. 261–274, 2002.
View at: Publisher Site | Google Scholar | Zentralblatt MATH

Copyright

Copyright © 2012 Xunzhi Zhu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

PDF Download Citation

Download other formats

Order printed copies

Views

1514

Downloads

1001

Citations