Abstract
In this paper, we propose an accelerated variant of the proximal point algorithm for computing a zero of an arbitrary maximally monotone operator A. The method incorporates an inertial term (based upon the acceleration techniques introduced by Nesterov), relaxation factors and a correction term. In a general Hilbert space setting, we obtain weak convergence of the iterates \((x_n)\) to equilibria, together with the fast rate \(\Vert x_{n+1}-x_{n}\Vert =o(n^{-1})\) for the discrete velocity. In particular, when using constant proximal indexes, we establish convergence of \((x_n)\) to a zero of A with the worst-case rate \(\Vert A_{\lambda }(x_n)\Vert =o(n^{-1})\) (\(A_{\lambda }\) being the Yosida regularization of A with index \(\lambda \)). Furthermore, given any positive integer p and using an appropriate adjustment of increasing proximal indexes, we obtain the worst-case convergence rate \(\Vert A_{\lambda }(x_n)\Vert =o(n^{-(p+1)})\). This acceleration process applies to various proximal-type algorithms, such as the augmented Lagrangian method, the alternating direction method of multipliers and operator splitting methods. Our methodology relies on a Lyapunov analysis combined with a suitable reformulation of the considered algorithm. Numerical experiments are also reported.
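As a minimal illustration only (not the paper's full scheme, which also involves relaxation factors and a correction term), the following Python sketch shows the basic inertial proximal point iteration, assuming the resolvent \(J_{\lambda A}=(I+\lambda A)^{-1}\) is available as a callable; all parameter values and names here are hypothetical:

```python
import numpy as np

def inertial_prox_point(J, x0, n_iters=200, theta=0.9):
    """Sketch of an inertial proximal point iteration:
    an extrapolation (inertial) step followed by a resolvent (proximal) step.
    J: callable computing the resolvent J_{lam A}(y) = (I + lam*A)^{-1}(y)."""
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(n_iters):
        y = x + theta * (x - x_prev)   # inertial extrapolation
        x_prev, x = x, J(y)            # proximal (resolvent) step
    return x

# Toy example: A = gradient of f(x) = ||x||^2 / 2, so J_{lam A}(y) = y / (1 + lam)
J = lambda y: y / 2.0                  # resolvent with lam = 1
x_star = inertial_prox_point(J, np.array([5.0, -3.0]))
```

For this strongly monotone toy operator the iterates contract to the unique zero at the origin; the general theory of the paper covers arbitrary maximally monotone operators, where only weak convergence holds.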
References
Attouch, H.: Fast inertial proximal ADMM algorithms for convex structured optimization with linear constraint. Minimax Theory Appl. 6(1), 1–24 (2021)
Attouch, H., László, S.C.: Newton-like inertial dynamics and proximal algorithms governed by maximally monotone operators. SIAM J. Optim. 30(4), 3252–3283 (2020)
Attouch, H., László, S.C.: Continuous Newton-like inertial dynamics for monotone inclusions. Set-Valued Var. Anal. (2020). https://doi.org/10.1007/s11228-020-00564-y
Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: First-order optimization algorithms via inertial systems with Hessian driven damping, arXiv preprint, arXiv:1907.10536 (2019)
Attouch, H., Peypouquet, J.: Convergence of inertial dynamics and proximal algorithms governed by maximal monotone operators. Math. Programm. 174, 391–432 (2019)
Attouch, H., Peypouquet, J.: Convergence Rate of Proximal Inertial Algorithms Associated with Moreau Envelopes of Convex Functions. In: Bauschke, H., Burachik, R., Luke, D. (eds.) Splitting Algorithms, Modern Operator Theory, and Applications. Springer, Cham (2019)
Attouch, H., Soueycatt, M.: Augmented Lagrangian and proximal alternating direction methods of multipliers in Hilbert spaces. Applications to games, PDE’s and control. Pac. J. Optim. 5(1), 17–37 (2009)
Attouch, H., Svaiter, B.F.: A continuous dynamical Newton-like approach to solving monotone inclusions. SIAM J. Control Optim. 49(2), 574–598 (2011)
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, New York (2017)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
Boţ, R.I., Csetnek, E.R.: Second order forward-backward dynamical systems for monotone inclusion problems. SIAM J. Control Optim. 54, 1423–1443 (2016)
Boţ, R.I., Csetnek, E.R.: ADMM for monotone operators: convergence analysis and rates. Adv. Comput. Math. 45(1), 327–359 (2019)
Brezis, H.: Opérateurs maximaux monotones, Mathematical Studies, vol. 5. North-Holland, Amsterdam (1973)
Brezis, H.: Functional Analysis, Sobolev Spaces and Partial Differential Equations. Springer, New York (2010). https://doi.org/10.1007/978-0-387-70914-7
Brezis, H., Lions, P.L.: Produits infinis de résolvantes. Isr. J. Math. 29, 329–345 (1978)
Chambolle, A., Dossal, C.: On the convergence of the iterates of Fista. J. Optim. Theory Appl. 166(3), 968–982 (2015)
Combettes, P.L.: Monotone operator theory in convex optimization. Math. Programm. Ser. B 170(1), 177–206 (2018)
Corman, E., Yuan, X.: A generalized proximal point algorithm and its convergence rate. SIAM J. Optim. 24(4), 1614–1638 (2014)
Douglas, J., Rachford, H.H.: On the numerical solution of heat conduction problems in two and three space variables. Trans. Am. Math. Soc. 82, 421–439 (1956)
Drori, Y., Teboulle, M.: Performance of first-order methods for smooth convex minimization: a novel approach. Math. Programm. 145(1–2), 451–482 (2014)
Eckstein, J., Bertsekas, D.P.: On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Programm. 55(1–3), 293–318 (1992)
Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite-element approximations. Comput. Math. Appl. 2(1), 17–40 (1976)
Gu, G., Yang, J.: Optimal nonergodic sublinear convergence rate of proximal point algorithm for maximal monotone inclusion problems, (2019), arXiv:1904.05495
Güler, O.: On the convergence of the proximal point algorithm for convex minimization. SIAM J. Control Optim. 29, 403–419 (1991)
Güler, O.: New proximal point algorithms for convex minimization. SIAM J. Optim. 2(4), 649–664 (1992)
Hestenes, M.R.: Multiplier and gradient methods. J. Optim. Theory Appl. 4, 302–320 (1969)
Kim, D.: Accelerated proximal point method for maximally monotone operators, (2019), arXiv:1905.05149
Kim, D., Fessler, J.A.: Another look at the Fast Iterative Shrinkage/Thresholding Algorithm (FISTA). SIAM J. Optim. 28(1), 223–250 (2018)
Kim, D., Fessler, J.A.: Generalizing the optimized gradient method for smooth convex minimization. SIAM J. Optim. 28(2), 1920–1950 (2018)
Lemaire, B.: The proximal algorithm. In: Penot, J.P. (ed.) New Methods in Optimization and Their Industrial Uses, Internat. Ser. Numer. Math., vol. 87, pp. 73–87. Birkhäuser, Basel (1989)
Lin, H., Mairal, J., Harchaoui, Z.: Catalyst acceleration for first-order convex optimization: from theory to practice. J. Mach. Learning Res. 18(212), 1–54 (2018)
Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979)
Lorenz, D.A., Pock, T.: An inertial forward-backward algorithm for monotone inclusions. J. Math. Imaging Vis. 51, 311–325 (2015)
Maingé, P.E.: First-order continuous Newton-like systems for monotone inclusions. SIAM J. Control Optim. 51(2), 1615–1638 (2013)
Martinet, B.: Régularisation d'inéquations variationnelles par approximations successives. Rev. Fr. Inform. Rech. Opér. 4, 154–158 (1970)
Nesterov, Y.: A method of solving a convex programming problem with convergence rate O(1/k2). Soviet Math. Doklady 27, 372–376 (1983)
Nesterov, Y.: A method for unconstrained convex minimization problem with the rate of convergence O(1/k2). Dokl. Akad. Nauk. USSR 269(3), 543–547 (1983)
Nesterov, Y.: Gradient methods for minimizing composite objective function. Math. Programm. Ser. B 140, 125–161 (2013)
Peaceman, D.W., Rachford, H.H.: The numerical solution of parabolic and elliptic differential equations. J. Soc. Ind. Appl. Math. 3(1), 28–41 (1955)
Powell, M.J.D.: A method for nonlinear constraints in minimization problems. In: Fletcher, R. (ed.) Optimization, pp. 283–298. Academic Press, New York (1969)
Rockafellar, R.T.: Monotone operators associated with saddle functions and minimax problems. In: F. E. Browder, (ed), Nonlinear Functional Analysis, Part 1. Symposia in Pure Math., vol. 18, American Mathematical Society, Providence, RI., pp. 397–407 (1970)
Rockafellar, R.T.: Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Math. Oper. Res. 1(2), 97–116 (1976)
Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control. Opt. 14(5), 877–898 (1976)
Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)
Takahashi, W.: Nonlinear Functional Analysis. Yokohama Publishers, Yokohama (2000)
Zhang, X.Q., Burger, M., Bresson, X., Osher, S.: Bregmanized nonlocal regularization for deconvolution and sparse reconstruction. SIAM J. Imaging Sci. 3(3), 253–276 (2010)
APPENDIX.
1.1 Proof of Proposition 1.
In order to get this result, we compute the discrete derivative \({\dot{G}}_{n+1}(s,q):={G}_{n+1}(s,q)-{G}_{n}(s,q)\). For the reader's convenience, we recall that \(G_n(s,q)\) is given from (2.10) by
It is then readily noticed that \({\dot{G}}_{n+1}(s,q)\) can be formulated as
where
Note also that for any bilinear symmetric form \(\langle .,.\rangle _E\) on a real vector space E and for any sequences \(\{ \phi _{n} , \varphi _{n} \} \subset E\) we have the discrete derivative rules:
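The displayed rules (A.3a)–(A.3b) are not reproduced in this extract; the following reconstruction is inferred from the way they are invoked in (A.4)–(A.6) below (with the convention \({\dot{\psi }}_{n+1}:=\psi _{n+1}-\psi _{n}\)) and should be checked against the published version:

```latex
% Reconstruction of the discrete derivative rules (A.3a)-(A.3b),
% inferred from their use in (A.4)-(A.6); not copied from the source.
\begin{aligned}
\big(\langle \phi_{n+1},\varphi_{n+1}\rangle_E\big)^{\displaystyle\cdot}
  &= \langle {\dot{\phi}}_{n+1}, \varphi_{n+1}\rangle_E
   + \langle \phi_{n}, {\dot{\varphi}}_{n+1}\rangle_E,
   &\quad \text{(A.3a)}\\
\big(\langle \phi_{n+1},\varphi_{n+1}\rangle_E\big)^{\displaystyle\cdot}
  &= \langle {\dot{\phi}}_{n+1}, \varphi_{n}\rangle_E
   + \langle \phi_{n+1}, {\dot{\varphi}}_{n+1}\rangle_E.
   &\quad \text{(A.3b)}
\end{aligned}
```

Both identities expand the same difference \(\langle \phi_{n+1},\varphi_{n+1}\rangle_E-\langle \phi_{n},\varphi_{n}\rangle_E\); (A.3a) is used with \(\phi =\varphi =x-q\) to obtain (A.4a), and (A.3b) with \(\phi =x-q\), \(\varphi ={\dot{x}}\) to obtain (A.6).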
The sequel of the proof can be divided into the following parts (1)–(4):
(1) Basic estimates. Setting \(P_n= \langle q-x_{n+1}, {\dot{x}}_{n+1}\rangle \), we establish the elementary but useful facts below:
- Let us prove (A.4a). From \( 2 b_{n+1}= \Vert x_{n+1}-q\Vert ^2\), by the derivative rule (A.3a) we get
$$\begin{aligned} \begin{array}{l} 2 {\dot{b}}_{n+1} =\langle {\dot{x}}_{n+1}, x_{n+1}-q \rangle + \langle x_{n}-q, {\dot{x}}_{n+1}\rangle \\ =\langle {\dot{x}}_{n+1}, x_{n+1}-q \rangle + \langle x_n -x_{n+1}, {\dot{x}}_{n+1}\rangle + \langle x_{n+1}-q, {\dot{x}}_{n+1}\rangle , \end{array} \end{aligned}$$namely \( 2 {\dot{b}}_{n+1} = -2 P_n - \Vert {\dot{x}}_{n+1}\Vert ^2\), which leads to (A.4a).
- Let us prove (A.4b). From \(a_n= \langle q-x_n, - {\dot{x}}_{n}\rangle \), we simply get
$$\begin{aligned} \begin{array}{l} a_n=\langle q-x_{n+1}, -{\dot{x}}_{n}\rangle + \langle {\dot{x}}_{n+1}, -{\dot{x}}_{n}\rangle =- \langle q-x_{n+1}, {\dot{x}}_{n}\rangle - \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle . \end{array}\nonumber \\ \end{aligned}$$(A.5)Moreover, from \(a_n:=\langle x_n-q, {\dot{x}}_{n}\rangle \), by the rule (A.3b) we readily get
$$\begin{aligned} {\dot{a}}_{n+1} = \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle + \langle x_{n+1}-q , {\dot{x}}_{n+1}-{\dot{x}}_{n}\rangle . \end{aligned}$$(A.6)In addition, the derivative rule (A.3b) yields
$$\begin{aligned} \nu _{n+1} a_{n+1} -\nu _n a_n= {\dot{\nu }}_{n+1} a_{n} + {\nu }_{n+1}{\dot{a}}_{n+1}. \end{aligned}$$This latter result, in light of (A.5) and (A.6), amounts to
$$\begin{aligned} \begin{array}{l} a_{n+1}\nu _{n+1} - a_n \nu _n \\ \quad = {\dot{\nu }}_{n+1} \bigg (- \langle q-x_{n+1}, {\dot{x}}_{n}\rangle - \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle \bigg )\\ \qquad + {\nu }_{n+1}\bigg ( \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle - P_n + \langle q-x_{n+1}, {\dot{x}}_{n}\rangle \bigg ) \\ \quad = \nu _n \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle - \nu _{n+1} P_n + \nu _{n} \langle q-x_{n+1}, {\dot{x}}_{n}\rangle . \end{array} \end{aligned}$$(A.7)Furthermore, (2.9) gives us \( {\dot{x}}_{n+1}+ d_{n} - \theta _n {\dot{x}}_{n}=0\). Taking the scalar product of each side of this equality by \(q-x_{n+1}\) yields
$$\begin{aligned} \theta _n\langle q-x_{n+1}, {\dot{x}}_{n}\rangle = P_n - \langle d_{n}, x_{n+1}-q \rangle . \end{aligned}$$So, recalling that \( \nu _{n} =(e+\nu _{n+1}) \theta _n\) (from (2.9)), we get
$$\begin{aligned} \nu _{n} \langle q-x_{n+1}, {\dot{x}}_{n}\rangle&=(e+\nu _{n+1}) (\theta _n\langle q-x_{n+1}, {\dot{x}}_{n}\rangle )\\&=(e+\nu _{n+1}) \left( P_n - \langle d_{n}, x_{n+1}-q \rangle \right) , \end{aligned}$$which, in light of (A.7), entails
$$\begin{aligned} \begin{array}{l} a_{n+1}\nu _{n+1} - a_n \nu _n \\ = \nu _n \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle - \nu _{n+1} P_n+ (e+\nu _{n+1}) \left( P_n - \langle d_{n}, x_{n+1}-q \rangle \right) \\ = \nu _n \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle + e P_n - (e+\nu _{n+1})\langle d_{n}, x_{n+1}-q \rangle , \end{array} \end{aligned}$$that is (A.4b).
(2) An estimate from the inertial part. Now, given \((s,q) \in [0,\infty ) \times {\mathcal {H}}\), we prove that the discrete derivative \( {\dot{G}}_{n+1}(s,q)\) satisfies
Indeed, in light of (A.2) and (A.4), we obtain
which leads obviously to the desired equality.
(3) An estimate from the proximal part. We prove that, for any \(\xi _n \ne 1\), it holds that
Indeed, we have \({\dot{x}}_{n+1}=\theta _n{\dot{x}}_n -d_n\) (from (2.9)), hence, for any \(\xi _n \ne 1\), and setting \(H_n= {\dot{x}}_{n+1}- (1-\xi _n)^{-1}\theta _n{\dot{x}}_n \), we have
or equivalently
Furthermore, by \(-d_n= {\dot{x}}_{n+1}- \theta _n{\dot{x}}_n\) (again using (2.9)) we simply obtain
Therefore, taking the scalar product of each side of (A.10) with \(d_n\), also adding \((1/2) \Vert d_n\Vert ^2\) to the resulting equality, and next using (A.11) we get
This, while noticing that \( \Vert d_n\Vert ^2= \Vert {\dot{x}}_{n+1}- \theta _n{\dot{x}}_{n}\Vert ^2\) (from (2.9)), yields (A.9).
(4) Combining proximal and inertial effects. We now show, for \(s \in \left( 0, e \right] \), that the iterates verify (for \(n \ge 0\))
Indeed, denoting \(\tau _n=e+\nu _{n+1}\), from (A.8) we know that
Furthermore, assuming that \(s \in \left( 0, e \right] \) and taking \(\xi _n=1- s \tau _n^{-1} \) in (A.9), while noticing that \(\tau _n \theta _n=\nu _n \) (from (2.9b)), we obtain
Then multiplying equality (A.14) by \(\tau _n^2\) and adding the resulting equality to (A.13) yields
where \(T_n\) is defined by
In addition, as \(\tau _n:=e+\nu _{n+1}\), a simple computation yields
As a consequence, by (A.16) we obtain
This ends the proof. \(\square \)
1.2 Proof of Proposition 2.
Let us first prove the result for \(b=1\). Consider the positive sequence \((k_n)_{n \ge 0}\) defined, for \(n \ge 1\) and for some constants \(\{a, c \} \subset [0,\infty )\), by the recursive formula \(\frac{k_{n}}{k_{n-1}}= 1 + \frac{a}{ n + c } \). As a consequence, from the basic inequalities \(a \ge [a]\) and \(c \le [c]+1\), we get
Hence, for \(n \ge 1\), we obtain
Moreover, for \(n \ge 3 + [a]\), we simply get
As a consequence, for \(n \ge 3 + [a]\), by (A.17) in light of (A.18) we get
where \(C_0:= k_0 \bigg ( \frac{ 2 + [c]+[a] }{ \prod _{j=2+ [c]}^{ 2 + [c]+[a] } j } \bigg )\), which implies that
The desired result follows immediately from the previous arguments when replacing a and c by \(\frac{a}{b}\) and \(\frac{c}{b}\), respectively. This completes the proof. \(\square \)
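Proposition 2 concerns sequences generated by the ratio \(k_n/k_{n-1}=1+\frac{a}{bn+c}\), whose polynomial growth of order \(n^{a/b}\) the proof quantifies. A quick numerical sanity check of this growth rate (illustrative only; the parameter values below are arbitrary):

```python
def k_sequence(a, b, c, n_max, k0=1.0):
    """Build (k_n) with k_n / k_{n-1} = 1 + a / (b*n + c) for n >= 1."""
    ks = [k0]
    for n in range(1, n_max + 1):
        ks.append(ks[-1] * (1.0 + a / (b * n + c)))
    return ks

a, b, c = 3.0, 1.0, 2.0            # arbitrary choice; here a/b = 3
ks = k_sequence(a, b, c, 20000)
r1 = ks[10000] / 10000 ** (a / b)  # k_n / n^(a/b) should stabilize
r2 = ks[20000] / 20000 ** (a / b)  # to a positive constant
```

For this particular choice the product telescopes, \(k_n=(n+3)(n+4)(n+5)/60\), so the ratio \(k_n/n^{3}\) tends to \(1/60\), in line with the \(n^{a/b}\) growth established by the proposition.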
1.3 Proof of Theorem 1
The proof will be divided into the following steps (a), (b) and (c):
(a) In order to establish (3.29a) and (3.29b), we first prove the following estimates
Indeed, passing to the limit as \(s \rightarrow 0^+\) in (2.8) amounts to
Concerning the second quantity in the right side of the above inequality, by \(b>0\), \(a \ge 0\) and \(a_2 >0\) (hence \(a+a_2 >0\)) we readily have
Then, in light of \(\sum _n n k_{n-1} |\langle A_{\lambda }(x_{n}), {\dot{x}}_{n+1}\rangle | < \infty \) (from (3.14d)), we infer that
It is then a classical matter to derive from (A.20) that there exists some \( l_2 \ge 0\) such that \( \lim _{n \rightarrow \infty } n ^2 \Vert {\dot{x}}_{n}\Vert ^2=l_2\). Moreover, recalling that \(\sum _n n \Vert {\dot{x}}_{n}\Vert ^2 < \infty \) (from (3.3f)), and noticing that \(\sum _n n^{-1}=\infty \), we infer that \( \liminf _{n \rightarrow \infty } n ^2 \Vert {\dot{x}}_{n}\Vert ^2=0\). It follows that \(l_2=0\), that is (A.19a).
Next combining the latter result with \(\Vert {\dot{x}}_{n}+ \lambda k_{n-1} A_{\lambda }(x_{n}) \Vert =o(n^{-1})\) (from (3.14b)) gives us immediately that \( \lim _{n \rightarrow \infty } n k_{n-1} \Vert A_{\lambda }(x_{n})\Vert =0\), that is (A.19b).
Then (3.29a) is given by (3.3f) and (A.19a), while (3.29b) follows from (3.14c) and (A.19b).
(b) The estimates in (3.29c) are derived from (3.3e) and (3.3g), while the estimate (3.29d) is given by (3.3d).
(c) Finally, we prove the weak convergence of the iterates \(\{x_{n}, z_n\}\) generated by CRIPA by means of Opial's lemma, which guarantees that \((x_n)\) converges weakly to some element of S provided that the following two conditions hold: (h1) for any \(q \in S\), the sequence \((\Vert x_n-q\Vert )\) is convergent; (h2) every weak cluster point of \((x_n)\) belongs to S.
- Let us prove (h1). Take \(q \in S\). As a straightforward consequence of the boundedness of \((x_n)\) (given by (3.3a)) along with (A.19b), we have
Moreover, we know from Lemma 3 that \(({{\mathcal {E}}}_n(a+a_2,q))\) is convergent, and it reads
Then, by \(n \Vert {\dot{x}}_{n}\Vert \rightarrow 0\) (from (A.19a)) and \(n k_{n-1} | \langle A_{\lambda }(x_{n}), x_{n}-q \rangle | \rightarrow 0\) (according to (1.21)) as \(n\rightarrow \infty \), we deduce that
This entails (h1).
- Now, we prove (h2). Let u be a weak cluster point of \((x_n)\): there exists a subsequence \((x_{n_k})\) that converges weakly to u as \(k \rightarrow \infty \). In view of (A.19b), we have \( \lim _{k \rightarrow \infty } \Vert A_{\lambda } (x_ {n_k})\Vert =0\), while a classical result gives \(A_{\lambda } (x_{n_k}) \in A (x_{n_k}-\lambda A_{\lambda } (x_{n_k}))\). Passing to the limit as \(k \rightarrow \infty \) in this latter result and recalling that the graph of a maximally monotone operator is demiclosed (see, for instance, [13]), we deduce that \(0 \in A(u)\), namely \( u \in S\). This proves (h2).
Then we infer by Opial’s lemma that \((x_n)\) converges weakly to some element \({\bar{x}} \in S\).
Finally, by \(\sum _n n \Vert k_{n-1} A_{\lambda } (x_{n})\Vert ^2 < \infty \) (from (3.14c)), we get \(k_{n-1} A_{\lambda } (x_{n}) \rightarrow 0\) (as \(n \rightarrow \infty \)), hence, by \( z_{n-1}-x_{n}=\lambda k_{n-1} A_{\lambda } (x_{n})\) (from (2.4a)), we obtain \( z_{n-1}-x_{n}\rightarrow 0\) (as \(n \rightarrow \infty \)). It follows that \((z_n)\) also converges weakly to \({\bar{x}}\). This completes the proof. \(\square \)
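The identity \(A_{\lambda }=(I-J_{\lambda A})/\lambda \), with \(J_{\lambda A}=(I+\lambda A)^{-1}\), which underlies the quantities \(\Vert A_{\lambda }(x_n)\Vert \) estimated above, can be checked on a toy maximally monotone operator. The sketch below (not part of the paper's experiments) uses the planar rotation, a skew linear operator whose unique zero is the origin:

```python
import numpy as np

lam = 0.5
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])                 # skew (hence monotone) rotation operator
J = np.linalg.inv(np.eye(2) + lam * A)      # resolvent J_{lam A} = (I + lam*A)^{-1}

def yosida(x):
    """Yosida regularization A_lam(x) = (x - J_{lam A} x) / lam."""
    return (x - J @ x) / lam

x = np.array([2.0, 1.0])
v = yosida(x)   # for linear A this equals A applied at J x
```

In particular \(A_{\lambda }({\bar{x}})=0\) at any zero \({\bar{x}}\) of A, which is why \(\Vert A_{\lambda }(x_n)\Vert \rightarrow 0\) serves as a residual measure for the inclusion \(0 \in A(x)\).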
Maingé, PE. Accelerated Proximal Algorithms with a Correction Term for Monotone Inclusions. Appl Math Optim 84 (Suppl 2), 2027–2061 (2021). https://doi.org/10.1007/s00245-021-09819-y