Accelerated Proximal Algorithms with a Correction Term for Monotone Inclusions

Abstract

In this paper, we propose an accelerated variant of the proximal point algorithm for computing a zero of an arbitrary maximally monotone operator A. The method incorporates an inertial term (based upon the acceleration techniques introduced by Nesterov), relaxation factors and a correction term. In a general Hilbert space setting, we obtain the weak convergence of the iterates \((x_n)\) to equilibria, with the fast rate \(\Vert x_{n+1}-x_{n}\Vert =o(n^{-1})\) for the discrete velocity. In particular, when constant proximal indexes are used, we establish that \((x_n)\) approximates a zero of A with the worst-case rate \(\Vert A_{\lambda }(x_n)\Vert =o(n^{-1})\), where \(A_{\lambda }\) denotes the Yosida regularization of A with index \(\lambda \). Furthermore, given any positive integer p and using an appropriate adjustment of increasing proximal indexes, we obtain the worst-case convergence rate \(\Vert A_{\lambda }(x_n)\Vert =o(n^{-(p+1)})\). This acceleration process applies to various proximal-type algorithms, such as the augmented Lagrangian method, the alternating direction method of multipliers and operator splitting methods. Our methodology relies on a Lyapunov analysis combined with a suitable reformulation of the considered algorithm. Numerical experiments are also performed.
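To make the abstract's main objects concrete, the following minimal sketch (ours, not the paper's code) computes the resolvent \(J_{\lambda A}=(I+\lambda A)^{-1}\), which realizes one proximal step, and the Yosida regularization \(A_{\lambda }=(I-J_{\lambda A})/\lambda \), whose norm along the iterates measures accuracy; the skew operator and the value of \(\lambda \) are illustrative assumptions.

```python
# A minimal sketch (not the paper's code): resolvent and Yosida
# regularization of the maximally monotone skew operator A x = R x,
# whose unique zero is the origin; R and lam are illustrative choices.
import numpy as np

R = np.array([[0.0, 1.0], [-1.0, 0.0]])  # skew-symmetric, hence monotone
lam = 0.5
J = np.linalg.inv(np.eye(2) + lam * R)   # resolvent J = (I + lam*A)^{-1}

def yosida(x):
    # A_lam(x) = (x - J x)/lam: single-valued, (1/lam)-Lipschitz,
    # with the same zeros as A
    return (x - J @ x) / lam

print(yosida(np.array([1.0, -2.0])))     # nonzero away from zer(A)
print(yosida(np.zeros(2)))               # [0. 0.] at the zero of A
```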

References

  1. Attouch, H.: Fast inertial proximal ADMM algorithms for convex structured optimization with linear constraint. Minimax Theory Appl. 6(1), 1–24 (2021)

  2. Attouch, H., László, S.C.: Newton-like inertial dynamics and proximal algorithms governed by maximally monotone operators. SIAM J. Optim. 30(4), 3252–3283 (2020)

  3. Attouch, H., László, S.C.: Continuous Newton-like inertial dynamics for monotone inclusions. Set-Valued Var. Anal. (2020). https://doi.org/10.1007/s11228-020-00564-y

  4. Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: First-order optimization algorithms via inertial systems with Hessian driven damping. arXiv preprint arXiv:1907.10536 (2019)

  5. Attouch, H., Peypouquet, J.: Convergence of inertial dynamics and proximal algorithms governed by maximal monotone operators. Math. Program. 174, 391–432 (2019)

  6. Attouch, H., Peypouquet, J.: Convergence Rate of Proximal Inertial Algorithms Associated with Moreau Envelopes of Convex Functions. In: Bauschke, H., Burachik, R., Luke, D. (eds.) Splitting Algorithms, Modern Operator Theory, and Applications. Springer, Cham (2019)

  7. Attouch, H., Soueycatt, M.: Augmented Lagrangian and proximal alternating direction methods of multipliers in Hilbert spaces. Applications to games, PDE’s and control. Pac. J. Optim. 5(1), 17–37 (2009)

  8. Attouch, H., Svaiter, B.F.: A continuous dynamical Newton-like approach to solving monotone inclusions. SIAM J. Control Optim. 49(2), 574–598 (2011)

  9. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, New York (2017)

  10. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)

  11. Boţ, R.I., Csetnek, E.R.: Second order forward-backward dynamical systems for monotone inclusion problems. SIAM J. Control Optim. 54, 1423–1443 (2016)

  12. Boţ, R.I., Csetnek, E.R.: ADMM for monotone operators: convergence analysis and rates. Adv. Comput. Math. 45(1), 327–359 (2019)

  13. Brezis, H.: Opérateurs maximaux monotones, Mathematical Studies, vol. 5. North-Holland, Amsterdam (1973)

  14. Brezis, H.: Functional Analysis, Sobolev Spaces and Partial Differential Equations. Springer, New York (2010). https://doi.org/10.1007/978-0-387-70914-7

  15. Brezis, H., Lions, P.L.: Produits infinis de résolvantes. Isr. J. Math. 29, 329–345 (1978)

  16. Chambolle, A., Dossal, C.: On the convergence of the iterates of Fista. J. Optim. Theory Appl. 166(3), 968–982 (2015)

  17. Combettes, P.L.: Monotone operator theory in convex optimization. Math. Program. Ser. B 170(1), 177–206 (2018)

  18. Corman, E., Yuan, X.: A generalized proximal point algorithm and its convergence rate. SIAM J. Optim. 24(4), 1614–1638 (2014)

  19. Douglas, J., Rachford, H.H.: On the numerical solution of heat conduction problems in two and three space variables. Trans. Am. Math. Soc. 82, 421–439 (1956)

  20. Drori, Y., Teboulle, M.: Performance of first-order methods for smooth convex minimization: a novel approach. Math. Program. 145(1–2), 451–482 (2014)

  21. Eckstein, J., Bertsekas, D.P.: On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55(1–3), 293–318 (1992)

  22. Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite-element approximations. Comput. Math. Appl. 2(1), 17–40 (1976)

  23. Gu, G., Yang, J.: Optimal nonergodic sublinear convergence rate of proximal point algorithm for maximal monotone inclusion problems. arXiv preprint arXiv:1904.05495 (2019)

  24. Güler, O.: On the convergence of the proximal point algorithm for convex minimization. SIAM J. Control Optim. 29, 403–419 (1991)

  25. Güler, O.: New proximal point algorithms for convex minimization. SIAM J. Optim. 2(4), 649–664 (1992)

  26. Hestenes, M.R.: Multiplier and gradient methods. J. Optim. Theory Appl. 4, 302–320 (1969)

  27. Kim, D.: Accelerated proximal point method for maximally monotone operators. arXiv preprint arXiv:1905.05149 (2019)

  28. Kim, D., Fessler, J.A.: Another look at the Fast Iterative Shrinkage/Thresholding Algorithm (FISTA). SIAM J. Optim. 28(1), 223–250 (2018)

  29. Kim, D., Fessler, J.A.: Generalizing the optimized gradient method for smooth convex minimization. SIAM J. Optim. 28(2), 1920–1950 (2018)

  30. Lemaire, B.: The proximal algorithm. In: Penot, J.P. (ed.) New Methods in Optimization and Their Industrial Uses, Internat. Ser. Numer. Math., vol. 87, pp. 73–87. Birkhäuser, Basel (1989)

  31. Lin, H., Mairal, J., Harchaoui, Z.: Catalyst acceleration for first-order convex optimization: from theory to practice. J. Mach. Learn. Res. 18(212), 1–54 (2018)

  32. Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979)

  33. Lorenz, D.A., Pock, T.: An inertial forward-backward algorithm for monotone inclusions. J. Math. Imaging Vis. 51, 311–325 (2015)

  35. Maingé, P.E.: First-order continuous Newton-like systems for monotone inclusions. SIAM J. Control Optim. 51(2), 1615–1638 (2013)

  36. Martinet, B.: Régularisation d'inéquations variationnelles par approximations successives. Rev. Fr. Inform. Rech. Opér. 4, 154–158 (1970)

  37. Nesterov, Y.: A method of solving a convex programming problem with convergence rate O(1/k²). Sov. Math. Dokl. 27, 372–376 (1983)

  38. Nesterov, Y.: A method for unconstrained convex minimization problem with the rate of convergence O(1/k²). Dokl. Akad. Nauk SSSR 269(3), 543–547 (1983)

  39. Nesterov, Y.: Gradient methods for minimizing composite objective function. Math. Program. Ser. B 140, 125–161 (2013)

  40. Peaceman, D.W., Rachford, H.H.: The numerical solution of parabolic and elliptic differential equations. J. Soc. Ind. Appl. Math. 3(1), 28–41 (1955)

  41. Powell, M.J.D.: A method for nonlinear constraints in minimization problems. In: Fletcher, R. (ed.) Optimization, pp. 283–298. Academic Press, New York (1969)

  42. Rockafellar, R.T.: Monotone operators associated with saddle functions and minimax problems. In: Browder, F.E. (ed.) Nonlinear Functional Analysis, Part 1, Symposia in Pure Math., vol. 18, pp. 397–407. American Mathematical Society, Providence, RI (1970)

  43. Rockafellar, R.T.: Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Math. Oper. Res. 1(2), 97–116 (1976)

  44. Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14(5), 877–898 (1976)

  45. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)

  46. Takahashi, W.: Nonlinear Functional Analysis. Yokohama Publishers, Yokohama (2000)

  47. Zhang, X.Q., Burger, M., Bresson, X., Osher, S.: Bregmanized nonlocal regularization for deconvolution and sparse reconstruction. SIAM J. Imaging Sci. 3(3), 253–276 (2010)

Author information

Corresponding author

Correspondence to Paul-Emile Maingé.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

APPENDIX.

1.1 Proof of Proposition 1.

To establish this result, we compute the discrete derivative \({\dot{G}}_{n+1}(s,q):={G}_{n+1}(s,q)-{G}_{n}(s,q)\). For the convenience of the reader, we recall from (2.10) that \(G_n(s,q)\) is given by

$$\begin{aligned} G_n(s,q) = \frac{1}{2} \Vert s (q-x_{n})- \nu _n {\dot{x}}_{n}\Vert ^2+ \frac{1}{2} s(e -s) \Vert x_{n}-q\Vert ^2. \end{aligned}$$
(A.1)

It is then readily noticed that \({\dot{G}}_{n+1}(s,q)\) can be formulated as

$$\begin{aligned} \begin{array}{l} {\dot{G}}_{n+1}(s,q) = s (\nu _{n+1} {a}_{n+1}- \nu _{n} a_n) + s e {\dot{b}}_{n+1} + \nu ^2_{n+1}{c}_{n+1}- \nu _{n}^2{c}_{n} , \end{array} \end{aligned}$$
(A.2)

where

$$\begin{aligned} a_n:=\langle x_n-q, {\dot{x}}_{n}\rangle , \quad b_n := \frac{1}{2} \Vert x_n-q\Vert ^2 \quad \hbox {and} \quad c_n :=\frac{1}{2} \Vert {\dot{x}}_{n}\Vert ^2. \end{aligned}$$

Note also that, for any symmetric bilinear form \(\langle .,.\rangle _E\) on a real vector space E and any sequences \(\{ \phi _{n}\}, \{ \varphi _{n} \} \subset E\), we have the discrete derivative rules:

$$\begin{aligned}&\langle \phi _{n+1} , \varphi _{n+1} \rangle _E- \langle \phi _{n} , \varphi _{n} \rangle _E=\langle {\dot{\phi }}_{n+1}, \varphi _{n+1} \rangle _E + \langle \phi _{n} , {\dot{\varphi }}_{n+1} \rangle _E, \end{aligned}$$
(A.3a)
$$\begin{aligned}&\langle \phi _{n+1} , \varphi _{n+1} \rangle _E - \langle \phi _{n} , \varphi _{n} \rangle _E=\langle {\dot{\phi }}_{n+1}, \varphi _{n} \rangle _E + \langle \phi _{n+1} , {\dot{\varphi }}_{n+1} \rangle _E. \end{aligned}$$
(A.3b)
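These one-step rules follow by expanding the inner products; with the Euclidean inner product on \({\mathbb {R}}^d\) they can also be checked mechanically on random data, as in the sketch below (ours, for verification only).

```python
# Numerical sanity check of the discrete derivative rules (A.3a)-(A.3b)
# with the Euclidean inner product; not code from the paper.
import numpy as np

rng = np.random.default_rng(0)
phi0, phi1, var0, var1 = (rng.standard_normal(5) for _ in range(4))
dphi, dvar = phi1 - phi0, var1 - var0          # discrete derivatives

lhs = phi1 @ var1 - phi0 @ var0
assert np.isclose(lhs, dphi @ var1 + phi0 @ dvar)   # (A.3a)
assert np.isclose(lhs, dphi @ var0 + phi1 @ dvar)   # (A.3b)
```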

The remainder of the proof is divided into the following parts (1)–(4):

(1) Basic estimates. Setting \(P_n= \langle q-x_{n+1}, {\dot{x}}_{n+1}\rangle \), we establish the elementary but useful facts below:

$$\begin{aligned}&{\dot{b}}_{n+1} = -P_n - \frac{1}{2} \Vert {\dot{x}}_{n+1}\Vert ^2, \end{aligned}$$
(A.4a)
$$\begin{aligned}&\nu _{n+1} a_{n+1} -\nu _n a_n = \nu _n \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle + e P_n - (e+ \nu _{n+1}) \langle d_{n}, x_{n+1}-q \rangle . \end{aligned}$$
(A.4b)
  • Let us prove (A.4a). From \( 2 b_{n+1}= \Vert x_{n+1}-q\Vert ^2\), by the derivative rule (A.3a) we get

    $$\begin{aligned} \begin{array}{l} 2 {\dot{b}}_{n+1} =\langle {\dot{x}}_{n+1}, x_{n+1}-q \rangle + \langle x_{n}-q, {\dot{x}}_{n+1}\rangle \\ =\langle {\dot{x}}_{n+1}, x_{n+1}-q \rangle + \langle x_n -x_{n+1}, {\dot{x}}_{n+1}\rangle + \langle x_{n+1}-q, {\dot{x}}_{n+1}\rangle , \end{array} \end{aligned}$$

    namely \( 2 {\dot{b}}_{n+1} = -2 P_n - \Vert {\dot{x}}_{n+1}\Vert ^2\), which leads to (A.4a).

  • Let us prove (A.4b). From \(a_n= \langle q-x_n, - {\dot{x}}_{n}\rangle \), we simply get

$$\begin{aligned} a_n=\langle q-x_{n+1}, -{\dot{x}}_{n}\rangle + \langle {\dot{x}}_{n+1}, -{\dot{x}}_{n}\rangle =- \langle q-x_{n+1}, {\dot{x}}_{n}\rangle - \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle . \end{aligned}$$
    (A.5)

    Moreover, from \(a_n:=\langle x_n-q, {\dot{x}}_{n}\rangle \), by the rule (A.3b) we readily get

    $$\begin{aligned} {\dot{a}}_{n+1} = \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle + \langle x_{n+1}-q , {\dot{x}}_{n+1}-{\dot{x}}_{n}\rangle . \end{aligned}$$
    (A.6)

    In addition, the derivative rule (A.3b) yields

    $$\begin{aligned} \nu _{n+1} a_{n+1} -\nu _n a_n= {\dot{\nu }}_{n+1} a_{n} + {\nu }_{n+1}{\dot{a}}_{n+1}. \end{aligned}$$

    This latter result, in light of (A.5) and (A.6), amounts to

    $$\begin{aligned} \begin{array}{l} a_{n+1}\nu _{n+1} - a_n \nu _n \\ \quad = {\dot{\nu }}_{n+1} \bigg (- \langle q-x_{n+1}, {\dot{x}}_{n}\rangle - \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle \bigg )\\ \qquad + {\nu }_{n+1}\bigg ( \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle - P_n + \langle q-x_{n+1}, {\dot{x}}_{n}\rangle \bigg ) \\ \quad = \nu _n \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle - \nu _{n+1} P_n + \nu _{n} \langle q-x_{n+1}, {\dot{x}}_{n}\rangle . \end{array} \end{aligned}$$
    (A.7)

Furthermore, (2.9) gives us \( {\dot{x}}_{n+1}+ d_{n} - \theta _n {\dot{x}}_{n}=0\). Taking the scalar product of each side of this equality with \(q-x_{n+1}\) yields

$$\begin{aligned} \theta _n\langle q-x_{n+1}, {\dot{x}}_{n}\rangle = P_n - \langle d_{n}, x_{n+1}-q \rangle . \end{aligned}$$

    So, recalling that \( \nu _{n} =(e+\nu _{n+1}) \theta _n\) (from (2.9)), we get

    $$\begin{aligned} \nu _{n} \langle q-x_{n+1}, {\dot{x}}_{n}\rangle&=(e+\nu _{n+1}) (\theta _n\langle q-x_{n+1}, {\dot{x}}_{n}\rangle )\\&=(e+\nu _{n+1}) \left( P_n - \langle d_{n}, x_{n+1}-q \rangle \right) , \end{aligned}$$

    which, in light of (A.7), entails

    $$\begin{aligned} \begin{array}{l} a_{n+1}\nu _{n+1} - a_n \nu _n \\ = \nu _n \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle - \nu _{n+1} P_n+ (e+\nu _{n+1}) \left( P_n - \langle d_{n}, x_{n+1}-q \rangle \right) \\ = \nu _n \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle + e P_n - (e+\nu _{n+1})\langle d_{n}, x_{n+1}-q \rangle , \end{array} \end{aligned}$$

    that is (A.4b).
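Both identities can be double-checked on random data once the relations \({\dot{x}}_{n+1}=\theta _n {\dot{x}}_n - d_n\) and \(\nu _n=(e+\nu _{n+1})\theta _n\) from (2.9) are imposed; a verification sketch (with arbitrary positive parameter values) follows.

```python
# Sanity check of (A.4a)-(A.4b) on random data, under the scheme
# relations of (2.9); e, nu_{n+1}, theta_n are arbitrary positive values.
import numpy as np

rng = np.random.default_rng(1)
e, nu1, theta = 2.0, 3.0, 0.7                  # e, nu_{n+1}, theta_n
nu = (e + nu1) * theta                         # nu_n = (e + nu_{n+1})*theta_n
q, xn, dxn, dn = (rng.standard_normal(4) for _ in range(4))
xn1 = xn + theta * dxn - dn                    # x_{n+1}, from (2.9)
dxn1 = xn1 - xn                                # dot{x}_{n+1}
P = (q - xn1) @ dxn1                           # P_n

db = 0.5 * np.sum((xn1 - q)**2) - 0.5 * np.sum((xn - q)**2)
assert np.isclose(db, -P - 0.5 * (dxn1 @ dxn1))                  # (A.4a)
lhs = nu1 * ((xn1 - q) @ dxn1) - nu * ((xn - q) @ dxn)
rhs = nu * (dxn1 @ dxn) + e * P - (e + nu1) * (dn @ (xn1 - q))
assert np.isclose(lhs, rhs)                                      # (A.4b)
```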

(2) An estimate from the inertial part. Now, given \((s,q) \in [0,\infty ) \times {\mathcal {H}}\), we prove that the discrete derivative \( {\dot{G}}_{n+1}(s,q)\) satisfies

$$\begin{aligned} \begin{array}{l} {\dot{G}}_{n+1}(s,q) + s (e+ \nu _{n+1}) \langle d_{n}, x_{n+1}-q \rangle \\ = s \nu _n \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle - \frac{1}{2} \left( s e - \nu _{n+1}^2 \right) \Vert {\dot{x}}_{n+1}\Vert ^2 - \frac{1}{2} \nu _n ^2 \Vert {\dot{x}}_{n}\Vert ^2. \end{array} \end{aligned}$$
(A.8)

Indeed, in light of (A.2) and (A.4), we obtain

$$\begin{aligned} \begin{array}{l} {\dot{G}}_{n+1} = s \bigg ( \nu _n \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle +e P_n - (e+ \nu _{n+1}) \langle d_{n}, x_{n+1}-q \rangle \bigg ) \\ \qquad + se \bigg ( -P_n - \frac{1}{2}\Vert {\dot{x}}_{n+1}\Vert ^2 \bigg ) + \frac{1}{2} \bigg ( \nu _{n+1}^2 \Vert {\dot{x}}_{n+1}\Vert ^2- \nu _{n}^2 \Vert {\dot{x}}_{n}\Vert ^2 \bigg ) \\ \quad = s \nu _n \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle + \frac{1}{2} \left( \nu _{n+1}^2 - se \right) \Vert {\dot{x}}_{n+1}\Vert ^2 - \frac{1}{2} \nu _n ^2 \Vert {\dot{x}}_{n}\Vert ^2\\ \qquad -s (e+ \nu _{n+1}) \langle d_{n}, x_{n+1}-q \rangle , \end{array} \end{aligned}$$

which obviously leads to the desired equality (A.8).
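The same random-data device confirms (A.8), with \(G_n\) evaluated directly from (A.1); again a verification sketch, not code from the paper.

```python
# Check of (A.8): G_n(s,q) taken from (A.1), scheme relations from (2.9),
# s an arbitrary positive value.
import numpy as np

rng = np.random.default_rng(2)
e, nu1, theta, s = 2.0, 3.0, 0.7, 1.5
nu = (e + nu1) * theta
q, xn, dxn, dn = (rng.standard_normal(3) for _ in range(4))
xn1 = xn + theta * dxn - dn
dxn1 = xn1 - xn

def G(x, dx, nu_):                             # G_n(s, q) as in (A.1)
    return (0.5 * np.sum((s * (q - x) - nu_ * dx)**2)
            + 0.5 * s * (e - s) * np.sum((x - q)**2))

lhs = G(xn1, dxn1, nu1) - G(xn, dxn, nu) + s * (e + nu1) * (dn @ (xn1 - q))
rhs = (s * nu * (dxn1 @ dxn) - 0.5 * (s * e - nu1**2) * np.sum(dxn1**2)
       - 0.5 * nu**2 * np.sum(dxn**2))
assert np.isclose(lhs, rhs)
```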

(3) An estimate from the proximal part. We prove that, for any \(\xi _n \ne 1\), it holds that

$$\begin{aligned} \begin{array}{l} \xi _n \langle d_n,{\dot{x}}_{n+1}\rangle + \frac{ 1 }{2} \Vert {\dot{x}}_{n+1}- \theta _n{\dot{x}}_n \Vert ^2 \\ = - \theta _n(1-\xi _n)\langle {\dot{x}}_n,{\dot{x}}_{n+1}\rangle + \frac{1}{2} \theta _n^2 \Vert {\dot{x}}_n \Vert ^2 - \left( \xi _n- \frac{1}{2} \right) \Vert {\dot{x}}_{n+1}\Vert ^2. \end{array} \end{aligned}$$
(A.9)

Indeed, by (2.9) we have \({\dot{x}}_{n+1}=\theta _n{\dot{x}}_n -d_n\); hence, for any \(\xi _n \ne 1\), setting \(H_n= {\dot{x}}_{n+1}- (1-\xi _n)^{-1}\theta _n{\dot{x}}_n \), we have

$$\begin{aligned} (1-\xi _n)H_n= (1-\xi _n){\dot{x}}_{n+1}- \theta _n{\dot{x}}_n = -d_n- \xi _n {\dot{x}}_{n+1}, \end{aligned}$$

or equivalently

$$\begin{aligned} \begin{array}{l} \xi _n {\dot{x}}_{n+1}= -(1-\xi _n)H_n - d_n . \end{array} \end{aligned}$$
(A.10)

Furthermore, by \(-d_n= {\dot{x}}_{n+1}- \theta _n{\dot{x}}_n\) (again using (2.9)) we simply obtain

$$\begin{aligned} \begin{array}{l} \langle (-d_n), H_n \rangle = \langle {\dot{x}}_{n+1}- \theta _n{\dot{x}}_{n}, {\dot{x}}_{n+1}- (1-\xi _n)^{-1}\theta _n{\dot{x}}_{n}\rangle \\ = \Vert {\dot{x}}_{n+1}\Vert ^2 + (1-\xi _n)^{-1}\theta _n^2 \Vert {\dot{x}}_{n}\Vert ^2 - \frac{2-\xi _n}{(1-\xi _n)}\theta _n\langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle . \end{array} \end{aligned}$$
(A.11)

Therefore, taking the scalar product of each side of (A.10) with \(d_n\), adding \((1/2) \Vert d_n\Vert ^2\) to the resulting equality, and then using (A.11), we get

$$\begin{aligned} \begin{array}{l} \xi _n \langle d_n , {\dot{x}}_{n+1}\rangle + \frac{1}{2} \Vert d_n\Vert ^2 = (1-\xi _n)\langle (-d_n), H_n \rangle - \frac{1}{2} \Vert d_n\Vert ^2 \\ = (1-\xi _n)\left( \Vert {\dot{x}}_{n+1}\Vert ^2 + \frac{ \theta _n^2 }{(1-\xi _n)} \Vert {\dot{x}}_{n}\Vert ^2 - \frac{2-\xi _n}{(1-\xi _n)} \theta _n\langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle \right) \\ \;\;- \frac{1}{2}\left( \Vert {\dot{x}}_{n+1}\Vert ^2 + \theta _n^2 \Vert {\dot{x}}_{n}\Vert ^2 - 2 \theta _n\langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle \right) \\ = -(1-\xi _n) \theta _n\langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle + \frac{1}{2}\theta _n^2 \Vert {\dot{x}}_{n}\Vert ^2 + \left( \frac{1}{2} -\xi _n \right) \Vert {\dot{x}}_{n+1}\Vert ^2 . \end{array} \end{aligned}$$

This, combined with \( \Vert d_n\Vert ^2= \Vert {\dot{x}}_{n+1}- \theta _n{\dot{x}}_{n}\Vert ^2\) (from (2.9)), yields (A.9).
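Identity (A.9) admits the same kind of check, for arbitrary \(\theta _n\) and \(\xi _n \ne 1\); a sketch:

```python
# Check of (A.9), using d_n = theta_n*dot{x}_n - dot{x}_{n+1} from (2.9);
# theta and xi are arbitrary illustrative values.
import numpy as np

rng = np.random.default_rng(3)
theta, xi = 0.8, 0.4
dxn, dxn1 = rng.standard_normal(3), rng.standard_normal(3)
dn = theta * dxn - dxn1

lhs = xi * (dn @ dxn1) + 0.5 * np.sum((dxn1 - theta * dxn)**2)
rhs = (-theta * (1 - xi) * (dxn @ dxn1)
       + 0.5 * theta**2 * np.sum(dxn**2) - (xi - 0.5) * np.sum(dxn1**2))
assert np.isclose(lhs, rhs)
```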

(4) Combining proximal and inertial effects. We now show, for \(s \in \left( 0, e \right] \), that the iterates satisfy (for \(n \ge 0\))

$$\begin{aligned} \begin{array}{l} {\dot{G}}_{n+1}(s,q) + \frac{1}{2} (e+\nu _{n+1})^2 \Vert {\dot{x}}_{n+1}- \theta _n{\dot{x}}_{n}\Vert ^2 \\ + s (e+\nu _{n+1}) \langle d_n, x_{n+1}-q \rangle + (e+\nu _{n+1}) (e-s+\nu _{n+1}) \langle d_n,{\dot{x}}_{n+1}\rangle \\ = -\frac{1}{2} \left( e -s \right) \left( e + 2 \nu _{n+1}\right) \Vert {\dot{x}}_{n+1}\Vert ^2 . \end{array} \end{aligned}$$
(A.12)

Indeed, denoting \(\tau _n=e+\nu _{n+1}\), from (A.8) we know that

$$\begin{aligned} \begin{array}{l} {\dot{G}}_{n+1}(s,q) + s \tau _n \langle d_n, x_{n+1}-q \rangle \\ = s \nu _n \langle {\dot{x}}_{n+1}, {\dot{x}}_{n}\rangle - \frac{1}{2} \left( s e - \nu _{n+1}^2 \right) \Vert {\dot{x}}_{n+1}\Vert ^2 - \frac{1}{2} \nu _n ^2 \Vert {\dot{x}}_{n}\Vert ^2. \end{array} \end{aligned}$$
(A.13)

Furthermore, assuming that \(s \in \left( 0, e \right] \) and taking \(\xi _n=1- s \tau _n^{-1} \) in (A.9), while noticing that \(\tau _n \theta _n=\nu _n \) (from (2.9b)), we obtain

$$\begin{aligned} \begin{array}{l} (1- s \tau _n^{-1} ) \langle d_n ,{\dot{x}}_{n+1}\rangle + \frac{ 1 }{2} \Vert {\dot{x}}_{n+1}- \theta _n{\dot{x}}_{n}\Vert ^2 \\ = -s\theta _n\tau _n^{-1} \langle {\dot{x}}_{n}, {\dot{x}}_{n+1}\rangle + \frac{1}{2} \theta _n^2 \Vert {\dot{x}}_{n}\Vert ^2 - \left( \frac{1}{2} -s \tau _n^{-1} \right) \Vert {\dot{x}}_{n+1}\Vert ^2 \\ = -s\nu _n \tau _n^{-2} \langle {\dot{x}}_{n}, {\dot{x}}_{n+1}\rangle + \frac{1}{2} \nu _n^2 \tau _n^{-2} \Vert {\dot{x}}_{n}\Vert ^2 - \left( \frac{1}{2} -s \tau _n^{-1} \right) \Vert {\dot{x}}_{n+1}\Vert ^2 . \end{array} \end{aligned}$$
(A.14)

Then multiplying equality (A.14) by \(\tau _n^2\) and adding the resulting equality to (A.13) yields

$$\begin{aligned} \begin{array}{l} (1- s \tau _n^{-1} ) \tau _n^2 \langle d_n ,{\dot{x}}_{n+1}\rangle + \frac{1}{2} \tau _n^2 \Vert {\dot{x}}_{n+1}- \theta _n{\dot{x}}_{n}\Vert ^2 \\ \;\;+\; {\dot{G}}_{n+1}(s,q) + s \tau _n \langle d_n , x_{n+1}-q \rangle \\ = \bigg ( - \frac{1}{2} \left( s e - \nu _{n+1}^2 \right) - \tau _n^2 \left( \frac{1}{2} -s \tau _n^{-1} \right) \bigg ) \Vert {\dot{x}}_{n+1}\Vert ^2 = -T_n, \end{array} \end{aligned}$$
(A.15)

where \(T_n\) is defined by

$$\begin{aligned} T_n = \frac{1}{2} \bigg ( se - \nu _{n+1}^2 + \tau _n^2 \left( 1- 2s\tau _n ^{-1} \right) \bigg ) \Vert {\dot{x}}_{n+1}\Vert ^2. \end{aligned}$$
(A.16)

In addition, as \(\tau _n:=e+\nu _{n+1}\), a simple computation yields

$$\begin{aligned} \begin{array}{l} \tau _n^2 \left( 1 -2 s \tau _n^{-1} \right) = e^2+ 2 e \nu _{n+1} + ( \nu _{n+1})^2 - 2s \left( e+ \nu _{n+1} \right) \\ = e \left( e -s \right) - s e + 2 \nu _{n+1} \left( e -s \right) + ( \nu _{n+1} )^2 \\ = \left( e + 2 \nu _{n+1}\right) \left( e -s \right) - s e + ( \nu _{n+1})^2. \end{array} \end{aligned}$$

As a consequence, by (A.16) we obtain

$$\begin{aligned} T_{n}= \frac{\left( e -s \right) }{2} \left( e + 2 \nu _{n+1}\right) \Vert {\dot{x}}_{n+1}\Vert ^2. \end{aligned}$$

This ends the proof. \(\square \)
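The closing computation can also be confirmed symbolically; a sketch assuming sympy is available:

```python
# Symbolic confirmation that se - nu^2 + tau^2*(1 - 2s/tau) = (e-s)(e+2nu)
# with tau = e + nu, which turns (A.16) into the final form of T_n.
import sympy as sp

s, e, nu = sp.symbols('s e nu', positive=True)
tau = e + nu
expr = s * e - nu**2 + tau**2 * (1 - 2 * s / tau)
assert sp.simplify(expr - (e - s) * (e + 2 * nu)) == 0
```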

1.2 Proof of Proposition 2.

Let us first prove the result for \(b=1\). Consider the positive sequence \((k_n)_{n \ge 0}\) defined, for \(n \ge 1\) and for some constants \(\{a, c \} \subset [0,\infty )\), by the recursive formula \(\frac{k_{n}}{k_{n-1}}= 1 + \frac{a}{ n + c } \). Then, from the basic inequalities \(a \ge [a]\) and \(c \le [c]+1\), we get

$$\begin{aligned} \begin{array}{l} \frac{k_{n}}{k_{n-1}} \ge 1 + \frac{[a]}{ n + [c]+1 } = \frac{ n + [c]+[a]+1}{ n + [c]+1 }\end{array}. \end{aligned}$$

Hence, for \(n \ge 1\), we obtain

$$\begin{aligned} \begin{array}{l} \frac{k_n}{k_0} \ge \prod _{j=1}^n \frac{ j + [c]+[a]+1}{ j + [c]+1 } = \frac{ \prod _{j=1}^{n} (j + [c]+[a]+1)}{ \prod _{j=1}^{n} (j + [c]+1) } = \frac{ \prod _{j=2 + [c]+[a]}^{n + [c]+1+[a]} j }{ \prod _{j=2+ [c]}^{n+ [c]+1} j }. \end{array} \end{aligned}$$
(A.17)

Moreover, for \(n \ge 3 + [a]\), we simply get

$$\begin{aligned}&\hbox { }\ \prod _{j=2 + [c]+[a]}^{n + [c]+1+[a]} j = (2 + [c]+[a]) \bigg ( \prod _{j=3 + [c]+[a]}^{n + [c]} j \bigg ) \bigg (\prod _{j=n+ [c]+1 }^{n + [c]+1+[a]} j \bigg ), \end{aligned}$$
(A.18a)
$$\begin{aligned}&\prod _{j=2+ [c]}^{n+ [c]+1} j = \bigg ( \prod _{j=2+ [c]}^{ 2 + [c]+[a] } j \bigg ) \bigg ( \prod _{j=3+ [c]+[a]}^{ n+ [c] } j \bigg ) (n+ [c]+1). \end{aligned}$$
(A.18b)

As a consequence, for \(n \ge 3 + [a]\), by (A.17) in light of (A.18) we get

$$\begin{aligned} \begin{array}{l} k_n \ge k_0 \frac{ (2 + [c]+[a]) \big (\prod _{j=n+ [c]+1 }^{n + [c]+1+[a]} j \big ) }{ \big ( \prod _{j=2+ [c]}^{ 2 + [c]+[a] } j \big ) (n+ [c]+1) }= C_0 \frac{ \prod _{j=n+ [c]+1 }^{n + [c]+1+[a]} j }{ n+ [c]+1 }= C_0 \frac{ \prod _{j= [c]+1 }^{ [c]+1+[a]} (n+j) }{ n+ [c]+1 } , \end{array} \end{aligned}$$

where \(C_0:= k_0 \bigg ( \frac{ 2 + [c]+[a] }{ \prod _{j=2+ [c]}^{ 2 + [c]+[a] } j } \bigg )\), which implies that

$$\begin{aligned} \begin{array}{l} k_n \ge C_0 (n+ [c]+1)^{[a]} \ge C_0 n^{[a]} . \end{array} \end{aligned}$$

The desired result for general \(b>0\) follows immediately from the previous arguments upon replacing a and c with \(\frac{a}{b}\) and \(\frac{c}{b}\), respectively. This completes the proof. \(\square \)
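The growth estimate of Proposition 2 is easy to observe numerically; in the sketch below (arbitrary values of a, c and \(k_0\)), the ratio \(k_n/n^{[a]}\) stays bounded away from zero.

```python
# Illustration of Proposition 2: for k_n/k_{n-1} = 1 + a/(n + c), the
# sequence grows at least like n^{[a]}; a, c, k_0 are arbitrary choices.
import math

a, c, k = 2.7, 1.3, 1.0                        # a, c >= 0 and k_0 = 1
ratio_tail = []
for n in range(1, 10001):
    k *= 1.0 + a / (n + c)
    if n >= 100:
        ratio_tail.append(k / n**math.floor(a))
print(min(ratio_tail))                         # bounded away from 0
```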

1.3 Proof of Theorem 1

The proof will be divided into the following steps (a), (b) and (c):

(a) In order to establish (3.29a) and (3.29b), we first prove the following estimates

$$\begin{aligned}&\Vert {\dot{x}}_{n}\Vert =o(n^{-1}), \end{aligned}$$
(A.19a)
$$\begin{aligned}&k_{n-1}\Vert A_{\lambda }(x_{n})\Vert =o(n^{-1}). \end{aligned}$$
(A.19b)

Indeed, passing to the limit as \(s \rightarrow 0^+\) in (2.8) yields

$$\begin{aligned} \begin{aligned}&(b (n+1) +{\bar{c}}-a_1)^2 \Vert {\dot{x}}_{n+1}\Vert ^2 - ( b n +{\bar{c}}-a_1)^2 \Vert {\dot{x}}_{n}\Vert ^2 \\&+\; \lambda k_{n-1} (b n+{\bar{c}}) \left( a \frac{bn+{\bar{c}}}{bn+c}+ a_2 \right) \langle A_{\lambda }(x_{n}) , {\dot{x}}_{n+1}\rangle \\&+\; \frac{1}{2} (a_1-b)\big ( (2n+1)b +2{\bar{c}}-a_1 \big ) \Vert {\dot{x}}_{n+1}\Vert ^2 \le 0. \end{aligned} \end{aligned}$$
(A.20)

Concerning the second quantity on the left-hand side of the above inequality, by \(b>0\), \(a \ge 0\) and \(a_2 >0\) (hence \(a+a_2 >0\)) we readily have

$$\begin{aligned} (bn+{\bar{c}}) \left( a \frac{bn+{\bar{c}}}{bn+c}+ a_2 \right) \sim b (a+a_2)n \hbox { as }n \rightarrow \infty . \end{aligned}$$

Then, in light of \(\sum _n n k_{n-1} |\langle A_{\lambda }(x_{n}), {\dot{x}}_{n+1}\rangle | < \infty \) (from (3.14d)), we infer that

$$\begin{aligned} \sum _{n} (bn+{\bar{c}}) \left( a \frac{bn+{\bar{c}}}{bn+c}+ a_2 \right) k_{n-1}| \langle A_{\lambda }(x_{n}) , {\dot{x}}_{n+1}\rangle | < \infty . \end{aligned}$$

It is then a classical matter to derive from (A.20) that there exists some \( l_2 \ge 0\) such that \( \lim _{n \rightarrow \infty } n ^2 \Vert {\dot{x}}_{n}\Vert ^2=l_2\). Moreover, recalling that \(\sum _n n \Vert {\dot{x}}_{n}\Vert ^2 < \infty \) (from (3.3f)), and noticing that \(\sum _n n^{-1}=\infty \), we infer that \( \liminf _{n \rightarrow \infty } n ^2 \Vert {\dot{x}}_{n}\Vert ^2=0\); indeed, were this lower limit positive, we would have \(n \Vert {\dot{x}}_{n}\Vert ^2 \ge C n^{-1}\) for some \(C>0\) and all large n, contradicting the summability of \((n \Vert {\dot{x}}_{n}\Vert ^2)\). It follows that \(l_2=0\), that is (A.19a).

Next combining the latter result with \(\Vert {\dot{x}}_{n}+ \lambda k_{n-1} A_{\lambda }(x_{n}) \Vert =o(n^{-1})\) (from (3.14b)) gives us immediately that \( \lim _{n \rightarrow \infty } n k_{n-1} \Vert A_{\lambda }(x_{n})\Vert =0\), that is (A.19b).

Then (3.29a) is given by (3.3f) and (A.19a), while (3.29b) follows from (3.14c) and (A.19b).

(b) The estimates in (3.29c) are derived from (3.3e) and (3.3g), while the estimate (3.29d) is given by (3.3d).

(c) Finally, we prove the weak convergence of the iterates \(\{x_{n}, z_n\}\) given by CRIPA by means of Opial's lemma, which guarantees that \((x_n)\) converges weakly to some element of S provided that the following conditions (h1)–(h2) hold: (h1) for any \(q \in S\), the sequence (\(\Vert x_n-q\Vert \)) is convergent; (h2) any weak cluster point of (\(x_n\)) belongs to S.

- Let us prove (h1). Take \(q \in S\). Clearly, as a straightforward consequence of the boundedness of \((x_n)\) (given by (3.3a)) along with (A.19b), we have

$$\begin{aligned} k_{n-1}\langle A_{\lambda }(x_{n}), x_{n}-q \rangle =o(n^{-1}). \end{aligned}$$
(A.21)

Moreover, we know that (\({{\mathcal {E}}}_n(a+a_2,q)\)) is convergent (from Lemma 3) and can be written as

$$\begin{aligned} \begin{aligned} {{\mathcal {E}}}_n(a+a_2 ,q)&= \frac{1}{2} \Vert (a+a_2) (q-x_{n})- (b n +{\bar{c}}-a_1) {\dot{x}}_{n}\Vert ^2 \\&\quad +\; \frac{1}{2} (a+a_2) \big ( a_1-b-(a+a_2) \big ) \Vert x_{n}-q\Vert ^2 \\&\quad +\; (a+a_2) ( b (n-1)+ {\bar{c}}) ( \lambda k_{n-1}) \langle A_{\lambda }(x_{n}) , x_{n}-q \rangle . \end{aligned} \end{aligned}$$
(A.22)

Then, by \(n \Vert {\dot{x}}_{n}\Vert \rightarrow 0\) (from (A.19a)) and \(n k_{n-1} | \langle A_{\lambda }(x_{n}), x_{n}-q \rangle | \rightarrow 0\) (according to (A.21)) as \(n\rightarrow \infty \), we deduce that

$$\begin{aligned} \lim _{n \rightarrow \infty }{{\mathcal {E}}}_n(a+a_2 ,q)= \lim _{n \rightarrow \infty } \frac{1}{2} (a+a_2 ) (a_1-b) \Vert x_{n}-q\Vert ^2. \end{aligned}$$
(A.23)

This entails (h1).

- Now, we prove (h2). Let u be a weak cluster point of \((x_n)\), namely there exists a subsequence (\(x_{n_k}\)) that converges weakly to u as \(k \rightarrow \infty \). So, in view of (A.19b), we simply have \( \lim _{k \rightarrow +\infty } \Vert A_{\lambda } (x_ {n_k})\Vert =0\), while a classical result gives us \(A_{\lambda } (x_{n_k}) \in A (x_{n_k}-\lambda A_{\lambda } (x_{n_k}))\). Then, passing to the limit as \(k \rightarrow \infty \) in this latter result and recalling that the graph of a maximally monotone operator is demiclosed (see, for instance, [13]), we deduce that \(0 \in A(u)\), namely \( u \in S\). This proves (h2).
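In the linear case, the classical inclusion invoked above reduces to the identity \(A_{\lambda }(x)=A(J_{\lambda A}\,x)\) with \(J_{\lambda A}=(I+\lambda A)^{-1}\), which can be checked directly; a sketch with an illustrative skew operator:

```python
# Check of A_lam(x) = A(J x), J = (I + lam*A)^{-1}, for the linear
# monotone operator A x = R x; R, lam and x are illustrative choices.
import numpy as np

R = np.array([[0.0, 1.0], [-1.0, 0.0]])
lam = 0.7
J = np.linalg.inv(np.eye(2) + lam * R)
x = np.array([2.0, -1.0])
assert np.allclose((x - J @ x) / lam, R @ (J @ x))
```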

Then we infer by Opial’s lemma that \((x_n)\) converges weakly to some element \({\bar{x}} \in S\).

Finally, by \(\sum _n n \Vert k_{n-1} A_{\lambda } (x_{n})\Vert ^2 < \infty \) (from (3.14c)), we get \(k_{n-1} A_{\lambda } (x_{n}) \rightarrow 0\) (as \(n \rightarrow \infty \)); hence, by \( z_{n-1}-x_{n}=\lambda k_{n-1} A_{\lambda } (x_{n})\) (from (2.4a)), we are led to \( z_{n-1}-x_{n}\rightarrow 0\) (as \(n \rightarrow \infty \)). It follows that \((z_n)\) also converges weakly to \({\bar{x}}\). This completes the proof. \(\square \)

Cite this article

Maingé, PE. Accelerated Proximal Algorithms with a Correction Term for Monotone Inclusions. Appl Math Optim 84 (Suppl 2), 2027–2061 (2021). https://doi.org/10.1007/s00245-021-09819-y
