Exploiting Characteristics in Stationary Action Problems

Abstract

Connections between the principle of least action and optimal control are explored with a view to describing the trajectories of energy conserving systems, subject to temporal boundary conditions, as solutions of corresponding systems of characteristics equations on arbitrary time horizons. Motivated by the relaxation of least action to stationary action for longer time horizons, due to loss of convexity of the action functional, a corresponding relaxation of optimal control problems to stationary control problems is considered. In characterizing the attendant stationary controls, corresponding to generalized velocity trajectories, an auxiliary stationary control problem is posed with respect to the characteristic system of interest. Using this auxiliary problem, it is shown that the controls rendering the action functional stationary on arbitrary time horizons have a state feedback representation, via a verification theorem, that is consistent with the optimal control on short time horizons. An example is provided to illustrate application via a simple mass-spring system.

References

  1. Baggett, L.W.: Functional Analysis. Dekker, London (1991)

  2. Bardi, M., Capuzzo-Dolcetta, I.: Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations. Systems & Control: Foundations & Applications. Birkhäuser, Basel (1997)

  3. Basco, V., Frankowska, H.: Lipschitz continuity of the value function for the infinite horizon optimal control problem under state constraints. In: Trends in Control Theory and Partial Differential Equations, Springer INdAM Series, pp. 17–38 (2019)

  4. Cannarsa, P., Sinestrari, C.: Semiconcave Functions, Hamilton-Jacobi Equations, and Optimal Control. Progress in Nonlinear Differential Equations and Their Applications. Birkhäuser, Basel (2004)

  5. Dower, P.M., McEneaney, W.M.: On existence and uniqueness of stationary action trajectories. In: Proceedings 22nd International Symposium on Mathematical Theory of Networks and Systems (Minneapolis MN, USA), pp. 624–631 (2016)

  6. Dower, P.M., McEneaney, W.M.: Solving two-point boundary value problems for a wave equation via the principle of stationary action and optimal control. SIAM J. Control Optim. 55(4), 2151–2205 (2017)

  7. Dower, P.M., McEneaney, W.M.: Verifying fundamental solution groups for lossless wave equations via stationary action and optimal control. Appl. Math. Optim. (2020). https://doi.org/10.1007/s00245-020-09700-4

  8. Dower, P.M., McEneaney, W.M., Yegorov, I.: Exploiting characteristics in stationary action problems. In: Proceedings of SIAM Conference on Control & Its Applications (Chengdu, China), pp. 75–82 (2019)

  9. Gray, C.G., Taylor, E.F.: When action is not least. Am. J. Phys. 75(5), 434–458 (2007)

  10. McEneaney, W.M., Dower, P.M.: The principle of least action and fundamental solutions of mass-spring and \(n\)-body two-point boundary value problems. SIAM J. Control Optim. 53(5), 2898–2933 (2015)

  11. McEneaney, W.M., Dower, P.M.: Staticization, its dynamic program, and solution propagation. Automatica 81, 56–67 (2017)

  12. McEneaney, W.M., Dower, P.M.: Static duality and a stationary action application. J. Differ. Equ. 264(2), 525–549 (2018)

  13. McEneaney, W.M., Dower, P.M.: Staticization and associated Hamilton-Jacobi and Riccati equations. In: Proceedings of SIAM Conference on Control & Its Applications, pp. 376–383 (2015)

  14. McEneaney, W.M., Dower, P.M.: Staticization-based representations for Schrödinger equations driven by Coulomb potentials. In: \(3^{rd}\) IFAC Workshop on Thermodynamic Foundations for a Mathematical Systems Theory (Louvain-la-Neuve, Belgium). https://doi.org/10.1016/j.ifacol.2019.07.005 (2019)

  15. Pazy, A.: Semigroups of Linear Operators and Applications to Partial Differential Equations. Applied Mathematical Sciences, vol. 44. Springer, New York (1983)

  16. Subbotin, A.I.: Generalized Solutions of First-Order PDEs: The Dynamical Optimization Perspective. Systems & Control: Foundations & Applications. Springer, New York (1995)

  17. Yegorov, I., Dower, P.M.: Perspectives on characteristics-based curse-of-dimensionality-free numerical approaches to solving Hamilton-Jacobi equations. Appl. Math. Optim. (2018)

Funding

This research was supported by the US Air Force Office of Scientific Research.

Author information

Corresponding author

Correspondence to Peter M. Dower.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supported by AFOSR Grant FA2386-16-1-4066. A preliminary version of this work appeared in [8].

Appendix

Proofs of Lemmas 2, 3, 4, and 5

Proof (Lemma 2)

The proof employs a standard fixed point argument, exploiting global Lipschitz continuity of the map f defined in (22); see for example [15, Theorem 5.1, p. 127]. Note that global Lipschitz continuity of \({\nabla }V(x)\) in (22) follows directly from the second bound assumed in (4). \(\square \)
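
For orientation only, consider the quadratic potential \(V(x) = {{\textstyle {\frac{1}{2}}}}\, \langle x, \, K\, x \rangle \) for some bounded self-adjoint \(K\in {\mathcal {L}}({\mathscr {X}})\); this special case is offered purely as an illustration of the hypotheses (the operator \(K\) is not an object defined in the main text), and is in the spirit of the mass-spring example mentioned in the abstract. There,

$$\begin{aligned} {\nabla }V(x) = K\, x, \qquad \Vert {\nabla }V(x) - {\nabla }V(\bar{x}) \Vert \le \Vert K \Vert _{{\mathcal {L}}({\mathscr {X}})}\, \Vert x - \bar{x} \Vert \end{aligned}$$

for all \(x,\bar{x}\in {\mathscr {X}}\), so that the global Lipschitz property of \({\nabla }V\) exploited above holds with Lipschitz constant \(\Vert K\Vert _{{\mathcal {L}}({\mathscr {X}})}\).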

Proof (Lemma 3)

Fix \(T\in {\mathbb {R}}_{\ge 0}\), \(t\in [0,T]\), and \(Y,h\in {\mathscr {X}}^2\). Applying Lemma 2, there exist unique classical solutions \({\overline{X}}(Y)\) and \({\overline{X}}(Y+h)\) to (24) satisfying respectively \({\overline{X}}(Y)_t = Y\) and \({\overline{X}}(Y+h)_t = Y+h\). In integral form,

$$\begin{aligned} \begin{aligned} {\overline{X}}(Y)_s&= Y + \int _t^s f({\overline{X}}(Y)_\sigma ) \, d\sigma ,\\ {\overline{X}}(Y+h)_s&= Y+h + \int _t^s f({\overline{X}}(Y+h)_\sigma ) \, d\sigma , \end{aligned} \end{aligned}$$
(52)

so that

$$\begin{aligned} {\overline{X}}(Y+h)_s - {\overline{X}}(Y)_s&= h + \int _t^s f({\overline{X}}(Y+h)_\sigma ) - f({\overline{X}}(Y)_\sigma ) \, d\sigma \end{aligned}$$

for all \(s\in [t,T]\). Consequently, as f is globally Lipschitz by inspection of (22),

$$\begin{aligned} \Vert {\overline{X}}(Y+h)_s - {\overline{X}}(Y)_s \Vert&\le \Vert h\Vert + \int _t^s \Vert f({\overline{X}}(Y+h)_\sigma ) - f({\overline{X}}(Y)_\sigma ) \Vert \, d\sigma \\&\le \Vert h\Vert + \alpha \int _t^s \Vert {\overline{X}}(Y+h)_\sigma - {\overline{X}}(Y)_\sigma \Vert \, d\sigma \end{aligned}$$

in which \(\alpha \in {\mathbb {R}}_{\ge 0}\) is the associated Lipschitz constant. Applying Gronwall’s inequality, and recalling the definition of \(\Vert \cdot \Vert _\infty \), yields

$$\begin{aligned}&\Vert {\overline{X}}(Y + h) - {\overline{X}}(Y) \Vert _\infty \le \Vert h \Vert \, \exp ( \alpha \, ( T - t) ), \end{aligned}$$

so that (26) holds. As \(Y,h\in {\mathscr {X}}^2\) are arbitrary, the asserted continuity follows. \(\square \)
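
For convenience, the form of Gronwall’s inequality applied above (and invoked repeatedly below) is the standard integral version: if \(\phi \in C([t,T];{\mathbb {R}}_{\ge 0})\) and \(a,b\in {\mathbb {R}}_{\ge 0}\) satisfy

$$\begin{aligned} \phi _s \le a + b \int _t^s \phi _\sigma \, d\sigma \quad \text {for all } s\in [t,T], \qquad \text {then} \qquad \phi _s \le a\, \exp ( b\, (s-t) ) \quad \text {for all } s\in [t,T]. \end{aligned}$$

Here it is applied with \(\phi _s = \Vert {\overline{X}}(Y+h)_s - {\overline{X}}(Y)_s\Vert \), \(a = \Vert h\Vert \), and \(b = \alpha \), with the supremum over \(s\in [t,T]\) then yielding the displayed \(\Vert \cdot \Vert _\infty \) bound.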

Proof (Lemma 4)

Fix \(T\in {\mathbb {R}}_{\ge 0}\), \(t\in [0,T]\), and \({\overline{X}}\in C({\mathscr {X}}^2;C([t,T];{\mathscr {X}}^2))\) as per Lemma 3. Fix \(Y\in {\mathscr {X}}^2\) and \(s\mapsto A(Y)_s\) as per (29), and note that (28) follows by [15, Theorem 5.2, p. 128]. Fix any \(h\in {\mathscr {X}}^2\), \(s\in [t,T]\), and note by inspection of (22) that \(A(Y)_s = D f({\overline{X}}(Y)_s)\). Hence, recalling (52),

$$\begin{aligned} {\overline{X}}(Y+h)_s - {\overline{X}}(Y)_s - U_{s,t}(Y)\, h&= \int _t^s f({\overline{X}}(Y+h)_\sigma ) - f({\overline{X}}(Y)_\sigma ) - A(Y)_\sigma \, U_{\sigma , t}(Y) \, h \, d\sigma \nonumber \\&= \int _t^s f({\overline{X}}(Y)_\sigma + [{\overline{X}}(Y+h)_\sigma - {\overline{X}}(Y)_\sigma ]) - f({\overline{X}}(Y)_\sigma ) - D f({\overline{X}}(Y)_\sigma )\, U_{\sigma , t}(Y) \, h \, d\sigma \nonumber \\&= \int _t^s f({\overline{X}}(Y)_\sigma + [{\overline{X}}(Y+h)_\sigma - {\overline{X}}(Y)_\sigma ]) - f({\overline{X}}(Y)_\sigma ) - D f({\overline{X}}(Y)_\sigma )\, [ {\overline{X}}(Y+h)_\sigma - {\overline{X}}(Y)_\sigma ] \nonumber \\&\qquad + D f({\overline{X}}(Y)_\sigma )\, [ {\overline{X}}(Y+h)_\sigma - {\overline{X}}(Y)_\sigma - U_{\sigma , t}(Y) \, h] \, d\sigma . \end{aligned}$$
(53)

Define \(\bar{I}_f:C([t,T];{\mathscr {X}}^2)\rightarrow C([t,T];{\mathscr {X}}^2)\) by

$$\begin{aligned}&\bar{I}_f(X)_s \doteq \int _t^s f(X_\sigma ) \, d\sigma \end{aligned}$$
(54)

for all \(X\in C([t,T];{\mathscr {X}}^2)\). Note that \(Y\mapsto f(Y)\) is twice Fréchet differentiable by (4), with \(D^2 f(Y)\in {\mathcal {L}}({\mathscr {X}}^2;{\mathcal {L}}({\mathscr {X}}^2)) = {\mathcal {L}}({\mathscr {X}}^2\times {\mathscr {X}}^2;{\mathscr {X}}^2)\) for all \(Y\in {\mathscr {X}}^2\). Again by (4), there exists an \(M\in {\mathbb {R}}_{>0}\) such that

$$\begin{aligned}&\sup _{Y\in {\mathscr {X}}^2} \Vert D^2 f(Y) \Vert _{{\mathcal {L}}({\mathscr {X}}^2\times {\mathscr {X}}^2;{\mathscr {X}}^2)} \le M < \infty . \end{aligned}$$

Hence, by the mean value theorem, given \(X,\delta \in C([t,T];{\mathscr {X}}^2)\),

$$\begin{aligned}&\left\| \bar{I}_f(X + \delta )_s - \bar{I}_f(X)_s - \int _t^s D f(X_\sigma )\, \delta _\sigma \, d\sigma \right\| \le \int _t^s \Vert f(X_\sigma + \delta _\sigma ) - f(X_\sigma ) - D f(X_\sigma )\, \delta _\sigma \Vert \, d\sigma \\&\quad = \int _t^s \left\| \left( {{ \int _0^1 \int _0^1 D^2 f(X_\sigma + \hat{\eta }\, \eta \, \delta _\sigma )\, d\hat{\eta }\, \eta \, d\eta }} \right) (\delta _\sigma , \delta _\sigma ) \right\| d\sigma \\&\quad \le \int _t^s {{\int _0^1 \int _0^1 \Vert D^2 f(X_\sigma + \hat{\eta }\, \eta \, \delta _\sigma )\Vert _{{\mathcal {L}}({\mathscr {X}}^2\times {\mathscr {X}}^2;{\mathscr {X}}^2)} d\hat{\eta }\, \eta \, d\eta }}\, \Vert \delta _\sigma \Vert ^2 \, d\sigma \\&\quad \le {\textstyle {\frac{M}{2}}} \int _t^s \Vert \delta _\sigma \Vert ^2 \, d\sigma \le {\textstyle {\frac{M}{2}}}\, (s-t)\, \Vert \delta \Vert _\infty ^2. \end{aligned}$$

That is,

$$\begin{aligned}&\left\| \bar{I}_f(X + \delta ) - \bar{I}_f(X) - \int _t^{(\cdot )} D f(X_\sigma )\, \delta _\sigma \, d\sigma \right\| _\infty \le {\textstyle {\frac{M}{2}}}\, (T-t)\, \Vert \delta \Vert _\infty ^2, \end{aligned}$$

so that \(\bar{I}_f\) is Fréchet differentiable with derivative

$$\begin{aligned}{}[D \bar{I}_f(X)\, \delta ]_s&= \int _t^s D f(X_\sigma )\, \delta _\sigma \, d\sigma \end{aligned}$$
(55)

for all \(X,\delta \in C([t,T];{\mathscr {X}}^2)\), \(s\in [t,T]\). So, recalling (53), and (1),

$$\begin{aligned} {\overline{X}}(Y+h)_s - {\overline{X}}(Y)_s - U_{s,t}(Y)\, h&= [d[{\bar{I}_f}]_{{\overline{X}}(Y)} ( {\overline{X}}(Y+h) - {\overline{X}}(Y) )]_s \, \Vert {\overline{X}}(Y+h) - {\overline{X}}(Y) \Vert _\infty \\&\quad + \int _t^s Df({\overline{X}}(Y)_\sigma ) \, [ {\overline{X}}(Y+h)_\sigma - {\overline{X}}(Y)_\sigma - U_{\sigma ,t}(Y)\, h ]\, d\sigma . \end{aligned}$$

Noting that \(L\doteq \sup _{\sigma \in [t,T]} \Vert D f({\overline{X}}(Y)_\sigma ) \Vert _{{\mathcal {L}}({\mathscr {X}}^2)} < \infty \), taking the norm of both sides yields

$$\begin{aligned}&\Vert {\overline{X}}(Y+h)_s - {\overline{X}}(Y)_s - U_{s,t}(Y)\, h \Vert \\&\quad \le \Vert d[{\bar{I}_f}]_{{\overline{X}}(Y)} ( {\overline{X}}(Y+h) - {\overline{X}}(Y) ) \Vert _\infty \, \Vert {\overline{X}}(Y+h) - {\overline{X}}(Y) \Vert _\infty \\&\qquad + \int _t^s L\, \Vert {\overline{X}}(Y+h)_\sigma - {\overline{X}}(Y)_\sigma - U_{\sigma ,t}(Y)\, h \Vert \, d\sigma . \end{aligned}$$

Hence, by Gronwall’s inequality,

$$\begin{aligned}&\Vert {\overline{X}}(Y+h)_s - {\overline{X}}(Y)_s - U_{s,t}(Y)\, h \Vert \\&\le \Vert d[{\bar{I}_f}]_{{\overline{X}}(Y)} ( {\overline{X}}(Y+h) - {\overline{X}}(Y) ) \Vert _\infty \, \Vert {\overline{X}}(Y+h) - {\overline{X}}(Y) \Vert _\infty \, \exp (L\, (T-t)), \end{aligned}$$

or, with \(\theta _Y(h) \doteq \Vert d[{\bar{I}_f}]_{{\overline{X}}(Y)} ( {\overline{X}}(Y+h) - {\overline{X}}(Y) ) \Vert _\infty \),

$$\begin{aligned}&\Vert {\overline{X}}(Y+h) - {\overline{X}}(Y) - U_{\cdot ,t}(Y)\, h \Vert _\infty \\&\quad \le \theta _Y(h) \, \Vert {\overline{X}}(Y+h) - {\overline{X}}(Y) \Vert _\infty \, \exp (L\, (T-t))\\&\quad \le \theta _Y(h) \, \Vert {\overline{X}}(Y+h) - {\overline{X}}(Y) - U_{\cdot ,t}(Y) \, h \Vert _\infty \, \exp (L\, (T-t))\\&\qquad + \theta _Y(h) \, \sup _{s\in [t,T]} \Vert U_{s,t}(Y)\Vert _{{\mathcal {L}}({\mathscr {X}}^2)} \, \Vert h \Vert \, \exp (L\, (T-t)). \end{aligned}$$

As \(\theta _Y\) is continuous at 0, there exists an \(r>0\) sufficiently small such that \(\Vert h\Vert < r\) implies that \(\theta _Y(h) \exp (L\, (T-t)) < {{\textstyle {\frac{1}{2}}}}\). Hence, with \(\Vert h\Vert < r\),

$$\begin{aligned} \Vert {\overline{X}}(Y+h) - {\overline{X}}(Y) - U_{\cdot ,t}(Y)\, h \Vert _\infty&< 2\, \theta _Y(h) \, \sup _{s\in [t,T]} \Vert U_{s,t}(Y) \Vert _{{\mathcal {L}}({\mathscr {X}}^2)} \, \Vert h \Vert \, \exp (L\, (T-t))\\&= Q\ \theta _Y(h) \, \Vert h\Vert , \end{aligned}$$

in which \(Q \doteq 2\, \sup _{s\in [t,T]} \Vert U_{s,t}(Y) \Vert _{{\mathcal {L}}({\mathscr {X}}^2)}\, \exp (L\, (T-t))\). Consequently, taking a limit,

$$\begin{aligned}&\lim _{\Vert h\Vert \rightarrow 0} \frac{\Vert {\overline{X}}(Y+h) - {\overline{X}}(Y) - U_{\cdot ,t}(Y)\, h \Vert _\infty }{\Vert h\Vert } \le \lim _{\Vert h\Vert \rightarrow 0} Q\, \theta _Y(h) = 0. \end{aligned}$$

That is, \(Y\mapsto {\overline{X}}(Y)\) is Fréchet differentiable, with the indicated derivative. \(\square \)
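
As a sanity check on Lemma 4 (not required in the sequel), consider the hypothetical special case in which f is replaced by a bounded linear operator \(A\in {\mathcal {L}}({\mathscr {X}}^2)\), so that \(Df(Z) = A\) for all \(Z\in {\mathscr {X}}^2\). The corresponding flow and evolution operators are then

$$\begin{aligned} {\overline{X}}(Y)_s = e^{A\, (s-t)}\, Y, \qquad U_{s,t}(Y) = e^{A\, (s-t)}, \qquad {\overline{X}}(Y+h)_s - {\overline{X}}(Y)_s - U_{s,t}(Y)\, h = 0, \end{aligned}$$

so that \(Y\mapsto {\overline{X}}(Y)\) is Fréchet differentiable with \([D{\overline{X}}(Y)\, h]_s = U_{s,t}(Y)\, h\) and identically zero remainder, consistent with the general conclusion.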

Proof (Lemma 5)

Fix \(T\in {\mathbb {R}}_{>0}\), \(t\in [0,T]\) as per the lemma statement. It is first demonstrated that \(Y\mapsto U_{s,r}(Y)\) is continuous, uniformly in \(r,s\in [t,T]\), as this motivates the subsequent proof of continuous differentiability. Fix \(r,s\in [t,T]\), \(Y\in {\mathscr {X}}^2\), and \(h,\hat{h}\in {\mathscr {X}}^2\). As \(U_{s,r}(Y)\in {\mathcal {L}}({\mathscr {X}}^2)\) is an element of the two-parameter family of evolution operators generated by \(A(Y)_s\in {\mathcal {L}}({\mathscr {X}}^2)\), see (29),

$$\begin{aligned} U_{s,r}(Y)\, h&= h + \int _r^s A(Y)_\sigma \, U_{\sigma ,r}(Y) \, h \, d\sigma ,\\ U_{s,r}(Y+\hat{h})\, h&= h + \int _r^s A(Y + \hat{h})_\sigma \, U_{\sigma ,r}(Y+\hat{h}) \, h \, d\sigma , \end{aligned}$$

so that

$$\begin{aligned} [U_{s,r} (Y+ \hat{h}) - U_{s,r}(Y)]\, h&= \int _r^s [ A(Y + \hat{h})_\sigma \, U_{\sigma ,r}(Y+\hat{h}) - A(Y)_\sigma \, U_{\sigma ,r}(Y) ]\, h \, d\sigma \nonumber \\&= \int _r^s [ A(Y + \hat{h})_\sigma - A(Y)_\sigma ]\, [ U_{\sigma ,r}(Y+\hat{h}) - U_{\sigma ,r}(Y) ]\, h \, d\sigma \nonumber \\&\qquad + \int _r^s A(Y)_\sigma \, [ U_{\sigma ,r}(Y+\hat{h}) - U_{\sigma ,r}(Y) ]\, h\, d\sigma + \int _r^s [ A(Y + \hat{h})_\sigma - A(Y)_\sigma ] \, U_{\sigma ,r}(Y) \, h\, d\sigma . \end{aligned}$$
(56)

Hence, by the triangle inequality,

$$\begin{aligned}&\Vert [ U_{s,r} (Y+ \hat{h}) - U_{s,r}(Y) ]\, h \Vert \nonumber \\&\quad \le \int _r^s \Vert A(Y + \hat{h})_\sigma - A(Y)_\sigma \Vert _{{\mathcal {L}}({\mathscr {X}}^2)} \, \Vert [ U_{\sigma ,r}(Y+\hat{h}) - U_{\sigma ,r}(Y) ]\, h\Vert \, d\sigma \nonumber \\&\qquad + \int _r^s \Vert A(Y)_\sigma \Vert _{{\mathcal {L}}({\mathscr {X}}^2)} \, \Vert [ U_{\sigma ,r}(Y+\hat{h}) - U_{\sigma ,r}(Y) ]\, h\Vert \, d\sigma \nonumber \\&\qquad + \int _r^s \Vert A(Y + \hat{h})_\sigma - A(Y)_\sigma \Vert _{{\mathcal {L}}({\mathscr {X}}^2)} \, \Vert U_{\sigma ,r}(Y) \, h \Vert \, d\sigma . \end{aligned}$$
(57)

Recalling (4), and in particular the uniform bound on \(x\mapsto D{\nabla }^2 V(x)\), given \(x,\bar{x}\in {\mathscr {X}}^2\), the mean value theorem implies that \({\nabla }^2 V(x) - {\nabla }^2 V(\bar{x}) = ( \int _0^1 D{\nabla }^2 V(\bar{x} + \eta \, (x - \bar{x})) \, d\eta ) (x - \bar{x})\), so that \(\Vert {\nabla }^2 V(x) - {\nabla }^2 V(\bar{x})\Vert _{{\mathcal {L}}({\mathscr {X}})} \le \frac{K}{2}\, \Vert x - \bar{x}\Vert \). Hence, by (29), there exists an \(\alpha _1\in {\mathbb {R}}_{\ge 0}\) such that \(\varLambda :{\mathscr {X}}^2\rightarrow {\mathcal {L}}({\mathscr {X}}^2)\) satisfies \(\Vert \varLambda (Z) - \varLambda (\bar{Z}) \Vert _{{\mathcal {L}}({\mathscr {X}}^2)} \le \alpha _1 \Vert Z - \bar{Z} \Vert \) for all \(Z,\bar{Z}\in {\mathscr {X}}^2\). So, applying Lemma 3, there exists an \(\alpha \in {\mathbb {R}}_{\ge 0}\), \(L_0 \doteq \sup _{\sigma \in [t,T]}\Vert A(0)_\sigma \Vert _{{\mathcal {L}}({\mathscr {X}}^2)}<\infty \), \(L_1 \doteq \alpha _1\, \exp (\alpha \, (T-t))<\infty \), such that

$$\begin{aligned} \sup _{\sigma \in [t,T]} \Vert A(Y+\hat{h})_\sigma - A(Y)_\sigma \Vert _{{\mathcal {L}}({\mathscr {X}}^2)}&\le \alpha _1 \sup _{\sigma \in [t,T]} \Vert {\overline{X}}(Y+\hat{h})_\sigma - {\overline{X}}(Y)_\sigma \Vert \le L_1 \, \Vert \hat{h}\Vert , \nonumber \\ \sup _{\sigma \in [t,T]} \Vert A(Y)_\sigma \Vert _{{\mathcal {L}}({\mathscr {X}}^2)}&\le L_0 + L_1 \, \Vert \hat{h} \Vert , \end{aligned}$$
(58)

in which the second inequality follows from the first, via the triangle inequality, by selecting \(\hat{h} = -Y\). Note further that as \(\sigma \mapsto A(Y)_\sigma \) is continuous, \(L_2 \doteq \sup _{\sigma \in [t,T]} \Vert U_{\sigma , t} (Y) \Vert _{{\mathcal {L}}({\mathscr {X}}^2)} < \infty \), see [15, Theorem 5.2, p.128]. Hence, substituting these inequalities in (57) yields

$$\begin{aligned} \Vert [ U_{s,r} (Y+ \hat{h}) - U_{s,r}(Y) ]\, h \Vert&\le (L_0 + 2\, L_1 \, \Vert \hat{h}\Vert ) \int _r^s \Vert [ U_{\sigma ,r}(Y+\hat{h}) - U_{\sigma ,r}(Y) ]\, h\Vert \, d\sigma \\&\qquad + (T-t)\, L_1 \, L_2\, \Vert \hat{h}\Vert \, \Vert h\Vert . \end{aligned}$$

Gronwall’s inequality subsequently implies that

$$\begin{aligned} \sup _{r,s\in [t,T]} \Vert U_{s,r} (Y+ \hat{h}) - U_{s,r}(Y)\Vert _{{\mathcal {L}}({\mathscr {X}}^2)} \le (T-t)\, L_1\, L_2\, \Vert \hat{h}\Vert \, \exp ( (L_0 + 2\, L_1 \, \Vert \hat{h}\Vert ) (T-t)). \end{aligned}$$
(59)

Continuity of \(Y\mapsto U_{s,r}(Y)\), uniformly in \(r,s\in [t,T]\), thus follows.

Next, \(Y\mapsto U_{s,r}(Y)\) is shown to be Fréchet differentiable, uniformly in \(r,s\in [t,T]\). Appealing to the contraction mapping theorem and Picard iteration, for any \(t\le r<s\le T\) and \(Y\in {\mathscr {X}}^2\), consider the two-parameter family of operators \(V_{s,r}(Y)\in {\mathcal {L}}({\mathscr {X}}^2;{\mathcal {L}}({\mathscr {X}}^2))\) solving

$$\begin{aligned} V_{s,r}(Y)\, \hat{h} \, h&= \int _r^s A(Y)_\sigma \, V_{\sigma ,r}(Y)\, \hat{h}\, h\, d\sigma + \int _r^s D_Y A(Y)_\sigma \, \hat{h}\, U_{\sigma ,r}(Y)\, h\, d\sigma \end{aligned}$$
(60)

for all \(h, \hat{h}\in {\mathscr {X}}^2\), \(r,s\in [t,T]\), in which \(D_Y A(Y)_\sigma = D \varLambda ({\overline{X}}(Y)_\sigma )\, U_{\sigma ,t}(Y) \in {\mathcal {L}}({\mathscr {X}}^2;{\mathcal {L}}({\mathscr {X}}^2))\) by the chain rule and Lemma 4. Note in particular by (4), (29), and Lemma 3 that

$$\begin{aligned} L_3&\doteq \sup _{\sigma \in [t,T]} \Vert D \varLambda ({\overline{X}}(Y)_\sigma ) \Vert _{{\mathcal {L}}({\mathscr {X}}^2;{\mathcal {L}}({\mathscr {X}}^2))} < \infty . \end{aligned}$$

Applying the triangle inequality to (60), and recalling the definitions of \(L_0\), \(L_1\), \(L_2\), yields

$$\begin{aligned} \Vert V_{s,r}(Y)\, \hat{h} \, h \Vert&\le \int _r^s (L_0 + L_1\, \Vert \hat{h}\Vert ) \, \Vert V_{\sigma ,r}(Y)\, \hat{h}\, h\Vert \, d\sigma + \int _r^s L_3\, \Vert \hat{h}\Vert \, L_2\, \Vert h\Vert \, d\sigma \\&\le (T-t)\, L_2\, L_3\, \Vert \hat{h}\Vert \, \Vert h\Vert + (L_0 + L_1\, \Vert \hat{h}\Vert ) \int _r^s \Vert V_{\sigma ,r}(Y)\, \hat{h}\, h\Vert \, d\sigma , \end{aligned}$$

so that by Gronwall’s inequality,

$$\begin{aligned} \Vert V_{s,r}(Y)\, \hat{h} \, h \Vert&\le (T-t)\, L_2\, L_3\, \Vert \hat{h}\Vert \, \Vert h\Vert \, \exp \left( (L_0 + L_1\, \Vert \hat{h}\Vert )\, (T-t) \right) . \end{aligned}$$

As \(\hat{h}, h\in {\mathscr {X}}^2\) are arbitrary, it follows immediately that \(V_{s,r}(Y)\in {\mathcal {L}}({\mathscr {X}}^2;{\mathcal {L}}({\mathscr {X}}^2))\) for all \(r,s\in [t,T]\). Recalling (56), observe by adding and subtracting terms that

$$\begin{aligned}&[U_{s,r} (Y+ \hat{h}) - U_{s,r}(Y) - V_{s,r}(Y)\, \hat{h}]\, h \nonumber \\&\quad = \int _r^s [ A(Y + \hat{h})_\sigma \, U_{\sigma ,r}(Y+\hat{h}) - A(Y)_\sigma \, U_{\sigma ,r}(Y) ]\, h \, d\sigma - V_{s,r}(Y)\, \hat{h}\, h \nonumber \\&\quad = \int _r^s A(Y)_\sigma \, [ U_{\sigma ,r}(Y+\hat{h}) - U_{\sigma ,r}(Y) - V_{\sigma ,r}(Y)\, \hat{h} ]\, h\, d\sigma \nonumber \\&\quad \quad \quad + \int _r^s [ A(Y + \hat{h})_\sigma - A(Y)_\sigma ]\, [ U_{\sigma ,r}(Y+\hat{h}) - U_{\sigma ,r}(Y) ]\, h \, d\sigma \nonumber \\&\quad \quad \quad + \int _r^s [ A(Y + \hat{h})_\sigma - A(Y)_\sigma - D_Y A(Y)_\sigma \, \hat{h} ] \, U_{\sigma ,r}(Y) \, h\, d\sigma \nonumber \\&\quad \quad \quad - \left[ V_{s,r}(Y)\, \hat{h}\, h - \int _r^s A(Y)_\sigma \, V_{\sigma ,r}(Y)\, \hat{h} \, h\, d\sigma - \int _r^s D_Y A(Y)_\sigma \, \hat{h} \, U_{\sigma ,r}(Y) \, h\, d\sigma \right] , \end{aligned}$$
(61)

and the last term in square brackets is zero by definition (60) of \(V_{s,r}(Y)\). Define \(\hat{A}:C([t,T];{\mathscr {X}}^2)\rightarrow C([t,T];{\mathcal {L}}({\mathscr {X}}^2))\) by \(\hat{A}(X)_\sigma \doteq A(X_\sigma )\) for all \(X\in C([t,T];{\mathscr {X}}^2)\), and note that the range of \(\hat{A}\) follows by (4), (29). Fix \(X,\delta \in C([t,T];{\mathscr {X}}^2)\), and (for convenience) write \(X_\sigma = ([X_1]_\sigma , [X_2]_\sigma )\in {\mathscr {X}}^2\), \(\delta _\sigma = ([\delta _1]_\sigma , [\delta _2]_\sigma )\in {\mathscr {X}}^2\) for all \(\sigma \in [t,T]\), with \(X_1,X_2,\delta _1,\delta _2\in C([t,T];{\mathscr {X}})\). Combining (4), (29) with the mean value theorem, there exists \(\hat{\alpha }\in {\mathbb {R}}_{\ge 0}\) such that

$$\begin{aligned}&\Vert \hat{A}(X + \delta ) - \hat{A}(X) - D A(X)\, \delta \Vert _{C([t,T];{\mathcal {L}}({\mathscr {X}}^2))} \nonumber \\&= \sup _{\sigma \in [t,T]} \Vert A(X_\sigma + \delta _\sigma ) - A(X_\sigma ) - D A(X_\sigma )\, \delta _\sigma \Vert _{{\mathcal {L}}({\mathscr {X}}^2)} \nonumber \\&= \sup _{\sigma \in [t,T]} \left\| \left( \begin{array}{cc} 0 &{} 0 \\ {\nabla }^2 V([X_1]_\sigma +[\delta _1]_\sigma ) - {\nabla }^2 V([X_1]_\sigma ]) - D{\nabla }^2 V([X_1]_\sigma )\, [\delta _1]_\sigma &{} 0 \end{array} \right) \right\| _{{\mathcal {L}}({\mathscr {X}}^2)} \nonumber \\&= \hat{\alpha }\, \sup _{\sigma \in [t,T]} \Vert {\nabla }^2 V([X_1]_\sigma +[\delta _1]_\sigma ) - {\nabla }^2 V([X_1]_\sigma ) - D{\nabla }^2 V([X_1]_\sigma )\, [\delta _1]_\sigma \Vert _{{{{\mathcal {L}}({\mathscr {X}})}}} \nonumber \\&= \hat{\alpha }\, \sup _{\sigma \in [t,T]} \left\| \left( \int _0^1 \int _0^1 D^2 {\nabla }^2 V([X_1]_\sigma + \hat{\eta }\, \eta \, [\delta _1]_\sigma ) \, d\hat{\eta }\, \eta \, d\eta \right) ([\delta _1]_\sigma , [\delta _1]_\sigma ) \right\| _{{\mathcal {L}}({\mathscr {X}})} \nonumber \\&\le \hat{\alpha }\, \sup _{\sigma \in [t,T]} \sup _{\hat{\eta },\eta \in [0,1]} \left\| D^2{\nabla }^2 V([X_1]_\sigma + \hat{\eta }\, \eta \, [\delta _1]_\sigma ) \right\| _{{\mathcal {L}}({\mathscr {X}}\times {\mathscr {X}};{\mathcal {L}}({\mathscr {X}}))} \sup _{\sigma \in [t,T]}\Vert [\delta _1]_\sigma \Vert _{{\mathscr {X}}}^2 \nonumber \\&\le \hat{\alpha }\, ({\textstyle {\frac{K}{2}}})\, \Vert \delta \Vert _{C([t,T];{\mathscr {X}}^2)}^2 \end{aligned}$$
(62)

for all \(X,\delta \in C([t,T];{\mathscr {X}}^2)\). Dividing both sides by \(\Vert \delta \Vert _{C([t,T];{\mathscr {X}}^2)}\) and taking the limit as \(\Vert \delta \Vert _{C([t,T];{\mathscr {X}}^2)}\rightarrow 0\) subsequently yields that \(\hat{A}\) is Fréchet differentiable with derivative \(D\hat{A}(X)\in {\mathcal {L}}(C([t,T];{\mathscr {X}}^2);C([t,T];{\mathcal {L}}({\mathscr {X}}^2)))\). Hence, taking the norm of both sides of (61), applying the triangle inequality, (59), (62), and recalling the definitions of \(L_1\), \(L_2\), \(L_3\),

$$\begin{aligned}&\Vert [U_{s,r} (Y+ \hat{h}) - U_{s,r}(Y) - V_{s,r}(Y)\, \hat{h}]\, h \Vert \\&\quad \le (L_0 + L_1\, \Vert \hat{h}\Vert ) \int _r^s \Vert [U_{s,r} (Y+ \hat{h}) - U_{s,r}(Y) - V_{s,r}(Y)\, \hat{h}]\, h \Vert \, d\sigma \\&\quad \quad + (T-t)^2\, L_1^2\, L_2\, \Vert \hat{h}\Vert ^2 \, \exp ( (L_0 + 2\, L_1 \, \Vert \hat{h}\Vert ) (T-t))\, \Vert h\Vert \\&\quad \quad + (T-t) \, L_2\, \Vert \hat{A}\circ {\overline{X}}(Y+\hat{h}) - \hat{A}\circ {\overline{X}}(Y) - D \hat{A}({\overline{X}}(Y))\, D{\overline{X}}(Y)\, \hat{h} \Vert _{C([t,T];{\mathcal {L}}({\mathscr {X}}^2))} \, \Vert h\Vert \\&\quad = (L_0 + L_1\, \Vert \hat{h}\Vert ) \int _r^s \Vert [U_{s,r} (Y+ \hat{h}) - U_{s,r}(Y) - V_{s,r}(Y)\, \hat{h}]\, h \Vert \, d\sigma \\&\quad \quad + (T-t)^2\, L_1^2\, L_2\, \Vert \hat{h}\Vert ^2 \, \exp ( (L_0 + 2\, L_1 \, \Vert \hat{h}\Vert ) (T-t))\, \Vert h\Vert \\&\quad \quad + (T-t)\, L_2\, \Vert d(\hat{A}\circ {\overline{X}})_{Y}(\hat{h})\Vert _{C([t,T];{\mathcal {L}}({\mathscr {X}}^2))}\, \Vert \hat{h}\Vert \, \Vert h\Vert , \end{aligned}$$

in which \(d(\hat{A}\circ {\overline{X}})_Y(\cdot )\) is defined via (1). Hence, by Gronwall’s inequality,

$$\begin{aligned}&\Vert [U_{s,r} (Y+ \hat{h}) - U_{s,r}(Y) - V_{s,r}(Y)\, \hat{h}]\, h \Vert \\&\quad \le (T-t)\, L_2\, \bigl [ (T-t)\, L_1^2\, \Vert \hat{h}\Vert \, \exp ( (L_0 + 2\, L_1 \, \Vert \hat{h}\Vert ) (T-t)) \\&\quad \quad + \Vert d(\hat{A}\circ {\overline{X}})_{Y}(\hat{h})\Vert _{C([t,T];{\mathcal {L}}({\mathscr {X}}^2))} \bigr ] \Vert \hat{h}\Vert \, \Vert h\Vert \\&\quad \quad \times \exp ( (L_0 + L_1\, \Vert \hat{h}\Vert )\, (T-t)). \end{aligned}$$

As \(\hat{h}, h\in {\mathscr {X}}^2\) are arbitrary,

$$\begin{aligned}&\lim _{\Vert \hat{h}\Vert \rightarrow 0} \frac{\sup _{r,s\in [t,T]} \Vert U_{s,r} (Y+ \hat{h}) - U_{s,r}(Y) - V_{s,r}(Y)\, \hat{h} \Vert _{{\mathcal {L}}({\mathscr {X}}^2)}}{\Vert \hat{h}\Vert }\nonumber \\&\quad \le \lim _{\Vert \hat{h}\Vert \rightarrow 0} \Vert d(\hat{A}\circ {\overline{X}})_{Y}(\hat{h})\Vert _{C([t,T];{\mathcal {L}}({\mathscr {X}}^2))} = 0. \end{aligned}$$
(63)

Hence, \(Y\mapsto U_{s,r}(Y)\) is Fréchet differentiable, uniformly in \(r,s\in [t,T]\), with derivative \(V_{s,r}(Y)\).

It remains to be shown that \(Y\mapsto U_{s,r}(Y)\) is twice Fréchet differentiable via (60). To this end, define \(\upsilon _s\doteq V_{s,r}(Y) \, \hat{h}\in {\mathcal {L}}({\mathscr {X}}^2)\) and \(w_s\doteq D_Y A(Y)_s\, \hat{h}\, U_{s,r}(Y)\in {\mathcal {L}}({\mathscr {X}}^2)\) for all \(s\in [t,T]\), and note by (60) that

$$\begin{aligned} \upsilon _s&= \int _r^s A(Y)_\sigma \, \upsilon _\sigma \, + w_\sigma \, d\sigma , \end{aligned}$$

for all \(s\in [t,T]\), recalling that \(h\in {\mathscr {X}}^2\) in (60) is arbitrary. Equivalently, \(s\mapsto \upsilon _s\) is the unique solution of the IVP \(\dot{\upsilon }_s = A(Y)_s \, \upsilon _s + w_s\) for all \(s\in (t,T)\), subject to \(\upsilon _r = 0\in {\mathcal {L}}({\mathscr {X}}^2)\). By definition, \(s\mapsto A(Y)_s\) generates the two-parameter family \(U_{s,r}(Y)\), \(r,s\in [t,T]\), so that \(s\mapsto \upsilon _s = V_{s,r}(Y)\, \hat{h}\) satisfies

$$\begin{aligned} V_{s,r}(Y)\, \hat{h} = \upsilon _s&= U_{s,r}(Y)\, \upsilon _r + \int _r^s U_{s,\sigma }(Y)\, w_\sigma \, d\sigma = \int _r^s U_{s,\sigma }(Y)\, w_\sigma \, d\sigma \nonumber \\&= \int _r^s U_{s,\sigma }(Y) \, D_Y A(Y)_\sigma \, \hat{h}\, U_{\sigma ,r}(Y)\, d\sigma \end{aligned}$$
(64)

for all \(r,s\in [t,T]\), in which the third equality follows as \(\upsilon _r = V_{r,r}(Y)\, \hat{h} = 0\in {\mathcal {L}}({\mathscr {X}}^2)\), either by (60) or directly as \(V_{r,r}(Y) \doteq D_Y U_{r,r}(Y) = D_Y I = 0\). Hence, by inspection of (64), the map \(Y\mapsto V_{s,r}(Y)\doteq D_Y U_{s,r}(Y)\) is also Fréchet differentiable, with

$$\begin{aligned} D_Y V_{s,r}(Y) \, h \, \hat{h}&= \int _r^s V_{s,\sigma }(Y)\, h\, D_Y A(Y)_\sigma \, \hat{h} \, U_{\sigma ,r}(Y) \\&\quad + U_{s,\sigma }(Y) \, D_Y^2 A(Y)_\sigma \, h\, \hat{h}\, U_{\sigma ,r}(Y) + U_{s,\sigma }(Y) \, D_Y A(Y)_\sigma \, \hat{h}\, V_{\sigma ,r}(Y)\, h \, d\sigma , \end{aligned}$$

in which \(D_Y^2 A(Y)_\sigma \in {\mathcal {L}}({\mathscr {X}}^2\times {\mathscr {X}}^2;{\mathcal {L}}({\mathscr {X}}^2))\), \(\sigma \in [t,T]\), exists by (29) and (4).

\(\square \)
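
It may be helpful to note (as a remark, not as part of the argument above) that (64) is precisely the variation-of-constants (Duhamel) representation for the inhomogeneous linear initial value problem \(\dot{\upsilon }_s = A(Y)_s\, \upsilon _s + w_s\), \(\upsilon _r = 0\). Indeed, differentiating the right-hand side of (64) with respect to s, and recalling the standard evolution family properties \(U_{s,s}(Y) = I\) and \({\textstyle {{\frac{d{}}{d{s}}}}} U_{s,\sigma }(Y) = A(Y)_s\, U_{s,\sigma }(Y)\),

$$\begin{aligned} {\textstyle {{\frac{d{}}{d{s}}}}} \int _r^s U_{s,\sigma }(Y)\, w_\sigma \, d\sigma&= w_s + A(Y)_s \int _r^s U_{s,\sigma }(Y)\, w_\sigma \, d\sigma , \end{aligned}$$

so that the right-hand side of (64) satisfies the same initial value problem as \(\upsilon \), and hence coincides with it by uniqueness.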

An Auxiliary Statement of Proposition 2

Proposition 3

Given \(T\in {\mathbb {R}}_{>0}\), \(t\in [0,T)\), \(x,p\in {\mathscr {X}}\), and \((\bar{x}_s, \bar{p}_s) \doteq {\overline{X}}(Y_p(x))_s\) for all \(s\in [t,T]\), the maps \(s\mapsto {\nabla }_p \bar{J}_T(s,\bar{x}_s, \bar{p}_s)\) and \(s\mapsto {\nabla }_x \bar{J}_T(s,\bar{x}_s, \bar{p}_s)\) are continuously differentiable, with derivatives given by

$$\begin{aligned} {\textstyle {{\frac{d{}}{d{s}}}}} \left[ {\nabla }_p \bar{J}_T(s,\bar{x}_s, \bar{p}_s) \right]&= -{\mathcal {M}}^{-1} \left( \bar{p}_s - {\nabla }_x \bar{J}_T(s,\bar{x}_s,\bar{p}_s) \right) , \end{aligned}$$
(65)
$$\begin{aligned} {\textstyle {{\frac{d{}}{d{s}}}}} \left[ {\nabla }_x \bar{J}_T(s,\bar{x}_s,\bar{p}_s) \right]&= {\nabla }V(\bar{x}_s) - {\nabla }^2 V(\bar{x}_s)\, {\nabla }_p \bar{J}_T(s,\bar{x}_s, \bar{p}_s), \end{aligned}$$
(66)

for all \(s\in (t,T)\). Moreover, \(s\mapsto {\nabla }_p \bar{J}_T(s,\bar{x}_s, \bar{p}_s)\) is twice continuously differentiable, and satisfies

$$\begin{aligned} 0&= {\textstyle {{\frac{d^2{}}{d{s}^2}}}} \left[ {\nabla }_p \bar{J}_T(s,\bar{x}_s, \bar{p}_s) \right] + {\mathcal {M}}^{-1}\, {\nabla }^2 V(\bar{x}_s)\, {\nabla }_p \bar{J}_T(s,\bar{x}_s, \bar{p}_s), \end{aligned}$$
(67)

for all \(s\in (t,T)\).

Proof

Fix \(T\in {\mathbb {R}}_{>0}\), \(t\in [0,T)\), \(x,p\in {\mathscr {X}}\), and let \((\bar{x}_s, \bar{p}_s)\in {\mathscr {X}}^2\), \(s\in [t,T]\), be as per the proposition statement. Fix \(h\in {\mathscr {X}}\). Applying Proposition 1, \((s,x,p)\mapsto \bar{J}_T(s,x,p)\) is twice continuously differentiable, and the order of differentiation may be swapped. In particular,

$$\begin{aligned}&{\textstyle {{\frac{d{}}{d{s}}}}} \, [ D_p \, \bar{J}_T(s, \bar{x}_s, \bar{p}_s)\, h] \nonumber \\&= {\textstyle {{\frac{\partial {}}{\partial {s}}}}}\, [D_p \, \bar{J}_T(s,\bar{x}_s, \bar{p}_s)\, h] + D_x\, [D_p \, \bar{J}_T(s,\bar{x}_s,\bar{p}_s)\, h]\, \dot{\bar{x}}_s + D_p \, [D_p \, \bar{J}_T(s,\bar{x}_s, \bar{p}_s)\, h]\, \dot{\bar{p}}_s \nonumber \\&= \left( D_p \, {\textstyle {{\frac{\partial {}}{\partial {s}}}}}\, \bar{J}_T(s,\bar{x}_s, \bar{p}_s) + D_x\, D_p \, \bar{J}_T(s,\bar{x}_s,\bar{p}_s) \, \dot{\bar{x}}_s + D_p \, D_p \, \bar{J}_T(s,\bar{x}_s, \bar{p}_s)\, \dot{\bar{p}}_s \right) h. \end{aligned}$$
(68)

Meanwhile, \(\bar{J}_T\) satisfies (41) by Theorem 5, i.e.

$$\begin{aligned} 0&= -{\textstyle {{\frac{\partial {}}{\partial {s}}}}} \bar{J}_T(s,x,p) - {{\textstyle {\frac{1}{2}}}}\, \langle p, \, {\mathcal {M}}^{-1}\, p \rangle + V(x) + D_x \bar{J}_T(s,x,p)\, {\mathcal {M}}^{-1}\, p - D_p \bar{J}_T(s,x,p)\, {\nabla }V(x), \end{aligned}$$
(69)

for all \(s\in (t,T)\), \(x,p\in {\mathscr {X}}\). Differentiating (69) with respect to p,

$$\begin{aligned} 0&= - D_p ( {\textstyle {{\frac{\partial {}}{\partial {s}}}}} \bar{J}_T(s,x,p))\, h - \langle {\mathcal {M}}^{-1}\, p, \, h \rangle + D_p\, (D_x\, \bar{J}_T(s,x,p))\, h\, {\mathcal {M}}^{-1}\, p \\&\qquad + D_x\, \bar{J}_T(s,x,p)\, {\mathcal {M}}^{-1}\, h - D_p\, (D_p\, \bar{J}_T(s,x,p)\, {\nabla }V(x))\, h \\&= - \langle {\mathcal {M}}^{-1}\, (p - {\nabla }_x \bar{J}_T(s,x,p)), \, h \rangle \\&\qquad -\left( D_p \, {\textstyle {{\frac{\partial {}}{\partial {s}}}}} \bar{J}_T(s,x,p) - D_x\, D_p\, \bar{J}_T(s,x,p)\, {\mathcal {M}}^{-1}\, p + D_p\, D_p\, \bar{J}_T(s,x,p)\, {\nabla }V(x) \right) h. \end{aligned}$$

Evaluating along the trajectory \(s\mapsto (\bar{x}_s,\bar{p}_s)\) corresponding to \({\overline{X}}(Y_p(x))\), i.e. as per (20), yields

$$\begin{aligned}&\left( D_p \, {\textstyle {{\frac{\partial {}}{\partial {s}}}}} \bar{J}_T(s,\bar{x}_s, \bar{p}_s) + D_x\, D_p\, \bar{J}_T(s,\bar{x}_s, \bar{p}_s)\, \dot{\bar{x}}_s + D_p\, D_p\, \bar{J}_T(s,\bar{x}_s, \bar{p}_s)\, \dot{\bar{p}}_s \right) h \\&\quad = - \langle {\mathcal {M}}^{-1}\, (\bar{p}_s - {\nabla }_x \bar{J}_T(s,\bar{x}_s, \bar{p}_s)), \, h \rangle . \end{aligned}$$

Substitution in (68) subsequently yields

$$\begin{aligned} \langle {\textstyle {{\frac{d{}}{d{s}}}}} [ {\nabla }_p \bar{J}_T(s,\bar{x}_s,\bar{p}_s) ], \, h \rangle = {\textstyle {{\frac{d{}}{d{s}}}}} \, [ D_p \, \bar{J}_T(s, \bar{x}_s, \bar{p}_s)\, h ]&= - \langle {\mathcal {M}}^{-1}\, (\bar{p}_s - {\nabla }_x \bar{J}_T(s,\bar{x}_s, \bar{p}_s)), \, h \rangle . \end{aligned}$$
(70)

Recalling that \(h\in {\mathscr {X}}\) is arbitrary immediately yields (65).

Similarly, for (66), observe that

$$\begin{aligned}&{\textstyle {{\frac{d{}}{d{s}}}}} [ D_x \bar{J}_T(s,\bar{x}_s,\bar{p}_s)\, h] = {\textstyle {{\frac{\partial {}}{\partial {s}}}}} [D_x \bar{J}_T(s,\bar{x}_s,\bar{p}_s)\, h] + D_x\, [ D_x \bar{J}_T(s,\bar{x}_s,\bar{p}_s)\, h]\, \dot{\bar{x}}_s \nonumber \\&\quad + D_p\, [ D_x \bar{J}_T(s,\bar{x}_s,\bar{p}_s)\, h]\, \dot{\bar{p}}_s \nonumber \\&= \left( D_x\, {\textstyle {{\frac{\partial {}}{\partial {s}}}}} \bar{J}_T(s,\bar{x}_s,\bar{p}_s) + D_x\, D_x \bar{J}_T(s,\bar{x}_s,\bar{p}_s)\, \dot{\bar{x}}_s + D_p\, D_x \bar{J}_T(s,\bar{x}_s,\bar{p}_s)\, \dot{\bar{p}}_s \right) h. \end{aligned}$$
(71)

Differentiating (69) with respect to x,

$$\begin{aligned} 0&= -D_x ( {\textstyle {{\frac{\partial {}}{\partial {s}}}}} \bar{J}_T(s,x,p) )\, h + D_x V(x)\, h + D_x \, ( D_x \bar{J}_T(s,x,p))\, h \, {\mathcal {M}}^{-1}\, p \\&\quad - D_x \, ( D_p \bar{J}_T(s,x,p)) \, h \, {\nabla }V(x) - D_p \bar{J}_T(s,x,p)\, D_x{\nabla }V(x)\, h \\&= - \left( D_x \, {\textstyle {{\frac{\partial {}}{\partial {s}}}}} \bar{J}_T(s,x,p) + D_x\, D_x \bar{J}_T(s,x,p)\, (-{\mathcal {M}}^{-1}\, p) + D_p\, D_x \bar{J}_T(s,x,p)\, {\nabla }V(x) \right) h \\&\quad + \left( D_x V(x) - D_p \bar{J}_T(s,x,p)\, D_x {\nabla }V(x) \right) h \end{aligned}$$

Evaluating along the trajectory \(s\mapsto (\bar{x}_s,\bar{p}_s)\) corresponding to \({\overline{X}}(Y_p(x))\), i.e. as per (20), yields

$$\begin{aligned}&\left( D_x {\textstyle {{\frac{\partial {}}{\partial {s}}}}} \bar{J}_T(s,\bar{x}_s, \bar{p}_s) + D_x\, D_x \bar{J}_T(s,\bar{x}_s, \bar{p}_s)\, \dot{\bar{x}}_s + D_p\, D_x \bar{J}_T(s,\bar{x}_s, \bar{p}_s)\, \dot{\bar{p}}_s \right) h \\&\quad = \langle {\nabla }V(\bar{x}_s) - {\nabla }^2 V(\bar{x}_s)\, {\nabla }_p \bar{J}_T(s,\bar{x}_s,\bar{p}_s), \, h \rangle . \end{aligned}$$

Substitution in (71) subsequently yields

$$\begin{aligned} \langle {\textstyle {{\frac{d{}}{d{s}}}}} [ {\nabla }_x \bar{J}_T(s,\bar{x}_s,\bar{p}_s)], \, h \rangle&= \langle {\nabla }V(\bar{x}_s) - {\nabla }^2 V(\bar{x}_s)\, {\nabla }_p \bar{J}_T(s,\bar{x}_s,\bar{p}_s), \, h \rangle . \end{aligned}$$

Recalling that \(h\in {\mathscr {X}}\) is arbitrary immediately yields (66).

The remaining assertion regarding twice differentiability is immediate by inspection of (65), (66), with

$$\begin{aligned} {\textstyle {{\frac{d^2{}}{d{s}^2}}}} [ {\nabla }_p \bar{J}_T(s,\bar{x}_s,\bar{p}_s) ]&= -{\mathcal {M}}^{-1} \left( \dot{\bar{p}}_s - {{{\textstyle {{\frac{d{}}{d{s}}}}}}} [ {{{\nabla }_x}} \bar{J}_T(s,\bar{x}_s,\bar{p}_s) ] \right) \\&= -{\mathcal {M}}^{-1} \left( {\nabla }V(\bar{x}_s) - [ {\nabla }V(\bar{x}_s) - {\nabla }^2 V(\bar{x}_s) \, {\nabla }_p \bar{J}_T(s,\bar{x}_s,\bar{p}_s)] \right) \\&= -{\mathcal {M}}^{-1} \, {{{\nabla }^2 V(\bar{x}_s)}} \, {\nabla }_p \bar{J}_T(s,\bar{x}_s,\bar{p}_s), \end{aligned}$$

as required. \(\square \)
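
To connect Proposition 3 with the simple mass-spring illustration mentioned in the abstract, suppose purely for illustration that the potential is quadratic, so that \({\nabla }^2 V(\bar{x}_s) \equiv K\) for some fixed bounded self-adjoint \(K\in {\mathcal {L}}({\mathscr {X}})\); the operator \(K\) and the scalar reduction below are assumptions of this sketch rather than objects from the main text. Writing \(z_s \doteq {\nabla }_p \bar{J}_T(s,\bar{x}_s,\bar{p}_s)\), (67) reduces to the linear equation

$$\begin{aligned} 0 = \ddot{z}_s + {\mathcal {M}}^{-1}\, K\, z_s, \qquad s\in (t,T). \end{aligned}$$

In the scalar case \({\mathcal {M}} = m>0\), \(K = k>0\), its solutions are \(z_s = a\, \cos (\omega \, s) + b\, \sin (\omega \, s)\) with \(\omega = \sqrt{k/m}\) and constants \(a,b\) fixed by the boundary data, recovering the familiar oscillatory behaviour of a simple mass-spring system.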


Cite this article

Basco, V., Dower, P.M., McEneaney, W.M. et al. Exploiting Characteristics in Stationary Action Problems. Appl Math Optim 84 (Suppl 1), 733–765 (2021). https://doi.org/10.1007/s00245-021-09784-6
