Optimal Control

👤 Sébastien Boisgérault

Control Engineering with Python

Symbols

๐Ÿ Code ๐Ÿ” Worked Example
๐Ÿ“ˆ Graph ๐Ÿงฉ Exercise
๐Ÿท๏ธ Definition ๐Ÿ’ป Numerical Method
๐Ÿ’Ž Theorem ๐Ÿงฎ Analytical Method
๐Ÿ“ Remark ๐Ÿง  Theory
โ„น๏ธ Information ๐Ÿ—๏ธ Hint
โš ๏ธ Warning ๐Ÿ”“ Solution

๐Ÿ Imports

from numpy import *
from numpy.linalg import *
from matplotlib.pyplot import *
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_are

Why Optimal Control?

Limitations of Pole Assignment

  • It is not always obvious what set of poles we should target (especially for large systems),

  • We do not explicitly control the trade-off between "speed of convergence" and "intensity of the control" (large input values may be costly or impossible).

Let

\[\dot{x} = A x + Bu\]

where

  • \(A \in \mathbb{R}^{n\times n}\), \(B \in \mathbb{R}^{n\times m}\) and

  • \(x(0) = x_0 \in \mathbb{R}^n\) is given.

Find \(u(t)\) that minimizes

\[ J = \int_0^{+\infty} x(t)^t Q x(t) + u(t)^t R u(t) \, dt \]

where:

  • \(Q \in \mathbb{R}^{n \times n}\) and \(R \in \mathbb{R}^{m\times m}\),

  • \(Q\) and \(R\) are symmetric (\(R^t = R\) and \(Q^t = Q\)),

  • \(Q\) and \(R\) are positive definite (denoted "\(>0\)")

    \[x^t Q x \geq 0 \, \mbox{ and } \, x^t Q x = 0 \, \mbox{ iff }\, x=0\]

    and

    \[u^t R u \geq 0 \, \mbox{ and } \, u^t R u = 0 \, \mbox{ iff }\, u=0.\]
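Positive definiteness lends itself to a direct numerical check: a symmetric matrix is positive definite if and only if all its eigenvalues are (strictly) positive. A minimal sketch (the matrix below is an arbitrary example, not taken from the text):

```python
from numpy import array
from numpy.linalg import eigvalsh

Q = array([[2.0, 1.0], [1.0, 2.0]])

assert (Q == Q.T).all()         # Q is symmetric
assert (eigvalsh(Q) > 0).all()  # all eigenvalues > 0, hence Q > 0
```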

Heuristics / Scalar Case

If \(x \in \mathbb{R}\) and \(u \in \mathbb{R}\),

\[ J = \int_0^{+\infty} q x(t)^2 + r u(t)^2 \, dt \]

with \(q > 0\) and \(r > 0\).

When we minimize \(J\):

  • Only the relative values of \(q\) and \(r\) matter.

  • Large values of \(q\) strongly penalize non-zero states:

    \(\Rightarrow\) fast convergence.

  • Large values of \(r\) strongly penalize non-zero inputs:

    \(\Rightarrow\) small input values.

Heuristics / Vector Case

If \(x \in \mathbb{R}^n\) and \(u \in \mathbb{R}^m\) and \(Q\) and \(R\) are diagonal,

\[ Q = \mathrm{diag}(q_1, \cdots, q_n), \; R=\mathrm{diag}(r_1, \cdots, r_m), \]

\[ J = \int_0^{+\infty} \sum_{i} q_i x_i(t)^2 + \sum_j r_j u_j(t)^2 \, dt \]

with \(q_i > 0\) and \(r_j > 0\).

Thus we can control the cost of each component of \(x\) and \(u\) independently.
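As a small illustration (a sketch, not part of the original text, with arbitrary weights and values), with diagonal \(Q\) and \(R\) the running cost is simply a weighted sum of squares, which can be checked directly:

```python
from numpy import array, diag, isclose

q = [1.0, 10.0]          # state weights (x_2 penalized 10x more than x_1)
r = [2.0]                # input weight
Q, R = diag(q), diag(r)

x = array([0.5, -1.0])   # an arbitrary state
u = array([3.0])         # an arbitrary input

# Matrix form x^t Q x + u^t R u ...
j_matrix = x @ Q @ x + u @ R @ u
# ... equals the componentwise sum of q_i x_i^2 + r_j u_j^2.
j_sum = sum(qi * xi**2 for qi, xi in zip(q, x)) \
      + sum(rj * uj**2 for rj, uj in zip(r, u))
assert isclose(j_matrix, j_sum)
```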

💎 Optimal Solution

Assume that \(\dot{x} = A x + Bu\) is controllable.

  • There is an optimal solution; it is a linear feedback

    \[u = - K x\]

  • The closed-loop dynamics is asymptotically stable.

💎 Algebraic Riccati Equation

  • The gain matrix \(K\) is given by

    \[ K = R^{-1} B^t \Pi, \]

    where \(\Pi \in \mathbb{R}^{n \times n}\) is the unique matrix such that \(\Pi^t = \Pi\), \(\Pi > 0\) and

    \[ \Pi B R^{-1} B^t \Pi - \Pi A - A^t \Pi - Q = 0. \]

๐Ÿ” Optimal Control

Consider the double integrator \(\ddot{x} = u\)

\[ \frac{d}{dt} \left[\begin{array}{c} x \\ \dot{x} \end{array}\right] = \left[\begin{array}{cc} 0 & 1 \\ 0 & 0\end{array}\right] \left[\begin{array}{c} x \\ \dot{x} \end{array}\right] + \left[\begin{array}{c} 0 \\ 1 \end{array}\right] u \]

(in standard form)

๐Ÿ Problem Data

A = array([[0, 1], [0, 0]])
B = array([[0], [1]])
Q = array([[1, 0], [0, 1]])
R = array([[1]])

๐Ÿ Optimal Gain

Pi = solve_continuous_are(A, B, Q, R)
K = inv(R) @ B.T @ Pi
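As a sanity check (a sketch, not part of the original slides), we can verify numerically that the matrix \(\Pi\) returned by `solve_continuous_are` is symmetric, positive definite, and satisfies the algebraic Riccati equation of the theorem above:

```python
from numpy import array, allclose, zeros
from numpy.linalg import inv, eigvalsh
from scipy.linalg import solve_continuous_are

A = array([[0, 1], [0, 0]])
B = array([[0], [1]])
Q = array([[1, 0], [0, 1]])
R = array([[1]])

Pi = solve_continuous_are(A, B, Q, R)
assert allclose(Pi, Pi.T)        # Pi is symmetric
assert (eigvalsh(Pi) > 0).all()  # Pi > 0
# Residual of Pi B R^{-1} B^t Pi - Pi A - A^t Pi - Q = 0:
residual = Pi @ B @ inv(R) @ B.T @ Pi - Pi @ A - A.T @ Pi - Q
assert allclose(residual, zeros((2, 2)))
```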

๐Ÿ Closed-Loop Behavior

The closed-loop dynamics is asymptotically stable:

eigenvalues, _ = eig(A - B @ K)
assert all([real(s) < 0 for s in eigenvalues])

📈 Eigenvalues Location

figure()
x = [real(s) for s in eigenvalues]
y = [imag(s) for s in eigenvalues]
plot(x, y, "kx")
grid(True)
title("Eigenvalues")
axis("square")
axis([-5, 5, -5, 5])

๐Ÿ Simulation

y0 = [1.0, 1.0]
def f(t, x):
    return (A - B @ K) @ x

๐Ÿ Simulation

result = solve_ivp(
  f, t_span=[0, 10], y0=y0, max_step=0.1
)
t = result["t"]
x1 = result["y"][0]
x2 = result["y"][1]
u = - (K @ result["y"]).flatten() # (1, N) array -> (N,): scalar input at each time

📈 Input & State Evolution

figure()
plot(t, x1, label="$x_1$")
plot(t, x2, label="$x_2$")
plot(t, u, label="$u$")
xlabel("$t$")
grid(True)
legend(loc="lower right")

๐Ÿ Optimal Gain

Q = array([[10, 0], [0, 10]])
R = array([[1]])
Pi = solve_continuous_are(A, B, Q, R)
K = inv(R) @ B.T @ Pi

๐Ÿ Closed-Loop Asymp. Stab.

eigenvalues, _ = eig(A - B @ K)
assert all([real(s) < 0 for s in eigenvalues])

📈 Eigenvalues Location

figure()
x = [real(s) for s in eigenvalues]
y = [imag(s) for s in eigenvalues]
plot(x, y, "kx")

๐Ÿ Simulation

result = solve_ivp(
  f, t_span=[0, 10], y0=y0, max_step=0.1
)
t = result["t"]
x1 = result["y"][0]
x2 = result["y"][1]
u = - (K @ result["y"]).flatten() # (1, N) array -> (N,): scalar input at each time

📈 Input & State Evolution

figure()
plot(t, x1, label="$x_1$")
plot(t, x2, label="$x_2$")
plot(t, u, label="$u$")
xlabel("$t$")
grid(True)
legend(loc="lower right")

๐Ÿ Optimal Gain

Q = array([[1, 0], [0, 1]])
R = array([[10]])
Pi = solve_continuous_are(A, B, Q, R)
K = inv(R) @ B.T @ Pi

๐Ÿ Closed-Loop Asymp. Stab.

eigenvalues, _ = eig(A - B @ K)
assert all([real(s) < 0 for s in eigenvalues])

📈 Eigenvalues Location

figure()
x = [real(s) for s in eigenvalues]
y = [imag(s) for s in eigenvalues]
plot(x, y, "kx")

๐Ÿ Simulation

result = solve_ivp(
  f, t_span=[0, 10], y0=y0, max_step=0.1
)
t = result["t"]
x1 = result["y"][0]
x2 = result["y"][1]
u = - (K @ result["y"]).flatten() # (1, N) array -> (N,): scalar input at each time

📈 Input & State Evolution

figure()
plot(t, x1, label="$x_1$")
plot(t, x2, label="$x_2$")
plot(t, u, label="$u$")
xlabel("$t$")
grid(True)
legend(loc="lower right")
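The three designs above differ only in the weights. As a recap (a small comparison script, not in the original slides), we can recompute the optimal gain for each pair \((Q, R)\) and compare the dominant closed-loop eigenvalue: heavier state weights push the poles further left (faster convergence), while heavier input weights pull them toward the imaginary axis (gentler control):

```python
from numpy import array, eye
from numpy.linalg import inv, eig
from scipy.linalg import solve_continuous_are

A = array([[0.0, 1.0], [0.0, 0.0]])
B = array([[0.0], [1.0]])

# (label, Q, R): baseline, state-heavy, input-heavy.
designs = [
    ("Q = I,   R = 1 ", eye(2), array([[1.0]])),
    ("Q = 10I, R = 1 ", 10 * eye(2), array([[1.0]])),
    ("Q = I,   R = 10", eye(2), array([[10.0]])),
]

for label, Q, R in designs:
    Pi = solve_continuous_are(A, B, Q, R)
    K = inv(R) @ B.T @ Pi
    poles, _ = eig(A - B @ K)
    print(label, "-> dominant pole real part:", max(p.real for p in poles))
```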

🧩 Optimal Value

Consider the controllable dynamics

\[ \dot{x} = A x + Bu \]

and \(u(t)\) the control that minimizes

\[ J = \int_{0}^{+\infty} x(t)^t Q x(t) + u(t)^t R u(t) \, dt. \]

1. 🧮

Let

\[ j(x, u) := x^tQ x + u^t R u. \]

Show that

\[ j(x(t), u(t)) = - \frac{d}{dt} x(t)^t \Pi x(t) \]

where \(\Pi\) is the solution of the algebraic Riccati equation introduced above.

2. 🧮

What is the value of \(J\)?

🔓 Optimal Value

1. 🔓

We know that \(u = -Kx\) where \(K = R^{-1} B^t \Pi\) and \(\Pi\) is a symmetric solution of

\[ \Pi BR^{-1} B^t \Pi - \Pi A - A^t \Pi - Q = 0. \]

Since \(R\) is symmetric,

\[ \Pi BR^{-1} B^t \Pi = \Pi B(R^{-1})^tR R^{-1} B^t \Pi = K^t R K \]

and thus \[\Pi A + A^t \Pi = K^t R K - Q.\]

Since \(\dot{x} = (A - B K ) x\),

\[ \begin{split} \frac{d}{dt} x^t \Pi x &= x^t (\Pi (A - BK) + (A - BK)^t \Pi) x \\ & = x^t (\Pi A + A^t \Pi - \Pi BK - (BK)^t \Pi) x \\ & = x^t ( K^t R K - Q - K^t R K - K^t R K) x \\ & = x^t (- Q - K^t R K ) x \\ & = - x^t Q x - u^t R u \\ & = -j(x, u). \end{split} \]

2. 🔓

Since the system is controllable, the optimal control makes the origin of the closed-loop system asymptotically stable. Consequently, \(x(t) \to 0\) as \(t \to +\infty\). Hence,

\[ \begin{split} J &= \int_0^{+\infty} j(x, u) \, dt \\ &= - \int_0^{+\infty} \frac{d}{dt} x^t \Pi x \, dt \\ &= - \left[x^t \Pi x\right]_0^{+\infty} \\ & = x(0)^t \Pi x(0). \end{split} \]
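This result lends itself to a numerical check (a sketch under the double-integrator setup used earlier, not part of the original exercise): simulate the closed-loop system while integrating the running cost \(j(x, u)\) along the trajectory, then compare the accumulated cost with \(x(0)^t \Pi x(0)\).

```python
from numpy import array, isclose
from numpy.linalg import inv
from scipy.integrate import solve_ivp
from scipy.linalg import solve_continuous_are

A = array([[0.0, 1.0], [0.0, 0.0]])
B = array([[0.0], [1.0]])
Q = array([[1.0, 0.0], [0.0, 1.0]])
R = array([[1.0]])
Pi = solve_continuous_are(A, B, Q, R)
K = inv(R) @ B.T @ Pi

x0 = array([1.0, 1.0])

def f(t, y):
    # Augmented state: x (2 components) plus the accumulated cost.
    x = y[:2]
    u = -K @ x
    dx = A @ x + B @ u
    dj = x @ Q @ x + u @ R @ u  # running cost j(x, u)
    return [dx[0], dx[1], dj]

result = solve_ivp(f, t_span=[0.0, 20.0], y0=[*x0, 0.0], rtol=1e-10)
J_numerical = result["y"][2][-1]   # integral of j along the trajectory
J_predicted = x0 @ Pi @ x0         # x(0)^t Pi x(0)
assert isclose(J_numerical, J_predicted, rtol=1e-3)
```

The horizon of 20 seconds stands in for \(+\infty\); by then the state has essentially vanished, so the truncated tail of the integral is negligible.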