Deterministic Optimal Control#

An optimal control problem deals with finding a set of controls for a dynamical system over a period of time (a horizon) [Wikipediacontributors23]. The goal of the problem is encoded in an objective function. By “optimal” we mean minimizing (or maximizing) this objective function with respect to the controls.

On most robots, a control policy can be run over and over and we expect the same result. Even though the system has some uncertainty or noise, it does not dominate the behavior. This is what we mean by “deterministic”. In contrast, “stochastic optimal control” deals with problems where noise and uncertainty significantly affect the state transitions.

Continuous Time#

The continuous time optimal control problem is described as

\[\begin{split} \begin{aligned} & \min _{x(t), u(t)} \underbrace{J(x(t), u(t))}_{\begin{array}{c} \text { objective function} \\ \end{array}}=\min _{x(t), u(t)} \int_{t_0}^{t_f} \underbrace{\mathcal{L}(x(t), u(t))}_{\text {stage cost}}\, dt+\underbrace{\mathcal{L}_F(x(t_f))}_{\text {terminal cost}} \\ & \quad \text { s.t. } \\ & \qquad \dot{x}(t)=\underbrace{f(x(t), u(t))}_{\text {dynamics }} \end{aligned} \end{split}\]

The solution to this is an open-loop trajectory of states \(x(t)\) and controls \(u(t)\).
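As a concrete (and very common) choice, the stage and terminal costs are often quadratic, penalizing deviation from a goal state \(x_g\) and control effort. This particular form is an illustrative example, not the only option:

\[\begin{split} \begin{aligned} \mathcal{L}(x(t), u(t)) &= \left(x(t)-x_g\right)^{\top} Q \left(x(t)-x_g\right) + u(t)^{\top} R\, u(t) \\ \mathcal{L}_F(x(t_f)) &= \left(x(t_f)-x_g\right)^{\top} Q_F \left(x(t_f)-x_g\right) \end{aligned} \end{split}\]

where \(t_f\) is the final time and \(Q\), \(R\), \(Q_F\) are positive (semi)definite weight matrices that trade off state error against control effort.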

Now, running an uncertain system with an open-loop trajectory and no tracking controller is not a good idea: compounding error will cause the system to deviate from the trajectory. Some optimal control algorithms instead produce a trajectory of feedback policies (controllers). For now we focus on describing the problem itself rather than the details of tracking a trajectory (we cover this later).

There are a few optimal control problems with analytic solutions; however, we focus on solving discrete-time problems that are tractable on modern computers.

Discrete Time#

\[\begin{split} \begin{aligned} &\min _{\substack{x_{1 : N} \\ u_{1 : N-1}}} J\left(x_{1: N}, u_{1: N-1}\right)=\min _{\substack{x_{1 : N} \\ u_{1 : N-1}}} \sum_{k=1}^{N-1} \left[ \mathcal{L}\left(x_k, u_k\right)\right] +\mathcal{L}_F\left(x_N\right) \\ & \quad \text{s.t.} \\ & \qquad x_{k+1}=f\left(x_k, u_k\right) \qquad \text{(dynamics constraints)}\\ & \qquad u_{\text{min},k} \leq u_k \leq u_{\text{max},k} \qquad \text{(control limits)}\\ & \qquad c_k\left(x_k\right) \leq 0 \qquad \text{(state constraints)} \end{aligned} \end{split}\]

Some notes about this:

  • The problem is now discrete and finite-dimensional.

  • The sample points \(x_k\) and \(u_k\) are often called “knot points”.

  • The discrete state-transition function \(f(x_k, u_k)\) is different from the continuous-time dynamics \(f(x(t), u(t))\).

  • \(f(x_k, u_k)\) is usually obtained by applying a discrete integrator (e.g., a Runge-Kutta method) to the continuous dynamics.

  • We can convert the discrete trajectory back to continuous time with interpolation.
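To make the discretization concrete, here is a minimal sketch in Python with NumPy. The pendulum model, step size, and weight matrices are illustrative assumptions, not from the text above. It shows how a discrete transition \(f(x_k, u_k)\) can be obtained from continuous dynamics with one RK4 step (with a zero-order hold on the control), and how the discrete objective is evaluated over the knot points:

```python
import numpy as np

def pendulum_dynamics(x, u):
    """Continuous-time dynamics x_dot = f(x, u) for a simple pendulum.

    State x = [theta, theta_dot]; control u = [torque].
    Parameters (mass, length, gravity) are assumed values for illustration.
    """
    g, l, m = 9.81, 1.0, 1.0
    theta, theta_dot = x
    theta_ddot = (u[0] - m * g * l * np.sin(theta)) / (m * l**2)
    return np.array([theta_dot, theta_ddot])

def rk4_step(f, x_k, u_k, h):
    """Discrete transition x_{k+1} = f(x_k, u_k): one RK4 step of size h,
    holding the control constant over the step (zero-order hold)."""
    k1 = f(x_k, u_k)
    k2 = f(x_k + 0.5 * h * k1, u_k)
    k3 = f(x_k + 0.5 * h * k2, u_k)
    k4 = f(x_k + h * k3, u_k)
    return x_k + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def total_cost(xs, us, Q, R, Qf, x_goal):
    """Discrete objective: sum of quadratic stage costs over k = 1..N-1
    plus a quadratic terminal cost on x_N."""
    J = 0.0
    for x_k, u_k in zip(xs[:-1], us):
        dx = x_k - x_goal
        J += dx @ Q @ dx + u_k @ R @ u_k
    dxN = xs[-1] - x_goal
    return J + dxN @ Qf @ dxN

# Roll out N knot points under zero control (an uncontrolled trajectory,
# just to evaluate the objective; no optimization is performed here).
N, h = 50, 0.02
xs = [np.array([0.5, 0.0])]
us = [np.zeros(1)] * (N - 1)
for u_k in us:
    xs.append(rk4_step(pendulum_dynamics, xs[-1], u_k, h))
J = total_cost(xs, us, np.eye(2), 0.1 * np.eye(1), 10.0 * np.eye(2), np.zeros(2))
```

An optimal control solver would search over the knot points `xs` and `us` to minimize `J` subject to the dynamics constraint enforced by `rk4_step`; this sketch only shows the building blocks of the discrete problem.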