@@ -1953,6 +1953,163 @@ In addition, the LQT formulation allows the resulting controller to be formalize
Moreover, the above problem formulation can be extended to iterative LQR, providing an opportunity to consider obstacle avoidance and constraints within the DMP formulation, as well as to describe costs in task space with a DMP acting in joint angle space.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{LQR with infinite time horizon}
\begin{flushright}
\filename{LQR\_infHor.*}
\end{flushright}
%https://math.stackexchange.com/questions/3119575/how-can-i-solve-the-discrete-algebraic-riccati-equations
%A LQR problem with finite time horizon $T$ and homogeneous matrices $\bm{A}$, $\bm{B}$, $\bm{Q}$ and $\bm{R}$ corresponds to the minimization problem
%\begin{equation}
% \min_{\bm{u}} \sum_{t=1}^T \bm{x}_t^\trsp \bm{Q} \bm{x}_t + \bm{u}_t^\trsp \bm{R} \bm{u}_t
% \quad\text{s.t.}\quad
% \bm{x}_{t+1} = \bm{A} \bm{x}_t + \bm{B}\bm{u}_t,
%\end{equation}
%which requires the recursive computation of
%\begin{equation}
% \bm{P}_{t\!-\!1} = \bm{A}^\trsp \bm{P}_t \bm{A}
% + \bm{Q} - \bm{A}^\trsp \bm{P}_t \bm{B} {( \bm{B}^\trsp \bm{P}_t \bm{B} + \bm{R})}^{-1} \bm{B}^\trsp \bm{P}_t \bm{A},
% \label{eq:PT}
%\end{equation}
%to find the controller
%\begin{align}
% \bm{\hat{u}}_{t\!-\!1} &= -\bm{K}_{t\!-\!1} \, \bm{x}_{t\!-\!1},\\
% \text{with}\quad
% \bm{K}_{t\!-\!1} &= {( \bm{B}^\trsp \bm{P}_t \bm{B} + \bm{R})}^{-1} \bm{B}^\trsp \bm{P}_t \bm{A}.
%\end{align}
%In this problem, $\bm{P}_t$ converges to a steady state after a few backward recursions, starting from $\bm{P}_T$.
We have seen in Section \ref{sec:LQRrecursive} that dynamic programming can be used to reduce the minimization in \eqref{eq:valueFunction} over the entire sequence of control commands $\bm{u}_{t:T-1}$ to a sequence of minimization problems over control commands at a single time step, by proceeding backwards in time. We obtained a backward recursion procedure in which $\bm{V}_t$ is evaluated recursively from $t=T-1$ to $t=1$, starting from $\bm{V}_T=\bm{Q}_T$, which corresponds to a Riccati equation, see Algorithm \ref{alg:LQRrecursive} for a summary of the overall procedure.
An infinite time horizon corresponds to $T\rightarrow\infty$, characterized by the steady state
\begin{equation}
\bm{P}_\infty = \bm{A}^\trsp \bm{P}_\infty \bm{A}
+ \bm{Q} - \bm{A}^\trsp \bm{P}_\infty \bm{B} {( \bm{B}^\trsp \bm{P}_\infty \bm{B} + \bm{R} )}^{-1} \bm{B}^\trsp \bm{P}_\infty \bm{A}.
\end{equation}
The above equation is known as the discrete algebraic Riccati equation (DARE), whose solution can also be obtained directly instead of running the backward recursion until convergence.
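As a quick sanity check, the backward Riccati recursion can be iterated until it reaches this steady state and compared against an off-the-shelf DARE solver. The sketch below is in Python with NumPy/SciPy rather than the repository's MATLAB code, and the discrete double integrator, time step, and weights are illustrative assumptions:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Illustrative discrete double integrator: state (position, velocity)
dt = 1e-2
A = np.array([[1.0, dt],
              [0.0, 1.0]])
B = np.array([[0.0],
              [dt]])
Q = np.diag([1.0, 0.0])  # penalize position error only
R = np.array([[1e-3]])   # control effort weight

# Backward Riccati recursion, iterated until it reaches the steady state
P = Q.copy()
for _ in range(10000):
    P_new = A.T @ P @ A + Q \
        - A.T @ P @ B @ np.linalg.inv(B.T @ P @ B + R) @ B.T @ P @ A
    if np.linalg.norm(P_new - P) < 1e-10:
        P = P_new
        break
    P = P_new

# Steady-state feedback gain, and cross-check with SciPy's DARE solver
K = np.linalg.inv(B.T @ P @ B + R) @ B.T @ P @ A
P_dare = solve_discrete_are(A, B, Q, R)
print(np.allclose(P, P_dare, atol=1e-6))
```

The recursion converges geometrically, at a rate governed by the magnitude of the stable closed-loop eigenvalues, so a few hundred iterations suffice for this system.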
By rewriting the cost with a Lagrangian formulation
\begin{equation}
c = \sum_{t=1}^T \bm{x}_t^\trsp \bm{Q} \bm{x}_t + \bm{u}_t^\trsp \bm{R} \bm{u}_t
+\bm{\lambda}_t (\bm{A} \bm{x}_t + \bm{B}\bm{u}_t - \bm{x}_{t+1}),
\end{equation}
we can find the minimum by differentiating the cost and setting the derivatives to zero, yielding
\begin{align}
\frac{\partial c}{\partial\bm{\lambda}_t} &= \bm{A} \bm{x}_t + \bm{B}\bm{u}_t - \bm{x}_{t+1} = \bm{0},\\
\frac{\partial c}{\partial\bm{x}_t} &= \bm{Q} \bm{x}_t + \bm{A}^\trsp \bm{\lambda}_t - \bm{\lambda}_{t-1} = \bm{0},\\
\frac{\partial c}{\partial\bm{u}_t} &= \bm{R} \bm{u}_t + \bm{B}^\trsp \bm{\lambda}_t = \bm{0},
\end{align}
which can be rewritten as
\begin{equation}
\begin{bmatrix} \bm{x}_{t+1} \\ \bm{\lambda}_t \end{bmatrix}
\!=\!
\underbrace{\begin{bmatrix} \bm{A} \!+\! \bm{B} \bm{R}^{-1} \! \bm{B}^\trsp \! \bm{A}^{-\trsp} \bm{Q} & -\bm{B} \bm{R}^{-1} \! \bm{B}^\trsp \! \bm{A}^{-\trsp} \\
-\bm{A}^{-\trsp} \bm{Q} & \bm{A}^{-\trsp} \end{bmatrix}}_{\bm{Z}} \!
\begin{bmatrix} \bm{x}_t \\ \bm{\lambda}_{t-1} \end{bmatrix},
\end{equation}
where $\bm{Z}$ is called a symplectic matrix.\footnote{For a continuous system, the same development would result in a Hamiltonian matrix.}
The above DARE problem has multiple solutions, but only one yields a positive semi-definite $\bm{P}_\infty$. The symplectic matrix $\bm{Z}$ can be decomposed as $\bm{Z}=\bm{V}\bm{D}\bm{V}^{-1}$ (for example, with an eigendecomposition). For systems such as double integrators, the eigenvalues come in reciprocal pairs, so that half of them lie inside the unit circle, which corresponds to the stable solution we are interested in.\footnote{For a continuous system, the corresponding eigenvalues have negative real parts.}
If we decompose the matrix $\bm{V}$ containing the eigenvectors as
\begin{equation}
\bm{V} = \begin{bmatrix} \bm{V}_{11} & \bm{V}_{12} \\ \bm{V}_{21} & \bm{V}_{22} \end{bmatrix},
\end{equation}
the steady state solution of the Riccati equation can be described in terms of the corresponding eigenvectors as $\bm{P}_\infty = \bm{V}_{21} \bm{V}_{11}^{-1}$.
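This eigendecomposition route can be sketched numerically as follows (Python/NumPy, not the repository's own code; the discrete double integrator and weights are illustrative assumptions). The symplectic matrix Z is assembled as above, the eigenvectors associated with eigenvalues inside the unit circle are stacked, and the steady-state solution is recovered as P = V21 V11^{-1}:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Illustrative discrete double integrator and weights
dt = 1e-2
A = np.array([[1.0, dt],
              [0.0, 1.0]])
B = np.array([[0.0],
              [dt]])
Q = np.diag([1.0, 0.0])
R = np.array([[1e-3]])

n = A.shape[0]
Ait = np.linalg.inv(A).T           # A^{-T}
BRB = B @ np.linalg.inv(R) @ B.T   # B R^{-1} B^T

# Symplectic matrix Z of the discrete-time two-point boundary problem
Z = np.block([[A + BRB @ Ait @ Q, -BRB @ Ait],
              [-Ait @ Q,          Ait]])

# Keep the eigenvectors whose eigenvalues lie inside the unit circle
d, V = np.linalg.eig(Z)
Vs = V[:, np.abs(d) < 1.0]         # stable eigenspace, n columns

# P_inf = V21 V11^{-1} (real up to numerical round-off)
P_inf = np.real(Vs[n:, :] @ np.linalg.inv(Vs[:n, :]))
print(np.allclose(P_inf, solve_discrete_are(A, B, Q, R), atol=1e-6))
```

Stacking the stable eigenvectors as $[\bm{V}_{11}; \bm{V}_{21}]$ spans the subspace of pairs $(\bm{x}, \bm{P}_\infty\bm{x})$, which is why the ratio of the two blocks recovers $\bm{P}_\infty$.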
%\subsubsection*{Continuous algebraic Riccati equation (CARE)}
%For the infinite horizon setting, the cost
%\begin{equation}
% c = \sum_{t=1}^\infty \bm{x}_t^\trsp \bm{Q} \bm{x}_t + \bm{u}_t^\trsp \bm{R} \bm{u}_t
%\end{equation}
%can be minimized with the algebraic Riccati equation (ARE)
%\begin{gather}
% \bm{A}^\trsp \bm{P} + \bm{P} \bm{A} - \bm{P} \bm{B} \bm{R}^{-1}\!\bm{B}^\trsp\!\bm{P} + \bm{Q} = \bm{0}
% \label{eq:Riccati2}
% \\
% \iff \begin{bmatrix}\bm{P} & -\bm{I} \end{bmatrix}
% \underbrace{\begin{bmatrix} \bm{A} & -\bm{\bm{B} \bm{R}^{-1}\!\bm{B}^\trsp} \\ \bm{-Q} & \bm{-A^\trsp} \end{bmatrix}}_{\bm{H}} \begin{bmatrix}\bm{I} \\ \bm{P} \end{bmatrix}
% = \bm{0} ,\label{eq:Riccati2b}
%\end{gather}
%where $\bm{H}$ is a Hamiltonian matrix.
%Under suitable hypotheses about symmetry, stabilizability and detectability, \eqref{eq:Riccati2} has a unique positive semidefinite solution $\bm{P}$, which can be obtained by several methods. %, see \cite{Laub79} for details.
%The key is to convert the problem to a stable invariant subspace problem of the Hamiltonian matrix, i.e., finding the invariant subspace corresponding to eigenvalues of $\bm{H}$ with negative real parts. This subspace can be found in several ways. The eigendecomposition approach (with ordered eigencomponents) decomposes $\bm{H}$ as
%\begin{equation}
% \bm{H} = \bm{V} \begin{bmatrix} \bm{\lambda}_1 & \bm{0} \\ \bm{0} & \bm{\lambda}_2 \end{bmatrix} \bm{V}^\trsp ,
% \quad\mathrm{with}\quad
% \bm{V} = \begin{bmatrix} \bm{V}_{11} & \bm{V}_{12} \\ \bm{V}_{21} & \bm{V}_{22} \end{bmatrix} ,
% \label{eq:HEig}
%\end{equation}
%where $\begin{bmatrix}\bm{V}_{11}\\\bm{V}_{21}\end{bmatrix}$ forms the stable eigenspace of $\bm{H}$. We have that $\bm{H}\in\mathbb{R}^{4D\times 4D}$ for a double integrator as in \eqref{eq:doubleIntegrator}, which precisely has $2D$ eigenvalues with negative real parts. Together with \eqref{eq:Riccati2b}, it provides the ARE solution
%\begin{equation}
% \bm{P} = \bm{V}_{21} \bm{V}_{11}^{-1} .
% \label{eq:PEig}
%\end{equation}
%%The eigenvector approach can be numerically unstable when the Hamiltonian matrix is close to defective, as can occur some cases in which H is close to a matrix whose Jordan form has ones above its main diagonal. In this case, the matrix $\bm{V}_1$ will be ill-conditioned. However, the ill-conditioning of $\bm{V}_1$ is independent of the conditioning of the Riccati equation.
%%%\cite{Laub79,Paige81}.
%%
%%To circumvent the ill-conditioning problems associated with a defective Hamiltonian matrix, the same stable eigenspace of Hamiltonian can be spanned by Schur vectors. % \cite{Laub79}.
%%In this case, an orthogonal transformation $\bm{U}$ is found to reduce $\bm{H}$ to real Schur form
%%\begin{equation}
%% \bm{H} = \bm{U} \begin{bmatrix} \bm{T}_1 & \bm{T}_{12} \\ \bm{0} & \bm{T}_2 \end{bmatrix} \bm{U}^\trsp ,\quad
%% \mathrm{with}\quad
%% \bm{U} = \begin{bmatrix} \bm{U}_1 & \bm{U}_{12} \\ \bm{U}_{21} & \bm{U}_2 \end{bmatrix} ,
%% \label{eq:HSchur}
%%\end{equation}
%%ordered such that the eigenvalues of $\bm{T}_1$ are stable and those of $\bm{T}_2$ are unstable. This can be achieved with additional orthogonal transformations reordering the Schur form such that the negative eigenvalues precede the non-negative eigenvalues on the diagonal of the upper triangular matrices.
%%
%%In the above, $\begin{bmatrix}\bm{U}_1\\\bm{U}_{21}\end{bmatrix}$ spans for the same stable eigenspace of $\bm{H}$ as in \eqref{eq:HEig}, providing with \eqref{eq:Riccati2b} the ARE solution
%%\begin{equation}
%% \bm{P} = \bm{U}_{21} \bm{U}_1^{-1} .
%% \label{eq:PSchur}
%%\end{equation}
%%
%%Both \eqref{eq:PEig} and \eqref{eq:PSchur} solve \eqref{eq:Riccati2} with $\bm{P}=\bm{P}^\trsp$ positive definite.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection*{Ideal damping ratio with DARE}
As an example of application, in the case of diagonal stiffness and damping matrices, it is easy to show that the stiffness-damping ratio estimated by DARE corresponds to an ideal damping ratio.
%, see for example the supplementary notes from \cite{Astrom08}.
%
%\begin{equation}
% \begin{bmatrix}\dot{e}\\\ddot{e}\end{bmatrix}
% =
% \overbrace{\begin{bmatrix}0&1\\0&0\end{bmatrix}}^{\bm{A}}
% \begin{bmatrix}e\\\dot{e}\end{bmatrix}
% +
% \overbrace{\begin{bmatrix}0\\1\end{bmatrix}}^{\bm{B}} u,
% \label{eq:doubleIntegrator}
%\end{equation}
By defining a double integrator system with a state composed of 1D position and 1D velocity, and by setting the weighting terms
\begin{equation}
\bm{Q}=\begin{bmatrix}c_1&0\\0&0\end{bmatrix}
\;,\quad R=c_2 ,
\end{equation}
with $c_1$ and $c_2$ constant scalars, the optimal feedback law found by the controller is described by $u=-\kappa^\ty{P}e-\kappa^\ty{V}\dot{e}$, with $\kappa^\ty{V}=\sqrt{2\kappa^\ty{P}}$, which corresponds to the ideal underdamped damping ratio $\zeta=\frac{\kappa^\ty{V}}{2\sqrt{\kappa^\ty{P}}} = \frac{1}{\sqrt{2}} \approx 0.7071$ in control design, optimizing the reaction speed while still allowing the system to return asymptotically to its rest position.
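This relation can be checked numerically. The sketch below (Python/SciPy) uses the continuous-time counterpart of the problem, for which kV = sqrt(2 kP) holds exactly; the weights c1 and c2 are illustrative assumptions:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Continuous-time double integrator: state (e, de), control u = acceleration
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
c1, c2 = 10.0, 1e-2        # illustrative positive weights
Q = np.diag([c1, 0.0])
R = np.array([[c2]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P   # u = -K x = -kP e - kV de
kP, kV = K[0, 0], K[0, 1]

zeta = kV / (2.0 * np.sqrt(kP))  # damping ratio
print(np.isclose(kV, np.sqrt(2.0 * kP)))     # kV = sqrt(2 kP)
print(np.isclose(zeta, 1.0 / np.sqrt(2.0)))  # zeta = 1/sqrt(2)
```

Solving the algebraic Riccati equation by hand for this system gives $\kappa^\ty{P}=\sqrt{c_1/c_2}$ and $\kappa^\ty{V}=\sqrt{2\kappa^\ty{P}}$, independently of the particular values of $c_1$ and $c_2$.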
Similarly, for multivariate problems, the stiffness and damping matrices estimated by the system automatically satisfy the relation
\begin{equation}
\bm{\hat{K}}^\ty{V} = \bm{V}(2\bm{D})^{\nicefrac{1}{2}}\bm{V}^\trsp,
\end{equation}
where $\bm{\hat{K}}^\ty{P}=\bm{V}\bm{D}\bm{V}^\trsp$ is the eigendecomposition of the stiffness matrix, with unit eigenvectors gathered in the matrix $\bm{V}$ and eigenvalues gathered in the diagonal matrix $\bm{D}$.
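The multivariate relation can likewise be verified numerically. The sketch below (again with the continuous-time solver, and under the illustrative assumptions of a 2D double integrator, an arbitrary positive-definite position weight, and an isotropic control weight R = c2 I) recovers the damping matrix from the spectral square root of the stiffness matrix:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# 2D double integrator: state (position, velocity), each in R^2
n = 2
A = np.block([[np.zeros((n, n)), np.eye(n)],
              [np.zeros((n, n)), np.zeros((n, n))]])
B = np.vstack([np.zeros((n, n)), np.eye(n)])
Qp = np.array([[3.0, 1.0],
               [1.0, 2.0]])          # illustrative SPD position weight
Q = np.block([[Qp, np.zeros((n, n))],
              [np.zeros((n, n)), np.zeros((n, n))]])
R = 1e-2 * np.eye(n)                 # isotropic control weight

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P
KP, KV = K[:, :n], K[:, n:]          # stiffness and damping matrices

# KV should equal V (2D)^{1/2} V^T, with KP = V D V^T
D, V = np.linalg.eigh(KP)
KV_pred = V @ np.diag(np.sqrt(2.0 * D)) @ V.T
print(np.allclose(KV, KV_pred, atol=1e-6))
```

With an isotropic $\bm{R}$, the problem decouples in the eigenbasis of the position weight, so each mode individually satisfies the scalar relation $\kappa^\ty{V}_i=\sqrt{2\kappa^\ty{P}_i}$.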
\newpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -12,7 +12,7 @@ function iLQR_manipulator
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
param.dt = 1E-2; %Time step length
param.nbData = 50; %Number of datapoints
-param.nbIter = 300; %Maximum number of iterations for iLQR
+param.nbMaxIter = 100; %Maximum number of iterations for iLQR
param.nbPoints = 2; %Number of viapoints
param.nbVarX = 3; %State space dimension (x1,x2,x3)
param.nbVarU = 3; %Control space dimension (dx1,dx2,dx3)
@@ -48,7 +48,7 @@ Su0 = [zeros(param.nbVarX, param.nbVarX*(param.nbData-1)); tril(kron(ones(param.
Sx0 = kron(ones(param.nbData,1), eye(param.nbVarX));
Su = Su0(idx,:);
-for n=1:param.nbIter
+for n=1:param.nbMaxIter
x = reshape(Su0 * u + Sx0 * x0, param.nbVarX, param.nbData); %System evolution
[f, J] = f_reach(x(:,tl), param); %Residuals and Jacobians
du = (Su' * J' * Q * J * Su + R) \ (-Su' * J' * Q * f(:) - u * param.r); %Gauss-Newton update