diff --git a/lecture_1/ICR.jpg b/lecture_1/ICR.jpg new file mode 100644 index 0000000..d9f6c4b Binary files /dev/null and b/lecture_1/ICR.jpg differ diff --git a/lecture_1/main.tex b/lecture_1/main.tex index edfb6dd..5928563 100644 --- a/lecture_1/main.tex +++ b/lecture_1/main.tex @@ -247,11 +247,11 @@ \subsubsection*{"Unicycle" (a.k.a. "differential drive") Model:} \subsubsection*{"Tricycle" (or simple car) Model:} -Unlike the unicycle model, this configuration type utilizes two rear standard wheels and an active steering wheel(s) in the front of the vehicle. The use of both differential drive and active steering allow the tricycle model to naturally model the kinematic constraints of a simple car\cite{sns}. +Unlike the unicycle model, this configuration type utilizes two rear standard wheels and an active steering wheel(s) in the front of the vehicle. The tricycle model also has maneuverability of two, arising from the control of its linear speed (through it rear wheels) and the steering angle \cite{sns}. Note that a tricycle (or simple car) model does not have differential drive capability, and hence this loss in mobility is compensated by allowing steering angle control. \begin{figure}[H] \centering -\includegraphics[width=0.4\textwidth]{tricycle} +\includegraphics[width=0.5\textwidth]{tricycle} \caption{Tricycle Model} \end{figure} @@ -263,9 +263,29 @@ \subsubsection*{"Tricycle" (or simple car) Model:} \begin{pmatrix} \cos \xi \cos \theta \\ \cos \xi \sin \theta \\ \sin \xi \\ 0 \end{pmatrix} u_1 + \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} u_2 \] -\textbf{Warning:} A kinematic state space model should only be interpreted as a subsystem of a more general dynamical model of the robot. +The control inputs $(u_1, u_2)$ are the linear speed and the time derivative of the steering angle respectively. + + +\subsubsection*{Understanding Constraints: Instantaneous Axis of Rotation:} + + +To understand the constraints put on different kind of robot models, we may analyze the instantaneous centre of rotation (ICR) of the robot. The constraint on lateral motion of a wheel can be represented by a \textit{zero motion line} that extends perpendicular to the plane of the wheel passing through its centre i.e. the horizontal axis. + +\begin{figure}[H] +\centering +\includegraphics[width=0.6\textwidth]{ICR} +\caption{Instantaneous Centre of Rotation (ICR) for a differential drive robot (a) can lie anywhere along the line passing through the axes of its wheels. For a car-like robot with Ackermann steering (b), the ICR lies at the intersection of the \textit{zero motion lines} of the wheels. } +\label{fig:ICR} +\end{figure} + +In Figure \ref{fig:ICR}, the differential drive robot can adjust its ICR by varying the individual speeds of its wheels. For the car-like robot, this can be achieved by changing the steering angle of the front wheels. The distance between the ICR and the rigid body is the radius of rotational motion. Note that the ICR lies at infinity when the robot is executing pure linear motion. + +Thus, we can see that despite having an extra set of lateral motion constraints, the mobility (dimensionality of the space that the robot can reach) of a car-like robot remains the same as a differential drive robot owing to the steering angle control. + + +\textbf{Warning:} A kinematic state space model should only be interpreted as a subsystem of a more general dynamical model of the robot. The robot is commanded by controlling the actual forces and torques, not linear and angular speeds. Usually we assume that there is a black box that converts desired angular and linear velocities to actual torques on each wheel. Alternatively, we could reason directly with the dynamical model, but we won't do this until later in the class. + -You control the robot by controlling actual forces and torques, not linear and angular speeds. Usually we assume that there is a black box that converts desired angular and linear velocities to actual torques on each wheel. Alternatively, we could reason directly with the dynamical model, but we won't do this until later in the class. \subsection*{Simplified car models} @@ -298,9 +318,11 @@ \subsection*{Simplified car models} $ $ -v= u_{1,max}\;\textup{cos}(\zeta_{max}),\;|w|\leq |v|\;\textup{tan}(\zeta_{max}) \rightarrow \textup{Dubin's car} +v= u_{1,max}\;\textup{cos}(\zeta_{max}),\;|w|\leq |v|\;\textup{tan}(\zeta_{max}) \rightarrow \textup{Dubins' car} $ +In the Reeds & Shepp's car model, we simplify the kinematic model by assuming that the the car can move only with a constant linear speed. The Dubins' model further restricts this motion to only forward movement (positive value of linear speed). This simplifies the task of motion planning for car-like systems \cite{lavalle}. + \section*{Further Readings and Additional Resources} \begin{itemize} @@ -334,10 +356,14 @@ \section*{Further Readings and Additional Resources} \bibitem{swed} L. Lin, H. Shih. \textit{Modeling and Adaptive Control of an Omni-Mecanum-Wheeled Robot.} \\Intelligent Control and Automation, 4, 166-179. doi: 10.4236/ica.2013.42021. +\bibitem{lavalle} +S. M. LaValle. \textit{Planning Algorithms}. Cambridge University Press, Cambridge, U.K, 2006. + + \end{thebibliography} \subsubsection*{Contributors} -Winter 2019: S. Bernardon, T. DeWeese, A. Jagesten, P. Pipatpinyopong, G. Spellman, J. Malmstrom, M. Ricks, and V. Zhang +Winter 2019: S. Bernardon, T. DeWeese, A. Jagesten, P. Pipatpinyopong, G. Spellman, J. Malmstrom, M. Ricks, V. Zhang and V. Nayak \\ Winter 2018: S. Spears, C. Covert, A. Hosler, Z. Chase, and H.M. Ewald diff --git a/lecture_3/Constrainted_Optim1.png b/lecture_3/Constrainted_Optim1.png new file mode 100644 index 0000000..ed72aac Binary files /dev/null and b/lecture_3/Constrainted_Optim1.png differ diff --git a/lecture_3/Constrainted_Optim2.png b/lecture_3/Constrainted_Optim2.png new file mode 100644 index 0000000..a0178ec Binary files /dev/null and b/lecture_3/Constrainted_Optim2.png differ diff --git a/lecture_3/MotionControl.JPG b/lecture_3/MotionControl.JPG new file mode 100644 index 0000000..d8e5887 Binary files /dev/null and b/lecture_3/MotionControl.JPG differ diff --git a/lecture_3/aa274-lecture-3-4.pdf b/lecture_3/aa274-lecture-3-4.pdf new file mode 100644 index 0000000..80d90d1 Binary files /dev/null and b/lecture_3/aa274-lecture-3-4.pdf differ diff --git a/lecture_3/inv_pend_plot.mp4 b/lecture_3/inv_pend_plot.mp4 new file mode 100644 index 0000000..f34e0c9 Binary files /dev/null and b/lecture_3/inv_pend_plot.mp4 differ diff --git a/lecture_3/inverse_pendulum.png b/lecture_3/inverse_pendulum.png new file mode 100644 index 0000000..7d4b82a Binary files /dev/null and b/lecture_3/inverse_pendulum.png differ diff --git a/lecture_3/inverted_pendulum.py b/lecture_3/inverted_pendulum.py new file mode 100644 index 0000000..827e514 --- /dev/null +++ b/lecture_3/inverted_pendulum.py @@ -0,0 +1,254 @@ +#Starter code taken from AA274 Homework 1, P1_optimal_control.py +#Data (m, I, l, c) and model taken from https://www.scitepress.org/papers/2014/50186/50186.pdf, then modified + +#Note that this code has some oddities given its use of the scipy bvp solver +#Most have to do with variable dimensions, but the code should be fairly well adaptable +#To a different solver. Read the documentation for your solver to set variable dimensions accordingly. + +#bvp solver documentation: https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.integrate.solve_bvp.html + +import numpy as np +import math +import matplotlib.pyplot as plt +from scipy.integrate import solve_bvp + +#These are just for making the animation at the end +import os +import subprocess +import matplotlib.lines as lines +from matplotlib.figure import figaspect as fasp + +'Define constants' +lam = 600.0 +m = 2.41 +g = 9.81 +l = 5.22 +I = 0.116 +c = 0.005 +'Define commonly used combinations' +denom = I + m*np.power(l,2) +mgl = m*g*l +ml = m*l + +def ode_fun(tau, zz): + ''' + This function computes the dz given tau and z. It is used in the bvp solver. + Inputs: + tau: the independent variable. This MUST be the first argument. + This is a vector of size (m,) where m is the number of time-steps you take + In our case, m = 50. See "x" in main() for more details + + zz: the time-state matrix. This is of size (n,m) where n is the number of variables you have. + bvp_solver needs ode_fun to solve for all variables at all time-steps + This devolves into writing our ode for one time-step, then looping over all steps + + Output: + dz: the time-state derivative matrix. Returns a 2D numpy array (matrix) of size (n,m). + ''' + dz = np.zeros((zz.shape)) + + for i in range(tau.shape[0]): + 'Taking one column of zz corresponds to the values of our variables at a particular time-step' + z = zz[:,i] + + ''' + Now, treat z as your state vector, and create your ode as you normally would + First, do some simple operations to avoid constant retyping. + For example, the first one takes the sin of theta. + Recall theta = z_3, which is z[2] because of python 0-indexing. + ''' + st = np.sin(z[2]) + ct = np.cos(z[2]) + u = 0.5*(z[7]*ml*ct/denom - z[5]) + + ''' + Note that this is exactly as shown in the notes. You have a z_9 out in front, + and your first term is z_2, your second is u (as defined in terms of z), etc + ''' + dz_col = z[8]*np.array([ z[1], u, z[3], (1.0/denom)*(mgl*st - c*z[3] - ml*ct*u), + 0.0,-z[4], -(z[7]/denom)*(mgl*ct+ ml*st*u),z[7]*c/denom - z[6], + 0.0]) + + ''' + Now, since this is only for one time step, we need to put this column into + its appropriate spot in our overall dz matrix. Since it's the i'th time-step, + we put it into the i'th column of dz. Then we do the whole thing all over again + for the next time step, i+1 + ''' + dz[:,i] = dz_col + + return dz + + +def bc_fun(za, zb): + ''' + This function computes boundary conditions. It is used in the bvp solver. + Inputs: + za: the state vector at the initial time + This will correspond to all bc's of the form z_i(0) + + zb: the state vector at the final time + This is all bc's of the form z_i(1) + Output: + bca: tuple of boundary conditions at initial time + bcb: tuple of boundary conditions at final time + + IMPORTANT: All boundary conditions must be written as + f(za,zb) = 0 + + Example: z_2(0) = 5 would be written as za[1] - 5 + + Also important: Need to have as many bcs as variables + ''' + + #It's helpful to define initial and final states at the start + x0 = [0.0, 0.0, 0.0, 0.0] + xf = [10.0, 0.0, 0.0, 0.0] + + #We will first solve for the bonus equation given at the end of the setup + #Note that we're using all zb's, because zb corresponds to the final time + st = np.sin(zb[2]) + ct = np.cos(zb[2]) + u = 0.5*(zb[7]*ml*ct/denom - zb[5]) + + #Note that these next 2 arrays are just p^T and y_dot as given through the z variables + p_arr = np.array([zb[4],zb[5],zb[6],zb[7]]) + x_d_arr = np.array([zb[1],u,zb[3],(1.0/denom)*(mgl*st - c*zb[3] - ml*ct*u)]) + + #u(1)^2 + p(1)^T*y_dot(1) + lambda = 0 + free_time_bonus_eqn = np.power(u,2) + np.inner(p_arr,x_d_arr) + lam + + #Note za[1]-x0[1] is equivalent to z_2(0) - 0 = 0, aka z_2(0) = 0, and so on + #The free time equation goes with bcb, since it corresponds to a bc at the end time + bca = np.array([za[0]-x0[0],za[1]-x0[1],za[2]-x0[2],za[3]-x0[3]]) + bcb = np.array([zb[0]-xf[0],zb[1]-xf[1],zb[2]-xf[2],zb[3]-xf[3], + free_time_bonus_eqn]) + + #Stack conditions horizontally to make an np array of size (n,) + g = np.hstack((bca, bcb)) + return g + +def compute_controls(z): + ''' + This function computes the controls given the solution struct z. It is used in main(). + Input: + z: the solution struct outputted by solve_bvp + Outputs: + accel: the x-ddot control for the wheel + ''' + + #z.y is the (n,m) output matrix of our solution at all time-steps + vals = z.y + + #cos of theta at all time steps + ct = np.cos(vals[2,:]) + + #use our equation for u, and use numpy's default element-wise multiplication + #to solve for u at all time-steps simultaneously. + accel = 0.5*(vals[7,:]*ml*ct/denom - vals[5,:]) + + return accel + +def main(): + ''' + This function solves the specified bvp problem and returns the corresponding optimal contol sequence + Outputs: + z: Optimal solution for all variables (struct defined in scipy bvp_solver documentation) + accel: optimal x_ddot control sequence (np array of size (m,)) + ''' + + 'First, we define our mesh, aka how fine the discretization of time is' + x = np.linspace(0,1,50) #Creates a vector of 50 equally spaced points between 0 and 1, inclusive: [0, 0.0204, 0.0408, ... , 1] + + ''' + Next, we define our initial guess. + Note that this is mostly random except for the last variable, + which corresponds to t_f, which I guessed was around 5.0 + ''' + initial_guess = np.array([5.0,1.5,-0.2,0.1,-10.0,-5.0,5.0,10.0,5.0]) + + ''' + For this bvp solver, you must have one n-length guess for each time-step in x. + Since we have 50 time-steps and 9 variables, we need to have a (9,50) matrix + Each column of y[:,i] corresponds to time-step x[i] + ''' + y = np.tile(initial_guess, (x.size,1)) #create (50,9) matrix where each row is initial_guess + y = y.T #make it into the right shape (9,50) + z = solve_bvp(ode_fun, bc_fun, x, y) #call bvp solver + accel = compute_controls(z) #compute controls from your solution + return z, accel + +if __name__ == '__main__': + z, accel = main() + + #The last variable we solved for was t_f + tf = z.y[8,0] + + #Our mesh goes from tau = (0,1). Need to scale by tf + time_mesh = z.x*tf + + unicycle_pos = z.y[0,:] + unicycle_angle = z.y[2,:] + + + ''' + #Uncomment to see plots + + plt.figure(figsize=(12, 4)) + plt.subplot(1, 3, 1) + plt.plot(time_mesh, unicycle_pos,'k-',linewidth=2) + plt.grid('on') + plt.xlabel('X') + plt.ylabel('Y') + plt.axis([0, 6, -2, 6]) + plt.title('Optimal Wheel Position') + + plt.subplot(1, 3, 2) + plt.plot(time_mesh, accel,'k-',linewidth=2) + plt.grid('on') + plt.xlabel('Time') + plt.ylabel('Acceleration') + plt.axis([0, 6, -15, 15]) + plt.title('Optimal Wheel Acceleration') + + plt.subplot(1, 3, 3) + plt.plot(time_mesh, unicycle_angle,'k-',linewidth=2) + plt.grid('on') + plt.xlabel('X') + plt.ylabel('Y') + plt.axis([0, 6, -1, 1]) + plt.title('Optimal Rod Angle') + + plt.show() + + ''' + + #Make a movie to see how the rod and wheel move over time + files = [] + pos = z.y[0,:] + ang = z.y[2,:] + rod_pos = np.vstack((pos,pos+2*l*np.cos((np.pi/2)-ang),np.ones(pos.size),1+2*l*np.sin((np.pi/2)-ang))) + + w,h = fasp(1.) + fig = plt.figure(figsize=(w,h)) + ax = fig.add_subplot(111) + for i in range(50): # 50 frames + rp = rod_pos[:,i] + plt.cla() + plt.plot(pos[i],1,'ro') + line = lines.Line2D([rp[0],rp[1]],[rp[2],rp[3]]) + ax.add_line(line) + plt.axis([-4, 14, 0.0, 18]) + fname = '_tmp%03d.png' % i + print('Saving frame', fname) + plt.savefig(fname) + files.append(fname) + + subprocess.call([ + 'ffmpeg', '-framerate', '15', '-i', '_tmp%03d.png', '-r', '30', '-pix_fmt', 'yuv420p', + 'video_name.mp4' + ]) + + for file_name in files: + os.remove(file_name) + diff --git a/lecture_3/main.pdf b/lecture_3/main.pdf index 94b92c0..d93044e 100644 Binary files a/lecture_3/main.pdf and b/lecture_3/main.pdf differ diff --git a/lecture_3/main.tex b/lecture_3/main.tex index 6e06037..348d54f 100644 --- a/lecture_3/main.tex +++ b/lecture_3/main.tex @@ -1,9 +1,10 @@ -\documentclass[twoside]{article} +\documentclass[twoside]{article} \usepackage[math]{kurier} \usepackage[sc]{mathpazo} \usepackage{amsmath} \usepackage[T1]{fontenc} +\usepackage{listings} \renewcommand{\sfdefault}{kurier} @@ -67,6 +68,9 @@ \section{Introduction} Finally, we will combine open-loop and closed-loop approaches by introducing trajectory tracking. \section{Constrained Optimization} + +\subsection{Formalization} + Before presenting the optimal control problem it is first useful to briefly review a standard constrained optimization problem given by \begin{equation} \label{const} @@ -91,27 +95,155 @@ \section{Constrained Optimization} One way to think about the necessary conditions (Eq. \ref{noc}) for optimality is as a filter that takes in all points $x$ and outputs a subset of potential optimal points $x^*$. If several potential points exist that satisfy the NOC, these could be local optima. -\section{Optimal Control Problem} +\subsection{Example} + +Let's present a constrained optimization problem as an example. \\ +Let $\mathbf{x}$ be $\mathbf{x} = (x_1, x_2)$. The constrained optimization problem is: +\begin{equation}\label{ex_optim} +\begin{array}{c} +\min_{\mathbf{x}} = x_1 + x_2 \\ +\text{subject to } x_1^2 + x_2^2 = 2 \\ +\end{array} +\end{equation} + + +In other words, we want to find $x_1$ and $x_2$ such that $x_1 + x_2$ is minimal and the point $\mathbf{x} = (x_1, x_2)$ belongs to the circle of radius $\sqrt(2)$ centered on O, as shown by the figure. + +\begin{figure}[ht] + \centering + \includegraphics[scale = 0.7]{Constrainted_Optim1.png} + \caption{A constrained Optimization problem} + \label{fig:constr_optim} +\end{figure} + +In this problem, +\[ f(\mathbf{x}) = x_1 + x_2 \] +\[ h(\mathbf{x}) = x_1^2 + x_2^2 - 2 \] +The \textit{Lagrangian }function is +\begin{equation} +\begin{array}{c c c c c} + L(\mathbf{x}, \lambda) & =& f(\mathbf{x}) & +& \lambda h(\mathbf{x}) \\ + & =& x_1 + x_2 & + & \lambda(x_1^2 + x_2^2 - 2) +\end{array} +\end{equation} + + +The necessary conditions for optimality are: + +\begin{equation} \label{noc_optim1} +\begin{array}{c c c} +\nabla_\mathbf{x} L(\mathbf{x}^*,\lambda^*) &=& 0 \\ +\nabla_\lambda L(\mathbf{x}^*,\lambda^*) &=& 0 \\ +\end{array} +\end{equation} + +Here, + +\begin{equation} \label{noc_optim_dx} +\nabla_\mathbf{x} L(\mathbf{x}^*,\lambda^*) = +\begin{bmatrix} + \frac{\partial L}{\partial x_1}(\mathbf{x}^*,\lambda^*) \\ + \frac{\partial L}{\partial x_2}(\mathbf{x}^*,\lambda^*) +\end{bmatrix} += +\begin{bmatrix} + 1 + 2 \lambda^* x_1^* \\ + 1 + 2 \lambda^* x_2^* +\end{bmatrix} += +\begin{bmatrix} + 0 \\ + 0 +\end{bmatrix} +\end{equation} + +and +\begin{equation} \label{noc_optim_dlambda} +\nabla_\lambda L(\mathbf{x}^*,\lambda^*) = +x_1^*^2 + x_2^*^2 - 2 = 0 +\end{equation} +Equation (\ref{noc_optim_dx}) gives +\begin{equation} \label{ex_sol_part1} + \begin{array}{c} + x_1^* = - \frac{1}{2 \lambda^*} \\ + x_2^* = - \frac{1}{2 \lambda^*} + \end{array} +\end{equation} + +Injecting (\ref{ex_sol_part1}) into (\ref{noc_optim_dlambda}) gives + +\begin{equation} \label{ex_sol_part2} + \begin{array}{c} + \frac{1}{4 \lambda^*^2} + \frac{1}{4 \lambda^*^2} - 2 = 0 \\ + \iff \lambda^* = \pm \frac{1}{2} + \end{array} +\end{equation} + +Injecting (\ref{ex_sol_part2}) in (\ref{ex_sol_part1}): \\ +\[x_1 + x_2 = -2 \text{ if } \lambda^* = \frac{1}{2} \] and +\[x_1 + x_2 = 2 \text{ if } \lambda^* = -\frac{1}{2} \] As a result, we choose $\lambda^* = \frac{1}{2} $ for which $x_1 + x_2$ is minimal. + +\paragraph{Conclusion:} The optimal solution of the constrained problem (\ref{ex_optim}) is: +\begin{equation}\label{ex_sol2} + \begin{array}{c} + \mathbf{x}^* = (-1, -1) + \end{array} +\end{equation} + +The solution (\ref{ex_sol2}) is presented in figure~\ref{fig:ex_sol}. + +\begin{figure}[ht] + \centering + \includegraphics[scale = 0.7]{Constrainted_Optim2.png} + \caption{Optimal solution of the constrained problem} + \label{fig:ex_sol} +\end{figure} + + + +\section{Motion Control} + +With an understanding of kinematic models for nonholonomic systems, the next step is applying those constraint relations to control the vehicle along a path from a given initial configuration to a desired final configuration. This control of a robot's movements from point A to point B is referred to as \textit{Motion Control}. + +\begin{figure}[H] +\centering +\includegraphics[width=0.4\textwidth]{MotionControl} +\caption{Simple Robot Path} +\end{figure} + +The aim of this discussion will be to provide the fundamentals of optimal control and trajectory optimization, as well as to develop an appreciation for the methods of achieving those paths via open-loop control, closed-loop control, and their combination. + +\subsection{Optimal Control Problem} + +The mathematical definition of the optimal control problem can be written as such: \begin{equation} \label{optcont} \begin{split} -\underset{u}{\text{min}} \:\: &h(x(t_f),t_f) + \int_{t_0}^{t_f} g(x(t),u(t),t) dt \\ -\text{subject to} \:\: &\dot{x}(t) = a(x(t),u(t),t) \\ -&x(t) \in \mathcal{X}, \quad u(t) \in \mathcal{U} +\underset{\mathbf{u}}{\text{min}} \:\: &h(\mathbf{x}(t_f),t_f) + \int_{t_0}^{t_f} g(\mathbf{x}(t),\mathbf{u}(t),t) dt \\ +\text{subject to} \:\: &\dot{\mathbf{x}}(t) = a(\mathbf{x}(t),\mathbf{u}(t),t) \\ +&\mathbf{x}(t) \in \mathcal{X}, \quad \mathbf{u}(t) \in \mathcal{U} \end{split} \end{equation} -In general, $x(t) \in \mathbb{•}{R}^n$, $u(t) \in \mathbb{R}^m$, and the initial condition $x(0) = x_0$ is given. In the context of this course the constraint $x(t) \in \mathcal{X}$ is typically relaxed to just be $x(t) \in \mathbb{R}^n$. In other words there are no state constraints. +In general, $\boldsymbol{x}(t)\in R^{n}$, $\boldsymbol{u}(t)\in R^{m}$, and the initial condition $\boldsymbol{x}(t_0)=\boldsymbol{x}_0$ is given. In the context of this course, the constraint $x(t) \in \mathcal{X}$ is typically relaxed to just be $\boldsymbol{x}(t)\in R^{n}$. In other words there are no state constraints. + +The first constraint in the problem defines the differential constraints that represent the system dynamics, which are typically nonlinear. + +Solving this optimal control problem involves minimizing a cost function, often denoted as $J$, which is the sum of (1) an integral of running cost that, at each time interval, penalizes the system based on the current state, $\boldsymbol{x}(t)$, and input, $\boldsymbol{u}(t)$ (i.e. we may want to minimize control effort), and (2) a terminal cost which penalizes the final state (i.e. we may want to minimize $t_f$). + +It is important to remember that the integral is simply an additive operation, so this problem may be approached using a discretization of the integral where we add immediate stage-wise costs. With this is mind, we prefer to use this generalized model for two reasons: (1) there are a number of tasks that may be modeled by additive costs (i.e. fuel consumption, control effort, etc.), and (2) it dramatically simplifies computation (linear constraint with a number of key, deep, structural results that stem from such an additive assumption). -The problem given by Eq. \ref{optcont} contains many details. In the objective function, the term $h(x(t_f),t_f)$ represents a cost associated with the end condition. For example if you wanted to minimize time you could have $h = t_f$. Also in the objective function, the integral produces a path cost, or in other words produces penalties along the way from $t_0$ to $t_f$. This would be useful if you wanted to minimize fuel use: for example you could set $g = u^2$. +For those interested in learning more about Optimal Control Theory, a recommended reference to check out is: \textit{D. K. Kirk. Optimal Control Theory: An introduction. Dover Publications, 2004.} -The first constraint in the problem defines the differential constraints that represent the system dynamics. These in general are nonlinear which is why they are represented as $a(x(t),u(t),t)$. +\subsection{Form of Optimal Control} -Problem \ref{optcont} is typically solved to yield an open-loop trajectory and control, which are denoted as $x^*(t)$ and $u^*(t)$. Note that these are entire trajectories through time, not just individual points as we saw in Eq. \ref{const}. Since the control is open loop we also typically write +There exist different forms of an optimized control function, the focus of which revolves around the use of feedback in the iteration of the aforementioned cost function. For notation's sake, if our control input $\mathbf{u}^*(t)$ represents optimality, then $\mathbf{u}^*(t)$ is the control function that minimizes the cost function in the optimization problem. -$$u^*(t) = f(x(t_0),t)$$ +If the optimal control input resulting from problem \ref{optcont} can be written in the form $\mathbf{u}^*(t) = \pi(\mathbf{x}(t),t)$, then the optimal control is said to be in closed-loop form. Alternatively, if it is of the form $\mathbf{u}^*(t) = f(\mathbf{x}(t_0),t)$, then the optimal control is said to be open-loop. -to indicate that the control is a function of time only, and does not utilize any state feedback. +There are benifits to both closed- and open-loop control. While closed-loop control is more computationally intensive, open-loop control is less robust to disturbances (i.e. a gust of wind). Often, a good compromise is is a two-step design: +$$\mathbf{u}^*(t) = u\mathbb{u_d}(t) + \pi(\mathbf{x}(t),\mathbf{x}(t) - \mathbf{x_d}(t))$$ +Here, open-loop control is first used to generate a desired trajectory, $\mathbf{x_d}$, and it's corresponding input, $\mathbf{u_d}$, and then closed-loop control is performed on a simplified trajectory tracking problem. It is important to realize the difference between solving these optimal control problems and solving general optimization problems. As mentioned, in optimal control we are solving for \textit{trajectories}, and not just points. For any given set of boundary conditions there are an infinite number of possible trajectories, and with the dynamics constraints also involved this becomes a formidable task. @@ -119,6 +251,7 @@ \section{Optimal Control Problem} + \subsection{Indirect Methods} %--------------------------Example of Indirect Methods--------------------------% @@ -147,44 +280,51 @@ \subsection{Indirect Methods} The first equation represents the derivative of the Hamiltonian with respect to your individual Lagrange Multipliers which interestingly enough gives back the kinematic state equations of the robot. The second equation is called the costate of your function - looking to optimize the function over continuous infinite time, the lagrange multipliers become functions of time. The final equation is the optimality condition which is simply the Hamiltonian differiniated with respect to the constraint producing an algebraic equation. This is \textbf{under the assumption that we have no state constraints} which is sufficient for our focus of the class. The tedious part of optimal control is now solving these equations: we have produced 2N optimization variables in $x^*(t)$ and $p^*(t)$ with one M varibale $u^*(t)$ totaling 2N + M variables alongside our 2N differinitable equations and M algebraic equation. +\subsubsection{Boundary Conditions} To solve we must impose boundary conditions on our function with respect to the robot's final position and time such that: \begin{equation} \label{hamiltonianBCs} \begin{split} -[\frac{\partial{h}}{\partial{x}}(x^*(t_f), t_f - p^*(t_f))]^T\delta{x_f}\ + [H(x^*(t_f), u^*(t_f), p^*(t_f), t_f + \frac{\partial{h}}{\partial{t}}(x^*(t_f), t_f)]\delta{x_f} = 0\\ +[\frac{\partial{h}}{\partial{x}}(x^*(t_f), t_f - p^*(t_f))]^T\delta{x_f}\ + [H(x^*(t_f), u^*(t_f), p^*(t_f), t_f) + \frac{\partial{h}}{\partial{t}}(x^*(t_f), t_f)]\delta{t_f} = 0\\ \end{split} \end{equation} +Where $h(\mathbf{x}(t_f),t_f)$ comes from the loss function, $J := h(\mathbf{x}(t_f),t_f) + \int_{t_0}^{t_f} g(\mathbf{x}(t),\mathbf{u}(t),t) dt$ and $H(x^*(t_f), u^*(t_f), p^*(t_f), t_f)$ is the Hamiltonian defined in \ref{hamiltonian} +\\ +\\ -which produces four possible solutions to satisfy the equation: - -$t_f$ = fixed and $x(t_f)$ = fixed \dots $\delta{t_f}$ = 0 and $\delta{x(t_f)}$ = 0 +This produces four possible solutions to satisfy the equation in \ref{hamiltonianBCs}: +\subsubsection{Boundary Condition Scenario 1: $t_f$ = fixed and $x(t_f)$ = fixed \dots $\delta{t_f}$ = 0 and $\delta{x(t_f)}$ = 0} +Plug these values of $\delta{t_f}$ and $\delta{x(t_f)}$ into \ref{hamiltonianBCs} $$\textbf{BC}: x^*(t_0) = x_0$$ $$x^*(t_f) = x_f$$ +If $t_f$ and $x(t_f)$ are fixed, then the equation in \ref{hamiltonianBCs} goes to $0$, and the optimal boundary conditions are simply the original constraints given to us. -$t_f$ = fixed and $x(t_f)$ = free \dots $\delta{t_f}$ = 0 and $\delta{x(t_f)}$ = arbitrary +\subsubsection{Boundary Condition Scenario 2: $t_f$ = fixed and $x(t_f)$ = free \dots $\delta{t_f}$ = 0 and $\delta{x(t_f)}$ = arbitrary } +Plug the above values of $\delta{t_f}$ and $\delta{x(t_f)}$ into \ref{hamiltonianBCs} $$\textbf{BC}: x^*(t_0) = x_0$$ $$[\frac{\partial{h}}{\partial{x}}(x^*(t_f), - p^*(t_f))] = 0$$ - - -$t_f$ = free and $x(t_f)$ = fixed \dots $\delta{t_f}$ = arbitrary and $\delta{x(t_f)}$ = 0 +Since $\delta t_f = 0$, the left hand side of \ref{hamiltonianBCs} goes to zero, so our extra boundary condition (along with the initial constraints given to us) is the right hand side of the equation without the $\delta{x_f(t_f)}$ in it, since it is simply a constant. +\subsubsection{Boundary Condition Scenario 3: $t_f$ = free and $x(t_f)$ = fixed \dots $\delta{t_f}$ = arbitrary and $\delta{x(t_f)}$ = 0 } +Plug the above values of $\delta{t_f}$ and $\delta{x(t_f)}$ into \ref{hamiltonianBCs}) $$\textbf{BC}: x^*(t_0) = x_0$$ $$x^*(t_f) = x_f$$ -$$[H(x^*(t_f), u^*(t_f), p^*(t_f), t_f + \frac{\partial{h}}{\partial{t}}(x^*(t_f), t_f)]\delta{x_f} = 0$$ - - - -$t_f$ = free and $x(t_f)$ = free \dots $\delta{t_f}$ = arbitrary and $\delta{x(t_f)}$ = arbitrary +$$[H(x^*(t_f), u^*(t_f), p^*(t_f), t_f) + \frac{\partial{h}}{\partial{t}}(x^*(t_f), t_f)] = 0$$ +Since $\delta{t_f}$ = arbitrary and $\delta{x(t_f)}$ = 0, the right hand side of \ref{hamiltonianBCs} goes to zero, so our extra boundary condition (along with the initial constraints given to us) is the left hand side of the equation without the $\delta{t_f}$ in it, since it is simply a constant. +\subsubsection{Boundary Condition Scenario 4: $t_f$ = free and $x(t_f)$ = free \dots $\delta{t_f}$ = arbitrary and $\delta{x(t_f)}$ = arbitrary } +Plug the above values of $\delta{t_f}$ and $\delta{x(t_f)}$ into \ref{hamiltonianBCs} $$\textbf{BC}: x^*(t_0) = x_0$$ $$[\frac{\partial{h}}{\partial{x}}(x^*(t_f), - p^*(t_f))] = 0$$ $$[H(x^*(t_f), u^*(t_f), p^*(t_f), t_f + \frac{\partial{h}}{\partial{t}}(x^*(t_f), t_f)]\delta{x_f} = 0$$ +Since $\delta{t_f}$ = arbitrary and $\delta{x(t_f)}$ = arbitrary, we can split up \ref{hamiltonianBCs} into two equations, where the left and right hand sides of the equation are set to zero. Once these equations are split into two, the $\delta{t_f}$ and $\delta{x(t_f)}$ constants can be ignored, leading to the two extra conditions described above, along with the original given constraint, $x^*(t_0) = x_0$ -The last scenario is what we call a free final time state and introduces an unkown variable to our system of equations making it a set of 2N + M + 1. Think of this as an optimal control problem where we look to minimize the cost of our function at any given final time. We look at a thorough example for clarification. +The last scenario ($t_f$ = free and $x(t_f)$ = free) is what we call a free final time state and introduces an unknown variable to our system of equations making it a set of 2N + M + 1. Think of this as an optimal control problem where we look to minimize the cost of our function at any given final time. We look at a thorough example for clarification. +\subsection{$t_f$ = free and $x(t_f)$ = free Constrained Optimization Example} Let's consider an example of free final time. Consider a point that moves in one dimension. Control input of the system is acceleration of the point. Boundary conditions for positions and velocities at initial and final times are given, but the final time is not fixed. \begin{equation}\label{ExampleSCAndBC}%State Constraints and Boundary Conditions @@ -273,6 +413,66 @@ \subsection{Indirect Methods} \end{equation} Then we can input these BVP and BC into Python to solve the problem. + +\subsubsection{Python Example for Indirect Methods - Problem Setup} + +Let's take a look at the unicycle model from a different perspective, and see how to solve this model in Python. We start with a side-view of a coupled rod-and-wheel system - ie a unicycle with a seat. This problem is derived from the paper by Tum et. al \cite{inverse_pendulum}. + +\begin{figure}[h!] + \centering + \includegraphics[width = \textwidth]{inverse_pendulum.png} + \caption{Inverse pendulum model \cite{inverse_pendulum}. Note that this can work for a unicycle as well, with some additional transformations on the input coordinates. We will ignore those for now and assume we have direct acceleration control of our wheel.} +\end{figure} + +The dynamics of the system \cite{inverse_pendulum} are given as follows: + +\begin{align*} + y &= \begin{bmatrix} x \\ \dot{x} \\ \theta \\ \dot{\theta} \end{bmatrix}\\ + \dot{y} &= \begin{bmatrix} + \dot{x} \\ u \\ \dot{\theta} \\ + \frac{1}{I+ml^2}(mgl\sin(\theta) - c\dot{\theta}-ml\cos(\theta)u) + \end{bmatrix} +\end{align*} + +Where $I,m,l,c$ are all known. + +Note that since this is a python example, we will spend little time deriving the model. For more information, see \cite{inverse_pendulum}. Our functional to minimize will be + +\[J = \int_0^{t_f} (600 + u^2 )dt \] + +Setting our state to be $z = [y,p,t_f]$ with time-rescaling, and using Hamiltonian techniques we get the following system: + +\begin{align*} + \frac{dz}{d\tau} &= z_9\begin{bmatrix} + z_2 \\ u \\ z_4 \\ + \frac{1}{I+ml^2}(mgl\sin(z_3) - cz_4-ml\cos(z_3)u) \\ + 0 \\ -z_5 \\ \frac{-z_8}{I+ml^2}(mgl\cos(z_3) + ml\sin(z_3)u) \\ \frac{z_8c}{I+ml^2}-z_7 \\ 0 + \end{bmatrix} \\ + u &= \frac{1}{2}(\frac{z_8ml\cos(z_3)}{I+ml^2}-p_2) +\end{align*} + +Finally, we have our boundary conditions: We want to go from a vertical position at the origin to a vertical position at 10 meters. This translates to + +\begin{align*} + z_0(0) &= z_1(0) = z_2(0) = z_3(0) = 0 \\ + z_0(1) &= 10, z_1(1) = z_2(1) = z_3(1) = 0 +\end{align*} + +We also get a bonus boundary equation from our free $t_f$: + +\[u(1)^2 + p(1)^T\dot{y}(1) + 600 = 0\] + +\subsubsection{Python Example for Indirect Methods - Code} +This code is available in the repo as ``inverted\_pendulum.py''. The crux of the code will revolve around using the scipy bvp\_solver package. There are many bvp solvers out there, but in general they all need the following things: + +\begin{enumerate} + \item An ode function ($\frac{dz}{d\tau}$) + \item A boundary condition function + \item An initial guess +\end{enumerate} + +Please view the commented code for a thorough description of the process. The file ``inv\_pend\_plot.mp4'' shows a short video of the solution to the system. + %--------------------------End of Example for Indirect Methods--------------------------% %% ------- WINTER 2019, LECTURE 4 SCRIBES, MAKE YOUR EDITS BELOW THIS LINE -------- %% @@ -566,10 +766,13 @@ \section{Trajectory Tracking} Notes from ee363 at Stanford University \textit{https://stanford.edu/class/ee363/lectures/lyap.pdf} +\bibitem{inverse_pendulum} +M. Tum, G. Gyeong, J. Park and Y. Lee, "Swing-up Control of a Single Inverted Pendulum on a Cart With Input and Output Constraints", \textit{Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics}, vol. 1, no. 1, pp. 475-482, 2014. + \end{thebibliography} \subsubsection*{Contributors} -Winter 2019: [Your Names Here] +Winter 2019: Michael Kossyrev, Zoe Ghiron, Adam \\ Winter 2018: Joseph Lorenzetti, Kenneth Hoffmann, Oriana Peltzer, Zhe Huang, William Mangram \end{document}