Ethan Evans (Georgia Institute of Technology); Andrew Kendall (Georgia Institute of Technology); Georgios Boutselis (Georgia Institute of Technology); Evangelos Theodorou (Georgia Institute of Technology)
There is rising interest in the control community in spatio-temporal systems described by Partial Differential Equations (PDEs). Not only are these systems challenging to control, but the sizing and placement of their actuation is an NP-hard problem on its own. Recent methods either discretize the space before optimization, or apply tools from linear systems theory under restrictive linearity assumptions. In this work we consider control and actuator placement as a coupled optimization problem, and derive an optimization algorithm on Hilbert spaces for nonlinear PDEs with an additive spatio-temporal description of white noise. We study first and second order systems and, in doing so, extend several results to the case of second order PDEs. The described approach is based on variational optimization, and performs joint RL-type optimization of the feedback control law and the actuator design over episodes. We demonstrate the efficacy of the proposed approach with several simulated experiments on a variety of SPDEs.
Start Time | End Time
---|---
07/15 15:00 UTC | 07/15 17:00 UTC
In this paper, the authors consider the task of jointly co-designing an optimal control policy and actuation architecture (specifically, actuator locations) for first and second order stochastic partial differential equations. They formulate the problem as an optimization on Hilbert spaces of nonlinear PDEs, and show that, through an appropriate change of measure, the problem can be tackled with a spatio-temporal reinforcement learning based approach. They demonstrate the efficacy of their method on four simulated case studies: a temperature reaching task on a 1D heat equation, a velocity reaching task on a Burgers equation, a voltage suppression task on a Nagumo equation, and an oscillation suppression task on the Euler-Bernoulli equation with Kelvin-Voigt damping.
Overall, I found the paper to be well written and well motivated, and to address an important problem. The proposed solution is theoretically sound, novel, and empirically well supported, and I believe that it would make a nice contribution to the conference. Below I have some minor suggestions that the authors may wish to take into consideration for the final submission.
In the literature review, the authors may wish to mention some of the work addressing architecture co-design for LTI systems. Relevant references include:
- Matni, Nikolai, and Venkat Chandrasekaran. "Regularization for design." IEEE Transactions on Automatic Control 61.12 (2016): 3991-4006.
- Lin, Fu, Makan Fardad, and Mihailo R. Jovanović. "Design of optimal sparse feedback gains via the alternating direction method of multipliers." IEEE Transactions on Automatic Control 58.9 (2013): 2426-2431.
In Definition III.1, I don't think that the operator \mathscr{L} is ever defined.
In Section IV, the result is not initially well motivated: why is a measure theoretic view of variational optimization useful, and why is a change of measures result the appropriate technical tool for achieving this? In fact, it was not until the final few paragraphs that it became clear that these methods were needed to translate the co-design goal into one amenable to RL techniques. I would recommend a slight reorganization of this section to help motivate the technical results being presented to the reader in terms of formalizing Eq. (1) into an actionable cost function.
My sense is that the scope of the paper is too broad. The paper contributes both fundamental theory for stochastic partial differential equations and algorithms for optimizing actuator placement and policy design. I find there is too much here, and I question whether the paper can have impact if theoretical readers gloss over the optimization algorithms while practical readers gloss over the math. I would personally suggest writing the theoretical results in a more tutorial style (the authors could submit those results to a theory journal), and then explaining clearly how the theory can improve the solution of the optimization problem. I have to say I was too lost in the technicalities of the paper to make sense of the overall result.
The paper makes a significant contribution in tackling the challenging coupled task of joint policy optimization and actuator co-design. The approach is based on a general principle from thermodynamics, which is very interesting. The paper also extends the existing formulation of the Girsanov theorem to a second order version, which is of research significance. The proposed algorithm is validated by experiments on reaching tasks and suppression tasks.