Learning of Sub-optimal Gait Controllers for Magnetic Walking Soft Millirobots

Utku Culha, Sinan Ozgun Demir, Sebastian Trimpe, Metin Sitti

Abstract

Untethered small-scale soft robots have promising applications in minimally invasive surgery, targeted drug delivery, and bioengineering, as they can access confined spaces in the human body. However, due to highly nonlinear soft continuum deformation kinematics, inherent variability during fabrication at the miniature scale, and the lack of accurate models, conventional control methods cannot be easily applied. Adaptivity of the robot control is additionally crucial for medical operations, as operation environments show large variability and robot materials may degrade or change over time, which would impair the robot's motion and task performance. In this work, we propose a probabilistic learning approach for millimeter-scale magnetic walking soft robots using Bayesian optimization (BO) and Gaussian processes (GPs). Our approach provides a data-efficient learning scheme to find controller parameters while optimizing the stride length performance of the walking soft millirobot. We demonstrate adaptation to fabrication variabilities and different walking surfaces by adapting our controller learning system to three robots within a small number of physical experiments.

Live Paper Discussion Information

	Start Time	End Time
	07/16 15:00 UTC	07/16 17:00 UTC


Paper Reviews

Review 2

The work has an excellent motivation, a meaningful problem statement, and a clear goal. The scope of the work is clear, and it is refreshing to read a paper which so clearly articulates what exactly is being studied (and what aspects of the work are not being claimed as novel). There is a clear need for this type of result in the micro-robotics community. This data-driven approach nicely complements the recent state-of-the-art advances in microrobotic fabrication and control, which are primarily guided by physics-based models. This work studies whether a data-driven approach could yield a better controller for one particular walking gait of a flexible magnetic sheet. The authors wisely choose to study the exact device already published in several works, which allows them to make fair comparisons in a way that is meaningful and helpful to the community.

I think this type of approach should be adopted as an additional tool for the micro-robotics community, which (unlike most sub-fields in robotics) has thus far mostly avoided the use of ML techniques for design and control. The authors have accurately captured the micro-robotics-specific data collection challenge here, and so I see this work as being very valuable for the community. While a physics-based approach will likely continue to play a dominant role in micro-robotics research, publications such as this one will help the community evaluate the value of data-driven approaches.

The primary thesis of this work is that, due to a large search space (four controller parameters over a continuum of values), it is not feasible to experimentally evaluate the entire search space to find the optimal control inputs. Logically, this motivates the use of Bayesian optimization. However, the paper does not explicitly test the thesis, which would require the authors to test a "brute force" search over the search space in a random or systematic manner (with the same 20x3=60 experiments). I presume that doing so would result in a poorer stride length than the BO method, but it would allow you to more directly claim the success of the method.

Comments on the paper:

- A reader of this paper likely needs to be familiar with Bayesian optimization and Gaussian process methods to understand the paper fully, because the methods are only explained briefly. Given that the most valuable target audience of this work may be the micro-roboticists who thus far have shied away from data-driven methods, those people may not have the background needed to read the paper. My suggestion for this paper to have maximum impact would be to include some additional basic descriptions of the algorithms used to "hold the hand" of the reader in section III.
- The primary result of the paper is not stated in the abstract.
- The authors interchangeably use the terms millirobot and miniature robot. It might be clearer to choose one.
- On page 3, it is stated that there are 203,520 possible parameter sets. This is an arbitrary number dependent on the step sizes, which are not stated. Can you make a more disciplined argument for a particular step size based on the expected sensitivity to each parameter in the observed data?
- The results of Table 1 show that robot 2 gets a better stride length without the prior! This is counter to the claims made throughout the discussion section, which claim that the prior always helps.
- Figure 7: is the vertical axis the stride length, or the improvement in stride length?
- The Conclusions section of the paper summarizes and restates claims that are already made elsewhere. It does not add value to the paper and should be removed.
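
Two of the points in Review 2 can be made concrete with a short Python sketch: how the number of candidate parameter sets depends on the chosen step sizes, and what a random-search baseline with the same 60-experiment budget could look like. This is only an illustration under stated assumptions. The α1 and α2 ranges are the ones quoted in Review 3; the other two parameter ranges, all step sizes, and the objective `toy_stride_length` are hypothetical placeholders and do not come from the paper.

```python
import math
import random


def grid_size(ranges, steps):
    """Count the points of a regular grid with inclusive endpoints."""
    total = 1
    for (lo, hi), step in zip(ranges, steps):
        # Small epsilon guards against floating-point round-off in the division.
        total *= int(math.floor((hi - lo) / step + 1e-9)) + 1
    return total


def random_search(objective, ranges, budget, seed=0):
    """Evaluate `budget` uniformly sampled parameter sets and keep the best one."""
    rng = random.Random(seed)
    best_params, best_value = None, -math.inf
    for _ in range(budget):
        params = [rng.uniform(lo, hi) for lo, hi in ranges]
        value = objective(params)
        if value > best_value:
            best_params, best_value = params, value
    return best_params, best_value


# alpha1 and alpha2 ranges (degrees) as quoted in Review 3; the last two
# ranges are normalized HYPOTHETICAL placeholders for the remaining parameters.
RANGES = [(10.0, 50.0), (40.0, 80.0), (0.0, 1.0), (0.0, 1.0)]


def toy_stride_length(params):
    """HYPOTHETICAL stand-in for a physical experiment (not the robot model)."""
    target = [30.0, 60.0, 0.5, 0.5]
    return -sum(
        ((p - t) / (hi - lo)) ** 2
        for p, t, (lo, hi) in zip(params, target, RANGES)
    )


# Hypothetical step sizes: the grid count changes by orders of magnitude.
coarse = grid_size(RANGES, [5.0, 5.0, 0.1, 0.1])
fine = grid_size(RANGES, [1.0, 1.0, 0.05, 0.05])

# Random-search baseline with the same budget as the 20x3=60 experiments.
best_params, best_value = random_search(toy_stride_length, RANGES, budget=60)

print(coarse, fine)
print(best_params, best_value)
```

With these placeholder step sizes the grid grows from 9,801 to 741,321 points, which shows why a count such as 203,520 is only meaningful once the step sizes are stated. On real hardware, `toy_stride_length` would be replaced by a measured stride length, and the best value found by random search could then be compared against the BO result obtained with the same budget.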

Review 3

## 1. General feel

The authors adopt a method for manufacturing millimeter-scale "millirobots" that are activated using an external field, and then apply Bayesian Optimization to determine a suitable parameter set from a search space containing four free variables. The motivation seems reasonable, namely to improve the locomotive efficiency of such a robot. However, I question the approach and the results.

Major comments:

1. Why Bayesian Optimization? There are numerous other possible algorithms for optimizing robot gaits, including "Intelligent Trial-and-Error" [1], genetic algorithms [2, 3], policy gradients [4], and many others throughout various areas of robotics.
2. Why is your data so noisy? The paper you are using as your benchmark (reference [27 of submission 1301], specifically the most relevant section on walking, P 49-54 of the supplementary information, but also for other gait classes) has much lower noise than e.g. your Figs. 4, 5, 7.
   - Related, you report many values as "X +/- Y" without stating the width of this confidence interval. Is that one SD? 95% CI? etc.
   - Related, did you report the benchmarks' actual mean and variance, or are the reported "benchmarks" the values obtained on your hardware with their parameters (by assumption, you are referring to the parameters in Fig. S13 on Page 49 of the supplementary information of [27 of submission 1301])?
3. The data noise puts into further question the utility of the reported results. Combining the points in items 2 and 3 above, it is unclear what p-values you use for drawing conclusions. Thus, even though you might actually have presented significant improvements and simply have issues with your specific hardware manufacturing setup, it is unclear how others can benefit from and build upon these reported results.

## 2. Technical merit, etc.

I'll reference the text as p[page in manuscript = PDF page -1].[lines]

p1 The claim "safe human, which are hard to achieve using conventional rigid materials" is misleading. Collision avoidance is a well-developed field, including quite impressive results even dating back to, for one of many examples, 2012 [5].

p1 "However, these controllers typically depend on the continuous sensing of symmetric body deformations and computationally heavy model solutions" ... what does this mean?

p1 The tasks mentioned at the end of the second paragraph are not unique to medical robotics. "Soft mobile robots targeting medical applications have further constraints such as the dynamic task environment, complex deformation kinematics, fabrication-dependent performance variations, and actuation/sensing limitations, which require adaptive and data-efficient control methods [14]."

p2 How much does restricting alpha affect the results? This could be an interesting study: finding the marginal contribution of each parameter, and which ranges are useful.

p3 "In addition to the virtual infinite degrees of freedom inherited by the soft materials, the controller parameters existed in a continuous space, making an exhaustive manual search using physical experiments impractical." This is nonsense. Doing grid search or some guided binary search should reduce the number of experiments to a manageable number. This should be compared to the proposed GP method. Furthermore, design of experiments is a well-developed field, and might provide further insight. Not to mention other methods like PSO, gradient descent, etc., that could be adapted, in addition to the algorithms mentioned in my comments above (references [1,2,3,4]).

p3 "The magnetic soft millirobots in our paper do not have an inverse kinematic (see Fig. 2b) or dynamics model that would allow us to run a systematic numerical analysis to find the control parameters for the optimum stride length performance." While this is partially true, it is severely downplaying the extensive characterization, theoretical analysis (definitely incomplete, but still more than the authors of submission 1301 would make it seem) and data available in [27 of submission 1301].

p3 "Therefore, following the arguments in [27], α1 and α2 are limited to [10 - 50]° and [40 - 80]° respectively." What arguments? State briefly.

p3-4 Please explain what the primes ' and asterisks * mean when applied to your variables. I interpret ' as transpose, but subtracting \theta^' from \theta results in a dimension mismatch for a vector \theta, and previously you say it is describing a 1-D case.

p4 "The center of the uniform magnetic field coincides with the center of the test environment and has a size of 40 mm³" Is it 40 mm edge length, or 40 mm³ total volume?

p4 If you colored the x'es with the same color scheme (maybe with a black outline), it might make it easier to interpret the error. Currently it is unclear how effective the regression is in this parameter plane.

p5 The experimental design seems unjustified/random. Why 18 trials, then 38, then 17, then 50 for each treatment?

p6 Table 1: Are these variances plus minus 40, or variances *of the mean*? 0.4 seems lower than what your data suggest (e.g. Figs. 4, 5).

p7 "Also, the kinematic models of the small-scale robots can be improved by utilizing the constant curvature (CC) approximations [10] and finite element analysis (FEA) methods [11]." What about the analytic solutions presented in [27 of submission 1301]?

[1] A. Cully, J. Clune, D. Tarapore, and J.-B. Mouret, “Robots that can adapt like animals,” Nature, vol. 521, no. 7553, pp. 503–507, May 2015, doi: 10.1038/nature14422.
[2] S. Kriegman, S. Walker, D. Shah, M. Levin, R. Kramer-Bottiglio, and J. Bongard, “Automated shapeshifting for function recovery in damaged robots,” in Robotics: Science and Systems, Freiburg im Breisgau, Germany, 2019.
[3] C. Paul, F. J. Valero-Cuevas, and H. Lipson, “Design and control of tensegrity robots for locomotion,” IEEE Transactions on Robotics, vol. 22, no. 5, pp. 944–957, Oct. 2006, doi: 10.1109/TRO.2006.878980.
[4] F. Sehnke, C. Osendorfer, T. Rückstieß, A. Graves, J. Peters, and J. Schmidhuber, “Parameter-exploring policy gradients,” Neural Networks, vol. 23, no. 4, pp. 551–559, May 2010, doi: 10.1016/j.neunet.2009.12.004.
[5] F. Flacco, T. Kröger, A. D. Luca, and O. Khatib, “A depth space approach to human-robot collision avoidance,” in 2012 IEEE International Conference on Robotics and Automation, 2012, pp. 338–345, doi: 10.1109/ICRA.2012.6225245.

## 3. Comments on Multimedia (Videos, etc.)

+ scale bar
+ consistent sizing and viewing angle
- lighting (video 1 is appropriate, Video 2 is darker, video 3 is very dark)
- None of the video speeds make sense, which diminishes the value of the multimedia. Video 1 has different speeds, and it is unclear why each speed was chosen. For example, why speed up the top row? It already appears jumpy, and speeding it up just makes it look more jumpy. For clarity, I recommend putting all gaits on the same 1x speed, since none of the speed adjustments appear justified. The difference in speed between video 1 and 2 makes it confusing to compare "optimal" with the others. Video 2 should have all robots at the same speed in order for the comparison to be more meaningful. Same comment for Video 3. Why did you slow down the parameters which were optimized for the smooth surface? It seems that if you're trying to compare the parameters that were optimized for smooth vs. rough, on the same rough surface, you'd put both at the same speed.
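
The ambiguity raised in major comment 2 of Review 3 (what "X +/- Y" stands for) can be illustrated with a small Python sketch showing how much the reported Y differs depending on whether it is a standard deviation, a standard error of the mean, or a 95% confidence half-width. The sample values below are made up for illustration and are not measurements from the paper, and the interval uses a normal approximation as a simplifying assumption; a Student-t quantile would be more appropriate for small samples.

```python
import math
import statistics


def summarize(samples):
    """Return the mean and three common choices for the '+/-' value."""
    n = len(samples)
    mean = statistics.fmean(samples)
    sd = statistics.stdev(samples)          # sample standard deviation (n - 1)
    sem = sd / math.sqrt(n)                 # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96, normal approximation
    return {"mean": mean, "sd": sd, "sem": sem, "ci95_half_width": z * sem}


# HYPOTHETICAL stride-length measurements in mm (illustrative only).
HYPOTHETICAL_STRIDES_MM = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

stats = summarize(HYPOTHETICAL_STRIDES_MM)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

For the same data, the three candidate "+/-" values differ by a factor that depends on the number of trials, which is why stating the convention alongside each reported value matters when comparing against a benchmark.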