*Valentin Peretroukhin, Matthew Giamou, David M. Rosen, W. Nicholas Greene, Nicholas Roy, Jonathan Kelly*

Accurate rotation estimation is at the heart of robot perception tasks such as visual odometry and object pose estimation. Deep neural networks have provided a new way to perform these tasks, and the choice of rotation representation is an important part of network design. In this work, we present a novel symmetric matrix representation of the 3D rotation group, SO(3), with two important properties that make it particularly suitable for learned models: (1) it satisfies a smoothness property that improves convergence and generalization when regressing large rotation targets, and (2) it encodes a symmetric Bingham belief over the space of unit quaternions, permitting the training of uncertainty-aware models. We empirically validate the benefits of our formulation by training deep neural rotation regressors on two data modalities. First, we use synthetic point-cloud data to show that our representation leads to superior predictive accuracy over existing representations for arbitrary rotation targets. Second, we use image data collected onboard ground and aerial vehicles to demonstrate that our representation is amenable to an effective out-of-distribution (OOD) rejection technique that significantly improves the robustness of rotation estimates to unseen environmental effects and corrupted input images, without requiring the use of an explicit likelihood loss, stochastic sampling, or an auxiliary classifier. This capability is key for safety-critical applications where detecting novel inputs can prevent catastrophic failure of learned models.

Start Time | End Time | |
---|---|---|

07/14 15:00 UTC | 07/14 17:00 UTC |

The representation of rotations via continuous differentiable forms is essential for pose estimation methods based on learnable differentiable maps, such as deep neural networks. This paper provides a valuable contribution by providing a method for learning rotations which should be easily implementable with any modern deep learning framework. Theoretical results and empirical evidence strengthen the contribution. The paper, however, has a few issues, which I highlight below. Major issues: - A clearer distinction and discussion of advantages/disadvantages of the proposed approach with respect to the work in [42] is needed. Despite the simplicity of Problem 3 and its solution when compared to the formulations in [42, Sec. 4.2], the empirical gains seem to be quite marginal in the experiments to justify a 10D rather than 6D representation for rotations in SO(3). Minor issues: - There are a few acronyms and math symbols which should be defined before their first mention in the text. Examples (and my guess): QCQP (quadratically constrained quadratic program), SO(n) (special orthogonal group), \mathbb{S}^4 (real symmetric matrices?), S^3 (3D sphere), \mathbb{RP}^n (real projection), the Log and norm in Eq. 22, which is ambiguous, etc. Despite some of these terms and notation being common in some specific fields, RSS is still a general robotics conference with a broad audience. A paragraph defining the main mathematical spaces and the associated notation, as in [42, Sec. 3], should suffice. - The claim "most approaches to regressing rotations using data-driven models are unable to effectively model uncertainty" is slightly strong and lacks a citation. - In Eq. 25/26, "f" as defined by Problem 3 should map to a rotation matrix C^*, not the corresponding quaternion q^*. - A comparison in terms of measured computation time is missing. It should clarify whether the extra computations required to solve Problem 3 add too much overhead when compared to the method proposed by [42]. Other details: - Some sentences are quite long, extending for more than 4 lines, and should be broken up/revised. - Why not "\mathbf{I}" instead of "\mathbf{1}" to denote the identity matrix? The latter symbol causes confusion with a vector/matrix of ones.

This paper is very well-written. It builds upon the work in reference [42] significantly with the technical novelties being the connection to the Bingham distribution for uncertainty estimation and out-of-distribution detection. The experiments are impressive. Some minor comments: 1. It is a bit hard to judge how difficult it is to train the matrix A(theta) and how well it generalizes across datasets which would be a highly desirable quality for a rotation estimation method. 2. Perhaps the authors could also consider incorporating this approach in a typical visual-inertial-odometry (VIO) system

The authors present a novel rotation estimation model for robot perception tasks. They claim that their model improves convergence on large rotation targets, it is singularity-free and it is robust against uncertainties or corrupted images, providing experiments with synthetic and real data. They are going to publish the code with the paper for reproducibility. Minor comments: - It would be interesting to explain a little bit more how the method avoids singularities - Quadratically- Constrained Quadratic Program (QCQP) is mentioned for the firt time in page 3, but QCQP appears in previous pages.