Peng Yin (CMU), Lingyun Xu (CMU), Ji Zhang (CMU), Howie Choset (CMU), Sebastian Scherer (CMU) |
Paper #027 |
Interactive Poster Session V | Interactive Poster Session VIII |
We present a method for localizing a single camera with respect to a point cloud map in indoor and outdoor scenes. The problem is challenging because correspondences of local invariant features are inconsistent across the image and 3D domains, and even more so because the method must handle varying environmental conditions such as illumination, weather, and seasonal changes. Our method matches equirectangular images to 3D range projections by extracting cross-domain symmetric place descriptors. Our key insight is to retain condition-invariant 3D geometry features from limited data samples while eliminating condition-related features with a purpose-designed Generative Adversarial Network. Based on these features, we further design a spherical convolution network to learn viewpoint-invariant symmetric place descriptors. We evaluate our method on extensive self-collected datasets covering Long-term (varying appearance conditions), Large-scale (up to 2 km of structured/unstructured environment), and Multistory (four-floor confined space) scenarios. Our method surpasses current state-of-the-art methods, achieving around 3 times higher place retrieval in inconsistent environments and more than 3 times the accuracy on online localization. To highlight our method’s generalization capabilities, we also evaluate recognition across different datasets. With a single trained model, i3dLoc can achieve reliable visual localization under random conditions and viewpoints.
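To make the notion of a "3D range projection" concrete, the sketch below shows one common way to render a point cloud as an equirectangular range image, so that it shares a spherical layout with an equirectangular camera image. It is an illustration under stated assumptions, not the authors' implementation: the function name `range_projection`, the image resolution, and the nearest-point rule for each pixel are all hypothetical choices made for this example.

```python
import math


def range_projection(points, height=64, width=128):
    """Project 3D points (x, y, z), given in the sensor frame, onto an
    equirectangular range image (hypothetical helper for illustration).

    Each pixel stores the distance to the nearest point that falls into it;
    pixels that receive no point stay at 0.0.
    """
    image = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # the sensor origin has no defined viewing direction
        azimuth = math.atan2(y, x)  # in [-pi, pi]
        # Clamp guards against tiny floating-point overshoot outside [-1, 1].
        elevation = math.asin(max(-1.0, min(1.0, z / r)))  # in [-pi/2, pi/2]
        # Azimuth maps to columns (wrapping around), elevation to rows
        # (row 0 corresponds to looking straight up).
        col = int((azimuth + math.pi) / (2.0 * math.pi) * width) % width
        row = min(int((math.pi / 2.0 - elevation) / math.pi * height), height - 1)
        # Keep the closest return per pixel (assumed occlusion rule).
        if image[row][col] == 0.0 or r < image[row][col]:
            image[row][col] = r
    return image


if __name__ == "__main__":
    cloud = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 0.0, 3.0)]
    img = range_projection(cloud)
    print(img[32][64], img[0][64])
```

In this sketch the two points along the x-axis land in the same pixel and only the nearer one is kept, which mimics occlusion in a real range sensor. A learned descriptor network would take such an image as input; that part is outside the scope of this example.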