Exploiting Semantic Information for Accurate Segmentation, Localization in Dynamic Environments

Author: Dinesh Reddy Narapureddy
Date: 2016-06-29
Report no: IIIT/TH/2016/32
Advisor: Madhava Krishna

Abstract

Image-based reconstruction of urban environments is a challenging problem that involves optimizing a large number of variables and has several sources of error, such as the presence of dynamic objects. Since most large-scale approaches assume that the observed scene is static, dynamic objects are relegated to the noise-modelling section of such systems. This is an approach of convenience, since the RANSAC-based framework used to compute most multiview geometric quantities for static scenes naturally confines dynamic objects to the class of outlier measurements. However, reconstructing dynamic objects along with the static environment gives a complete picture of an urban environment. Such understanding can then be used for important robotic tasks like path planning for autonomous navigation, obstacle tracking and avoidance, and other tasks. While the literature has been fairly dense in the areas of scene understanding and semantic labeling, there have been few works that use motion cues to improve semantic labeling performance and vice versa. We address the problem of semantic motion segmentation and show how semantic and motion priors augment performance. We propose an algorithm that jointly infers the semantic class and motion label of an object. By integrating semantic, geometric and optical-flow-based constraints into a dense CRF model, we infer both the object class and the motion class for each pixel. We found that a fully connected CRF improves performance compared to standard clique-based CRFs. For inference, we use a Mean Field approximation based algorithm.
We also propose a system for robust SLAM that works in both static and dynamic environments. To overcome the challenge of dynamic objects in the scene, we propose a new model that incorporates semantic constraints into the reconstruction algorithm. While some of these constraints are based on multi-layered dense CRFs trained over appearance as well as motion cues, other proposed constraints can be expressed as additional terms in the bundle adjustment optimization process, which iteratively refines the 3D structure and the camera / object motion trajectories. Our method outperforms recently proposed motion detection algorithms and also improves semantic labeling compared to the state-of-the-art Automatic Labeling Environment algorithm on the challenging KITTI dataset, especially for object classes such as pedestrians and cars that are critical to an outdoor robotic navigation scenario. We show results for the accuracy of motion segmentation and for the reconstruction of the trajectory and shape of moving objects relative to ground truth. We show an average relative error reduction of 41% for moving object trajectory reconstruction relative to state-of-the-art methods like TriTrack[23], as well as relative to standard bundle adjustment algorithms with motion segmentation.
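
To make the inference step mentioned in the abstract more concrete, the following is a minimal sketch of Mean Field inference on a fully connected CRF. It is not the thesis implementation: the function name `mean_field_dense_crf`, its parameters, the single Gaussian kernel, and the Potts label compatibility are all assumptions chosen for illustration, and the pairwise term is computed naively over all pixel pairs rather than with the efficient filtering used in practice. A label here could stand for a joint (semantic class, motion class) pair, and the feature vector could stack position, appearance and optical flow.

```python
import numpy as np

def mean_field_dense_crf(unary, features, compat_weight=1.0, bandwidth=1.0, n_iters=10):
    """Naive O(N^2) mean-field inference for a fully connected CRF (illustrative only).

    unary:    (N, L) array of unary energies (lower means more likely).
    features: (N, D) array of per-pixel features (hypothetical: position, colour, flow).
    Returns an (N, L) array of approximate per-pixel label marginals.
    """
    # Gaussian pairwise kernel between every pair of pixels.
    sq_dist = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    np.fill_diagonal(kernel, 0.0)  # a pixel sends no message to itself

    def softmax(x):
        x = x - x.max(axis=1, keepdims=True)  # subtract row max for numerical stability
        e = np.exp(x)
        return e / e.sum(axis=1, keepdims=True)

    q = softmax(-unary)  # initialise beliefs from the unary term alone
    for _ in range(n_iters):
        # Message passing: kernel-weighted sum of the other pixels' beliefs.
        msg = kernel @ q  # shape (N, L)
        # Potts compatibility: the cost of label l is the kernel mass on all other labels.
        pairwise = compat_weight * (msg.sum(axis=1, keepdims=True) - msg)
        # Parallel update of all beliefs, followed by per-pixel normalisation.
        q = softmax(-unary - pairwise)
    return q
```

In this sketch, a pixel with a weak unary preference tends to adopt the label of strongly labelled pixels that are close to it in feature space, which is the smoothing behaviour a dense pairwise term is meant to provide.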

Full thesis: pdf

Centre for Robotics

IIIT Hyderabad Publications
