IIIT Hyderabad Publications |
Learning Effective Navigational Strategies for Active Monocular Simultaneous Localization and Mapping

Author: Vignesh Prasad
Date: 2017-06-30
Report no: IIIT/TH/2017/36
Advisor: Madhava Krishna

Abstract

Simultaneous Localization and Mapping (SLAM) refers to the problem of mapping the unknown environment in which a robot is operating while, at the same time, localizing the robot within that environment. Among the various ways of performing SLAM, using a single monocular camera as the sole sensory input is highly preferred owing to its simplicity and low power consumption. Range sensors such as laser range finders and depth cameras require much more power to operate, and SLAM with them is more computationally intensive than SLAM with a single camera. However, compared to trajectory planning with depth-based SLAM, monocular SLAM in the loop does need additional considerations. One main reason is that, for a robust optimization of the map and robot trajectory, carried out via Bundle Adjustment (BA) in most monocular SLAM methods, the SLAM system needs to scan an area for a reasonable duration to gather enough information to improve the map and pose estimates. Additionally, owing to the way monocular SLAM methods work, they do not tolerate large camera rotations between successive views and tend to break down. Other causes of monocular SLAM failure include ambiguities in the decomposition of the Essential Matrix, feature-sparse scenes, and additional layers of non-linear optimization beyond BA. Learning a complex task such as low-level robot manoeuvres that prevent monocular SLAM failure is challenging for both robots and humans, and the data-driven identification of basic motion strategies for preventing monocular SLAM failure is a largely unexplored problem.
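The Essential Matrix ambiguity mentioned above is a standard fact of two-view geometry: E = [t]x R decomposes into two candidate rotations and two translation signs, and only a subsequent cheirality check (points in front of both cameras) selects the physical pose. A minimal sketch of the four-candidate decomposition, using the standard SVD-based construction (this is an illustration of the general technique, not code from the thesis):

```python
import numpy as np

def decompose_essential(E):
    """Return the four candidate (R, t) pairs for an essential matrix E.

    E = [t]x R is defined only up to scale and sign, which yields two
    rotations (R1, R2) and two translation signs (+t, -t); a cheirality
    check is needed afterwards to pick the physically valid pair.
    """
    U, _, Vt = np.linalg.svd(E)
    # Flip signs so both factors are proper rotations (det = +1); this only
    # changes the sign of E, which is already ambiguous.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]  # translation direction, up to sign and scale
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```

In a noise-free case the true rotation appears exactly in the candidate set, and the true translation direction matches one of ±t; with noisy correspondences the ambiguity is what makes the disambiguation step fragile in feature-sparse scenes.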
In this thesis, a computational model is devised for representing and inferring strategies for this problem, formulated as a Markov Decision Process (MDP) in which the reward function models both the goal of the task and information about the strategy. Reinforcement Learning (RL) is used with an intuitive, handcrafted reward function to generate fail-safe trajectories along which the SLAM-generated outputs (scene structure and camera motion) do not deviate largely from their true values. This model is then expanded upon by treating it as an expert and learning an underlying true reward function for the task at hand using Inverse Reinforcement Learning (IRL). Quintessentially, the framework successfully learns the otherwise complex relation between motor actions and perceptual inputs, a relation almost intractable to capture in an explicit mathematical formulation, and uses this knowledge to generate trajectories that do not cause SLAM failure. The framework also allows one to identify how a few chosen parameters affect the quality of the monocular SLAM estimates. The estimated reward function was able to capture the expert demonstration information and the inherent expert strategy, and the obtained reward structure admits an intuitive explanation. Drastic improvements in the quality of the SLAM map and trajectory are systematically shown in simulations. Better performance is demonstrated over methods based on maximizing scene overlap, over supervised-learning-based methods, and over state-of-the-art methods in the literature that select the next view from a candidate set of views according to the best score under SLAM quality measures. The proposed method scales effectively across various SLAM frameworks in real-world experiments as well as in simulations with a mobile robot.

Full thesis: pdf
Centre for Robotics
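The abstract's central idea, a reward that encodes both task progress and a strategy for avoiding SLAM-breaking motions, can be illustrated with a toy sketch. Everything below (state, action set, turn magnitudes, penalty values, the `ROT_LIMIT` threshold) is a hypothetical simplification for illustration, not the thesis's actual MDP:

```python
# Toy illustration: state = heading error toward the goal (radians);
# actions = discrete turn commands. The handcrafted reward rewards progress
# (reducing heading error) and penalises large inter-frame rotations, which
# the abstract identifies as a common cause of monocular SLAM breakdown.
# All magnitudes and penalties here are illustrative assumptions.

ACTIONS = {
    "forward": 0.0,
    "small_left": 0.1, "small_right": -0.1,
    "large_left": 0.6, "large_right": -0.6,
}
ROT_LIMIT = 0.3  # rad/step beyond which SLAM is assumed to degrade

def reward(heading_error, action):
    turn = ACTIONS[action]
    r = -abs(heading_error - turn)  # progress term: reduce heading error
    if abs(turn) > ROT_LIMIT:
        r -= 5.0                    # strategy term: avoid SLAM-breaking turns
    return r

def greedy(heading_error):
    """One-step greedy policy under the handcrafted reward."""
    return max(ACTIONS, key=lambda a: reward(heading_error, a))
```

The point of the sketch: even when a single large turn would cancel the heading error exactly (e.g. an error of 0.6 rad), the penalty makes the greedy policy prefer a sequence of small rotations, mirroring the kind of SLAM-preserving manoeuvre strategy the learned policies are meant to capture.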