IIIT Hyderabad Publications
Monocular Multilayer Layout Estimation for Warehouses

Author: Anurag Sahu
Date: 2023-03-10
Report no: IIIT/TH/2023/13
Advisor: Madhava Krishna

Abstract

With the booming online market, managing warehouse inventory is one of the most essential and challenging tasks. Inventory management becomes more efficient when it is automated using robots. Robots can work faster than humans, operate at a constant speed without breaks, and repeat tasks such as fetching inventory from the warehouse more often than humans can. But to perform tasks such as placing objects in racks, fetching objects from racks, and reorganising racks to make more space, robots need an understanding of the warehouse environment. To gain this understanding, a robot needs to create a 3D map of the warehouse that captures the space in which it can move and identifies the racks and boxes, so that it is able to plan the execution of its tasks. Towards solving this problem, this thesis addresses the problem of free-space estimation for rack shelves. Given a monocular RGB image captured from a camera mounted on a robotic arm, we aim to predict the top-view and front-view layouts so as to create a 3D reconstruction of the rack and the objects present in the image. We propose a simple yet effective network architecture, RackLay, which takes a monocular RGB image as input and outputs the top-view and front-view layouts of all the shelves comprising the rack visible in the image. The network can learn two kinds of layout representations: one in the canonical frame centred on the shelf, called the shelf-centric layout, and the other in a frame with respect to the camera, called the ego-centric layout. Apart from portraying the versatility of the network, these representations lend themselves to various useful applications.
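As a rough illustration of how a top-view and a front-view layout of one shelf could be combined into a coarse 3D estimate, the sketch below intersects the two views extruded along their missing axes and reports the fraction of free space. This is not the RackLay method described in the thesis; the function names `layouts_to_occupancy` and `free_space_fraction`, the binary-mask representation, and the grid sizes are all assumptions made for the example.

```python
import numpy as np


def layouts_to_occupancy(top_view: np.ndarray, front_view: np.ndarray) -> np.ndarray:
    """Combine two binary layouts of one shelf into a coarse 3D occupancy grid.

    Assumed (hypothetical) conventions:
      top_view   has shape (depth, width),  True where an object is seen from above
      front_view has shape (height, width), True where an object is seen from the front
    Returns a boolean grid of shape (height, depth, width).
    """
    top = np.asarray(top_view, dtype=bool)
    front = np.asarray(front_view, dtype=bool)
    if top.ndim != 2 or front.ndim != 2:
        raise ValueError("both layouts must be 2D masks")
    if top.shape[1] != front.shape[1]:
        raise ValueError("top and front views must share the same width")
    # A cell is marked occupied only if both views agree: the front view is
    # extruded along depth, the top view along height, and the two are ANDed.
    return front[:, None, :] & top[None, :, :]


def free_space_fraction(occupancy: np.ndarray) -> float:
    """Fraction of grid cells in the shelf volume that are not occupied."""
    return 1.0 - float(occupancy.mean())


# Toy shelf: 6 cells wide, 4 cells deep, 3 cells high, holding a single box.
top_view = np.zeros((4, 6), dtype=bool)
top_view[0:3, 1:3] = True      # box spans 3 depth cells and 2 width cells

front_view = np.zeros((3, 6), dtype=bool)
front_view[1:3, 1:3] = True    # box spans 2 height cells and the same 2 width cells

occupancy = layouts_to_occupancy(top_view, front_view)
free_fraction = free_space_fraction(occupancy)
```

Intersecting extruded views over-estimates occupancy when several objects share a shelf, since it cannot tell which top-view footprint belongs to which front-view silhouette; a per-object pairing of the two layouts would be needed for anything beyond a coarse estimate.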
Since there are very few publicly available datasets for warehouse settings, we also introduce a synthetic data generation pipeline, termed WareSynth, which can be used to generate 3D warehouse scenes, automate the process of data capture, and generate the corresponding annotations. WareSynth can be used for various tasks such as 2D/3D object detection, semantic/instance segmentation, layout estimation, 3D scene navigation and mapping, and 3D reconstruction. The same pipeline can also be adapted to other kinds of scenes, such as supermarkets and greenhouses, by changing the database of objects and the placement parameters; hence this pipeline opens the gates for further research in similar environments.

Full thesis: pdf

Centre for Robotics
Copyright © 2009 - IIIT Hyderabad. All Rights Reserved.