Elran Mizrahi, M.Sc. thesis

Object Identification, Segmentation and Dimension Extraction from RGBD Images using Deep Learning

In recent years, demand for computer vision (CV) algorithms that identify objects in 3D scenes has grown considerably. Combining deep learning with CV has improved object identification and made it applicable to complex scenes. In addition, deep learning algorithms have proven more robust than traditional CV algorithms, especially when applied to new scenes.

This work aims to take advantage of deep neural networks by applying them to a synthetic database of RGBD images (color images with a depth channel) consisting of geometric primitives that vary in size, location, orientation, and number. The synthetic database of geometric primitives was created in previous work in our lab by Gal Shitrit. The proposed network will identify and segment the objects in each scene and extract their principal dimensions.

The aim is to improve the accuracy of object identification in complex 3D scenes. It has been shown that pre-processing the input data of a neural network can improve the accuracy of its output. In this work, this improvement is pursued by adding a pre-processing stage for the input images. This stage makes it possible to create a richer feature vector for the CNN (convolutional neural network) to learn from, which can result in higher identification accuracy.
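
As an illustration of what such a pre-processing stage might look like, the sketch below appends depth-derived surface-normal channels to an RGBD image so that the network receives seven input channels instead of four. The abstract does not specify which pre-processing is used, so the choice of surface normals, the function name `add_surface_normal_channels`, and the H x W x 4 array layout are all assumptions made for this example only.

```python
import numpy as np


def add_surface_normal_channels(rgbd):
    """Append approximate surface normals to an H x W x 4 RGBD array.

    Hypothetical helper for illustration; not the thesis's actual
    pre-processing. Channels 0-2 are assumed to hold RGB and channel 3
    depth. Returns an H x W x 7 float32 array.
    """
    rgbd = np.asarray(rgbd, dtype=np.float32)
    depth = rgbd[..., 3]

    # np.gradient returns the derivative along rows (y) first,
    # then along columns (x).
    dz_dy, dz_dx = np.gradient(depth)

    # For a surface z = depth(x, y), (-dz/dx, -dz/dy, 1) is normal to it
    # in pixel units; camera intrinsics are ignored in this sketch.
    normals = np.stack([-dz_dx, -dz_dy, np.ones_like(depth)], axis=-1)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)

    # Stack the three normal components behind the original four channels.
    return np.concatenate([rgbd, normals], axis=-1)


if __name__ == "__main__":
    # A flat surface at constant depth: every normal points along +z.
    image = np.zeros((4, 5, 4), dtype=np.float32)
    image[..., 3] = 2.0
    enriched = add_surface_normal_channels(image)
    print(enriched.shape)
```

Surface normals are used here only because they make the local geometry of planes, cylinders, and similar primitives explicit in each pixel; any other depth-derived feature could be stacked in the same way.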