Employing CNNs for an end-to-end reconstruction of the indoor scenes through camera relocalization, through PoseNet, and depth estimation, through multi-scale fully convolutional network, from a single RGB image during inference and registering the 3D reconstructed patches through iterative closest point algorithm. A portion of the dataset collected during the project is also released.