Sayan Deb Sarkar

I am a Computer Vision Research Engineer at Mercedes-Benz Research and Development India, where I work on driver monitoring systems in the Intelligent Interior Assist team of the MBUX Interior Assist programme for the Maybach S-Class series.

Before joining Mercedes-Benz, I spent around 1.5 years as a Research Assistant with Prof. Vincent Lepetit at the Institute of Computer Graphics and Vision, Technical University of Graz, Austria. My work focused mainly on hand pose estimation from RGB and RGB-D images. I also worked closely with Shreyas Hampali, Sinisa Stekovic and Dr. Mahdi Rad on indoor scene understanding and room layout estimation.

I am interested in computer vision, machine learning, optimization and image processing. My research focuses on the intersection of deep learning and 3D multi-view geometry, particularly where it has impact on robotic applications.

Email  /  CV  /  Google Scholar  /  Twitter  /  Github  /  LinkedIn

Recent News

Jul 2021 : HO-3D Version 3 is out. Visit our project page to download and check out the report on arXiv!
May 2021 : Moved back home! Started at Mercedes-Benz R & D India Pvt. Ltd.
Feb 2021 : Monte Carlo Scene Search for 3D Scene Understanding accepted at CVPR 2021!
Jul 2020 : Paper on 3D Room Layout Estimation accepted at ECCV 2020!
Jul 2020 : Graduated from Manipal Institute of Technology with a B.Tech in Information Technology. Thesis available here!
Jan 2020 : Moved to Graz, Austria! Started at ICG, TU Graz.

HandsFormer: Keypoint Transformer for Monocular 3D Pose Estimation of Hands and Object in Interaction
Shreyas Hampali, Sayan Deb Sarkar, Mahdi Rad, Vincent Lepetit


We propose an efficient network architecture for estimating the poses of two hands and an object during complex interactions. We also release the challenging H2O-3D dataset, which contains two hands interacting with YCB objects.

Monte Carlo Scene Search for 3D Scene Understanding
Shreyas Hampali*, Sinisa Stekovic*, Sayan Deb Sarkar, Chetan Srinivasa Kumar, Friedrich Fraundorfer, Vincent Lepetit
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021
arXiv / Project Page / Video / Code

We propose a Monte Carlo Tree Search (MCTS) based analysis-by-synthesis method to recover a complete scene (3D layout + objects) from an RGB-D scan of the environment.
*Equal contribution

General 3D Room Layout from a Single View by Render-and-Compare
Sinisa Stekovic, Shreyas Hampali, Mahdi Rad, Sayan Deb Sarkar, Friedrich Fraundorfer, Vincent Lepetit
European Conference on Computer Vision (ECCV), 2020
arXiv / Project Page / Code

We propose an analysis-by-synthesis method to estimate the 3D layout of a room (walls, floors, ceilings) from a single perspective view. The method recovers complex non-cuboid layouts by solving a constrained discrete optimization problem.

Design and code from Jon Barron's website