| Date: | Friday, Apr. 26 |
|---|---|
| Time: | 16:00 |
| Location: | Online Call via Zoom |
Our guest speaker is Jason Y. Zhang from Carnegie Mellon University.
You are all cordially invited to the CVG Seminar on April 26th at 4 pm CEST.
Reconstructing 3D scenes and objects from images alone has been a long-standing goal in computer vision. However, typical methods require a large number of images with precisely calibrated camera poses, which is cumbersome for end users. We propose a probabilistic framework that predicts distributions over relative camera rotations. These distributions are then composed into coherent camera poses given sparse image sets. To improve precision, we then propose a diffusion-based model that represents camera poses as a distribution over rays instead of camera extrinsics. We demonstrate that our system recovers accurate camera poses from a variety of self-captures and that these poses are sufficient for high-quality 3D reconstruction.
Jason Y. Zhang is a final-year PhD student at Carnegie Mellon University, advised by Deva Ramanan and Shubham Tulsiani. Jason completed his undergraduate degree at UC Berkeley, where he worked with Jitendra Malik and Angjoo Kanazawa. He is interested in scaling single-view and multi-view 3D to unconstrained environments. Jason is supported in part by the NSF GRFP.