Neuralangelo: Unleashing the digital Michelangelo from your smartphone
Until recently, 3D surface reconstruction has been a relatively slow, painstaking process involving considerable trial and error as well as manual input. But what if you could take a video of an object or scene with your smartphone and have an algorithm turn it into a detailed 3D model?
A joint project by researchers in the Whiting School of Engineering's Department of Computer Science and tech giant NVIDIA, this high-fidelity neural surface reconstruction algorithm can precisely render the shapes of everyday objects, famous statues, familiar buildings, and entire environments from nothing more than smartphone video or drone footage, with no extra input necessary. The findings have been presented on the preprint server arXiv.
The algorithms that power virtual reality environments, autonomous robot navigation, and smart operating rooms all share one fundamental requirement: they need to process and accurately interpret information from the real world to work correctly. That understanding comes from 3D surface reconstruction, in which an algorithm combines multiple 2D images taken from different viewpoints to reconstruct real-life environments as 3D models that other programs can recognize and manipulate.
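To give a sense of the geometric principle behind "multiple 2D images from different viewpoints," the sketch below shows the simplest possible version of the idea: recovering a single 3D point from its pixel coordinates in two calibrated views using linear triangulation. This is only an illustration of the classical multi-view concept described above, not Neuralangelo's method, which reconstructs entire surfaces with neural representations rather than point-by-point triangulation; the camera parameters and points here are invented for the example.

```python
import numpy as np

def triangulate(proj_mats, points_2d):
    """Recover a 3D point from its 2D projections in several views (linear DLT).

    proj_mats : list of 3x4 camera projection matrices
    points_2d : list of (u, v) pixel observations, one per view
    """
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        # Each observation adds two linear constraints on the homogeneous
        # 3D point X: u * (P[2] @ X) = P[0] @ X, and likewise for v.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector for the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # de-homogenize

if __name__ == "__main__":
    # Two toy pinhole cameras observing the same point from different positions.
    K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])               # camera at the origin
    P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])   # camera shifted along x

    X_true = np.array([0.2, -0.1, 4.0, 1.0])
    observations = []
    for P in (P1, P2):
        x = P @ X_true
        observations.append((x[0] / x[2], x[1] / x[2]))

    print(triangulate([P1, P2], observations))  # ~ [0.2, -0.1, 4.0]
```

With many such points and many views, the same principle scales up to full scenes; neural approaches like Neuralangelo instead optimize a continuous surface representation so that its rendered views match the input frames.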
Zhaoshuo "Max" Li, who earned a master's degree in computer science from the Whiting School in 2019 and a Ph.D. in computer science in 2023, initiated the Neuralangelo project during a summer 2022 internship at NVIDIA, where he is now a research scientist. His goal was not only to enhance existing 3D reconstruction techniques but also to make them accessible to anyone with a smartphone.
"How can we acquire the same understanding as humans of a 3D environment by using cheaply available videos, thereby making this technology accessible to everyone?" he asked.
From left to right, Michelangelo’s sculpture, “David,” Neuralangelo’s normal map, and 3D mesh surface output. Credit: NVIDIA