Explore Google's latest work in generative AI that requires no training on prior 3D model data

Ben Wodecki, Jr. Editor

October 5, 2022

Google DreamFusion generates 3D objects with high-fidelity appearance and depth. Credit: Google

Researchers from Google are the latest to unveil a generative AI tool capable of turning text prompts into digital 3D representations.

Dubbed DreamFusion, the AI-powered tool generates 3D models from text inputs.

DreamFusion is an expanded version of Dream Fields, a generative 3D system Google unveiled back in 2021. This latest release, however, requires no training on 3D data, meaning DreamFusion can generate 3D representations of objects without a dataset of 3D models.

Instead, the system relies on Imagen, a text-to-image diffusion model trained on 2D images, to guide how the object it is generating should look from different perspectives.

According to Google’s AI researchers, the resulting 3D model “can be viewed from any angle, relit by arbitrary illumination, or composited into any 3D environment.”

“Given a caption, DreamFusion generates relightable 3D objects with high-fidelity appearance, depth, and normal,” according to a breakdown of the project.

DreamFusion: How does it work?

Google’s team proposed the concept of Score Distillation Sampling (SDS) – a way of generating samples from a diffusion model by optimizing a loss function.
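For readers who want the loss in symbols, the SDS gradient can be summarized as below. The notation follows the DreamFusion paper and is not defined in this article: θ stands for the 3D scene parameters, g for the differentiable renderer, x for the rendered image, y for the text prompt, ε for Gaussian noise, ε̂ for the frozen diffusion model's noise prediction and w(t) for a timestep weighting. Treat it as a summary rather than a derivation.

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}\bigl(\phi,\, x = g(\theta)\bigr)
  \triangleq \mathbb{E}_{t,\epsilon}\!\left[
    w(t)\,\bigl(\hat{\epsilon}_\phi(z_t;\, y,\, t) - \epsilon\bigr)\,
    \frac{\partial x}{\partial \theta}
  \right],
\qquad z_t = \alpha_t\, x + \sigma_t\, \epsilon
```

In words: add noise to a rendered view, ask the diffusion model what noise it thinks was added, and push the 3D parameters in the direction that shrinks the gap between the prediction and the true noise.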

“SDS allows us to optimize samples in an arbitrary parameter space, such as a 3D space, as long as we can map back to images differentiably,” they explained.
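To make the idea of optimizing in an arbitrary parameter space concrete, here is a minimal toy sketch in Python. It is not Google's code. Everything in it is an assumption made for illustration: the "renderer" is a hypothetical linear function of a single number, the "diffusion model" is replaced by the closed-form ideal noise predictor for a one-dimensional Gaussian, and the weighting w(t) is set to 1. The names `render`, `toy_noise_predictor` and `run_sds` are invented for this example.

```python
import math
import random


def render(theta):
    """Hypothetical differentiable 'renderer': one scene parameter -> one 'pixel'."""
    return 2.0 * theta + 1.0


RENDER_GRAD = 2.0  # d(render)/d(theta); constant because the toy renderer is linear


def toy_noise_predictor(z_t, alpha, sigma, mean=3.0, std=0.5):
    """Stand-in for a frozen diffusion model (assumption, not Imagen).

    If clean 'images' follow N(mean, std^2), the ideal prediction of the noise
    that was mixed into z_t = alpha * x + sigma * eps has this closed form.
    """
    return sigma * (z_t - alpha * mean) / (alpha ** 2 * std ** 2 + sigma ** 2)


def run_sds(theta=-2.0, steps=3000, lr=0.01, seed=0):
    """Run score-distillation-style updates on theta and return the result."""
    rng = random.Random(seed)
    for _ in range(steps):
        # Sample a noise level and the noise itself.
        t = rng.uniform(0.02, 0.98)
        alpha, sigma = math.sqrt(1.0 - t), math.sqrt(t)
        eps = rng.gauss(0.0, 1.0)

        # Render, then corrupt the rendering with noise.
        x = render(theta)
        z_t = alpha * x + sigma * eps

        # Update: (predicted noise - true noise) * d(image)/d(theta), with w(t) = 1.
        eps_hat = toy_noise_predictor(z_t, alpha, sigma)
        theta -= lr * (eps_hat - eps) * RENDER_GRAD
    return theta


if __name__ == "__main__":
    theta = run_sds()
    print(f"theta = {theta:.3f}, rendered value = {render(theta):.3f}")
```

In this toy setup the rendered value drifts toward 3.0, the mean that the stand-in model considers most likely, even though the optimizer only ever adjusts `theta`. DreamFusion applies the same pattern with a NeRF in place of `theta` and Imagen in place of the stand-in predictor.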

Google’s researchers then used a 3D scene parameterization similar to Neural Radiance Fields, or NeRFs, to define this differentiable mapping from 3D scene parameters to 2D images.

“SDS alone produces reasonable scene appearance, but DreamFusion adds additional regularizers and optimization strategies to improve geometry. The resulting trained NeRFs are coherent, with high-quality normals, surface geometry and depth, and are relightable with a Lambertian shading model.”
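As a rough illustration of what a Lambertian shading model means, the sketch below computes the color of one surface point from its base color (albedo), its surface normal and a light direction. It is a generic textbook form, not DreamFusion's renderer, and the function names, the default light color and the ambient term are assumptions chosen for this example.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def lambertian_shade(albedo, normal, light_dir,
                     light_color=(1.0, 1.0, 1.0),
                     ambient=(0.1, 0.1, 0.1)):
    """Return the RGB color of a surface point under diffuse lighting.

    albedo    -- base RGB color of the surface, each channel in [0, 1]
    normal    -- surface normal at the point
    light_dir -- direction from the point toward the light
    """
    n = normalize(normal)
    l = normalize(light_dir)
    # Diffuse term: cosine of the angle between normal and light, clamped at zero
    # so surfaces facing away from the light receive only ambient light.
    diffuse = max(0.0, sum(a * b for a, b in zip(n, l)))
    return tuple(a * (lc * diffuse + am)
                 for a, lc, am in zip(albedo, light_color, ambient))


if __name__ == "__main__":
    grey = (0.5, 0.5, 0.5)
    print("facing the light:", lambertian_shade(grey, (0, 0, 1), (0, 0, 1)))
    print("facing away:     ", lambertian_shade(grey, (0, 0, 1), (0, 0, -1)))
```

Because the color depends only on albedo, normals and the light, moving `light_dir` relights the object without regenerating it, which is why high-quality normals matter for the result.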

Here’s a breakdown:

Step 1) Type in your prompt. The example Google offered was ‘a DSLR photo of a peacock on a surfboard.’

Step 2) Render the in-progress 3D model from a range of camera angles and apply the Imagen model to judge how well each 2D view matches the prompt.

Step 3) Optimize a 3D scene parameterization such as NeRF so that its rendered 2D views score well under Imagen. Repeat this action to get the best results.

Step 4) The result is a 3D representation of a peacock on a surfboard. You can now export this as a mesh – using the file formats STL or PLY – for use in another scene or project.
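To show what the export in Step 4 produces, the sketch below writes a mesh to the ASCII variant of the PLY format using only the Python standard library. It does not extract a mesh from a NeRF. The tetrahedron is a hand-written placeholder standing in for real geometry, and the function name `mesh_to_ascii_ply` is invented for this example.

```python
def mesh_to_ascii_ply(vertices, faces):
    """Serialize a mesh to an ASCII PLY string.

    vertices -- list of (x, y, z) float tuples
    faces    -- list of tuples of vertex indices, one tuple per polygon
    """
    lines = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(vertices)}",
        "property float x",
        "property float y",
        "property float z",
        f"element face {len(faces)}",
        "property list uchar int vertex_indices",
        "end_header",
    ]
    # One line per vertex: its three coordinates.
    lines += [f"{x} {y} {z}" for x, y, z in vertices]
    # One line per face: the vertex count, then the vertex indices.
    lines += [f"{len(face)} " + " ".join(str(i) for i in face) for face in faces]
    return "\n".join(lines) + "\n"


# Placeholder geometry (assumption): a tetrahedron instead of a peacock.
TETRA_VERTICES = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
TETRA_FACES = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]

if __name__ == "__main__":
    with open("placeholder.ply", "w", encoding="ascii") as handle:
        handle.write(mesh_to_ascii_ply(TETRA_VERTICES, TETRA_FACES))
```

Most 3D tools can open the resulting file, so a mesh saved this way can be dropped into another scene or project.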

For a more in-depth explainer, Google’s paper outlining DreamFusion is available via arXiv.

More ways to generate peacocks on surfboards

DreamFusion follows a host of generative AI tools showcased in the past few weeks, with OpenAI’s DALL-E sparking interest in the concept of generating objects from text prompts.

DALL-E was followed into the public eye by other text-to-image engines, including Midjourney and Stable Diffusion.

The interest led to the launch of PromptBase, an online marketplace where users can buy prompts that generate desired images.

The U.S. Copyright Office even granted protection to an AI-generated work. But not everyone is enamored with these newfound artworks: several online platforms, including heavyweight Getty Images, have barred AI-generated content from their sites.

Interest in generative AI is not limited to images, either. Facebook parent Meta recently unveiled Make-A-Video, an AI system capable of generating videos from text prompts.

About the Author(s)

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
