December 8, 2023
AI researchers from TikTok parent ByteDance have developed an AI model capable of producing impressive 3D models from images.
Dubbed ImageDream, the model takes a single input image and uses multi-view diffusion to generate views of the object from multiple viewpoints.
For example, given a picture of a bulldog wearing a black pirate hat, ImageDream generates multiple views of the object and then uses those views to create a 3D model.
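The two-stage pipeline described above can be illustrated with a minimal Python sketch. Everything here is hypothetical and for illustration only – the function names, `View` structure, and four-view default are assumptions, not ByteDance's actual API:

```python
# Hypothetical sketch of a two-stage image-to-3D pipeline like ImageDream's:
# stage 1 expands a single input image into several camera views; stage 2
# fuses those views into one 3D asset. All names here are illustrative stubs.

from dataclasses import dataclass
from typing import List

@dataclass
class View:
    azimuth_deg: float  # camera angle around the object
    image: str          # placeholder for the rendered view

def generate_views(input_image: str, num_views: int = 4) -> List[View]:
    """Stage 1: multi-view diffusion, stubbed as one view per evenly spaced angle."""
    step = 360 / num_views
    return [View(azimuth_deg=i * step, image=f"{input_image}@{i * step:.0f}deg")
            for i in range(num_views)]

def reconstruct_3d(views: List[View]) -> str:
    """Stage 2: fuse the generated views into a single 3D model (stubbed)."""
    return f"mesh built from {len(views)} views"

views = generate_views("bulldog_pirate_hat.png")
model = reconstruct_3d(views)
print(model)  # mesh built from 4 views
```

The point of the sketch is the data flow: one image in, several consistent views out, and a single 3D model reconstructed from those views.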
The team behind ImageDream contends that using images as inputs to generate 3D models rather than text “allows for a more intuitive and direct way for users to communicate their desired outcomes, particularly for those who may find it challenging to articulate their visions textually.”
AI 3D generation models existed well before ImageDream. The ImageDream team even used the website template of one of the most notable, Google's DreamFusion, unveiled last October. OpenAI has its own AI 3D generation model, Point-E, which can generate 3D point clouds from text inputs.
ByteDance also built a 3D generation model prior to ImageDream: MVDream. Published in August, this diffusion model can generate high-quality 3D renderings from text inputs. Built in partnership with the University of California, San Diego, MVDream can be fine-tuned for personalized 3D generation using tools like DreamBooth3D.
According to the team that built ImageDream, however, this latest model stands out compared to prior systems in that it can generate objects with correct geometry from a given image, “enabling users to leverage well-developed image generation models for better image-text alignment than purely text-conditioned models like MVDream.”
The paper reads: “ImageDream surpasses existing state-of-the-art (SoTA) zero-shot single image 3D model generators, such as Magic123, in terms of geometry and texture quality.”
ImageDream, like most models, has its limitations – including issues with image constraints. For example, the model struggles to capture minute details, such as the face of a full-body avatar.
Using AI for 3D generation is a growing field, and use cases could see models like ImageDream generating assets for VR or AR environments or video games. Example generations include katanas and an AK-47 – objects found in various video game titles – as well as Pokémon mascot Pikachu wearing a hat.
You can explore various ImageDream generations on ByteDance’s project page.
At the time of writing, there appears to be an issue with accessing code for ImageDream on its project page. AI Business has contacted the authors for clarification.
Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.