Stable Diffusion's New Version Adds Image-to-Image Generation
SDXL 0.9 generates hyper-realistic images that could be used for film, TV and industry.
At a Glance
- Stable Diffusion gets an upgrade with SDXL 0.9, which adds image-to-image generation and other capabilities.
- SDXL 0.9 runs on consumer hardware yet generates "improved image and composition detail," the company said.
Stability AI has released the latest version of Stable Diffusion that adds image-to-image generation and other capabilities, changes that it said "massively" improve upon the prior model.
With SDXL 0.9, the text-to-image generator is now also an image-to-image generator, meaning an existing image can serve as the prompt for generating a new one. The newest version also enables inpainting, which fills in missing or damaged parts of an image, and outpainting, which extends an existing image beyond its original borders.
SDXL 0.9 is a follow-on from Stable Diffusion XL, released in beta in April.
Stability said its latest release can generate “hyper-realistic creations for films, television, music, and instructional videos, as well as offering advancements for design and industrial use.”
Left: SDXL Beta. Right: SDXL 0.9. Prompt: *~aesthetic~*~ manicured hand holding up a take-out coffee, pastel chilly dawn beach Instagram film photography. Negative prompt: 3d render, smooth, plastic, blurry, grainy, low-resolution, anime. Credit: Stability AI
SDXL 0.9 availability
SDXL 0.9 can be accessed via ClipDrop, Stability’s online platform.
Stability AI API and DreamStudio customers can access the model starting this week. Users of popular image-generating tools like NightCafe can also use the model.
Research weights for SDXL 0.9 are available now for “a limited period” to collect feedback and fully refine the model ahead of a general open release, expected in mid-July. Researchers need to apply for access via Hugging Face using an academic email.
The code to run it will be publicly available on Stability’s GitHub page.
What’s under the hood?
To power SDXL 0.9, Stability substantially increased Stable Diffusion's parameter count.
SDXL 0.9 now boasts a 3.5 billion parameter base model and a 6.6 billion parameter model ensemble pipeline. In comparison, the beta version of Stable Diffusion XL ran on 3.1 billion parameters using just a single model.
The second model was added to refine the first stage's output, contributing finer details to the final image.
SDXL 0.9 runs on two CLIP models, including one of the largest OpenCLIP models trained to date, OpenCLIP ViT-G/14. This large model increases SDXL 0.9’s processing power and ability to create realistic imagery with greater depth and a higher resolution, according to Stability.
Stability said it will release further details on specifications and testing of the model soon.
SDXL 0.9 example generations. Stability said its latest model could be applied to 'industrial uses' as well as in TV and film. Credit: Stability AI
SDXL 0.9 system requirements
SDXL 0.9 can run on most consumer-grade hardware. Stability recommends 16GB of RAM and an Nvidia GeForce RTX 20-series graphics card or better – essentially any card with at least 8GB of VRAM.
SDXL 0.9 runs on Windows 10, Windows 11 or Linux. Linux users with AMD hardware should have a compatible card with 16GB of VRAM.