Stanford’s Spellburst: Editing Generated Images Through Code

GPT-4-powered tool lets artists fine-tune a generated image by editing the code behind it

September 18, 2023


Created using Stable Diffusion (Prompt: Spellburst, Digital Art, Abstract, Dynamic Lighting, Intricate Detail, Summer Vibrancy)

At a Glance

Stanford's Spellburst is a GPT-4-powered tool that generates images from text prompts and lets users edit the underlying code to fine-tune the result.

Researchers at Stanford University and Replit have unveiled Spellburst, a new tool powered by a large language model, whose generated images can be edited by revising lines of code.

Spellburst lets artists enter an initial natural language prompt and, once an image is generated, edit the code behind the output to refine it, effectively giving users fine-tuning for image generation.

For example, a Spellburst user could enter the prompt ‘a stained glass image of a beautiful, bright bouquet of roses’ and, if the flowers were too pink or the stained glass looked off-color, open a panel of dynamic sliders generated from that prompt to change any aspect of the image. Users could even merge different versions of outputs.
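The slider and merge ideas can be illustrated with a minimal Python sketch. This is not Spellburst's code: the parameter names, the `Slider` class, the slider range and the blending rule are all hypothetical, chosen only to show how numeric values in generated code might be exposed as sliders and how two versions might be combined.

```python
from dataclasses import dataclass


@dataclass
class Slider:
    """Hypothetical slider description: a name, a numeric range and a current value."""
    name: str
    low: float
    high: float
    value: float


def sliders_for(params: dict[str, float]) -> list[Slider]:
    """Derive one slider per numeric parameter (assumed range: 0 to twice its value)."""
    return [Slider(name, 0.0, 2.0 * value, value) for name, value in params.items()]


def merge_versions(a: dict[str, float], b: dict[str, float], weight: float = 0.5) -> dict[str, float]:
    """Blend two versions: weighted average of shared parameters, others copied as-is."""
    merged = dict(a)
    for name, value in b.items():
        if name in merged:
            merged[name] = (1 - weight) * merged[name] + weight * value
        else:
            merged[name] = value
    return merged


# Two hypothetical versions of the stained-glass roses image.
pink_roses = {"rose_hue": 330.0, "glass_saturation": 0.9}
red_roses = {"rose_hue": 350.0, "glass_saturation": 0.5, "petal_count": 12.0}

sliders = sliders_for(pink_roses)                # one slider per parameter
merged = merge_versions(pink_roses, red_roses)   # rose_hue lands halfway between the two
```

In this sketch, dragging a slider would simply overwrite one value in the parameter dictionary before the image is redrawn. How Spellburst itself maps prompts to code and sliders is described in the team's paper, not here.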

Spellburst was built with the large language model GPT-4 from OpenAI

The team at Stanford built the tool after interviewing digital artists about their creative frustrations. The team said Spellburst can speed up the time-consuming and difficult process of coding art.

“A large language model can give you a good starting point,” said Hariharan Subramonyam, assistant professor at the Graduate School of Education and a faculty fellow at the Stanford Institute for Human-Centered AI, in a blog post.

“But when the artist wants to explore different textures, different colors or patterns, at that point they want finer control, which large language models can’t provide. Spellburst essentially helps artists seamlessly switch between the semantic space and the code.”

When building Spellburst, Stanford researchers interviewed creative coders on how they develop their concepts, creative workflows and challenges. Expert generative artists were later allowed to test Spellburst.

“The feedback was overall very positive,” Subramonyam said. “The large language model helps artists bridge from semantic space to code faster, but it also helps them explore many different variations and take larger creative leaps.”

Spellburst does have limitations: it does not always interpret prompts correctly, and in some instances merging versions caused issues. The small sample of artists who provided feedback also did not represent the full generative artist community, the team behind the tool noted.

When to access Spellburst

Spellburst is not accessible at the time of writing. The team behind it is studying it further.

Stanford is planning on launching the tool as open source “later this year.”

A detailed breakdown of Spellburst can be found in the paper "Spellburst: A Node-based Interface for Exploratory Creative Coding with Natural Language Prompts," authored by Tyler Angert from Replit and Stanford's Miroslav Ivan Suzara, Jenny Han, Christopher Lawrence Pondoc and Subramonyam.

About the Author(s)

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
