A brief history of diffusion, the tech at the heart of modern image-generating AI

December 22, 2022

Text-to-image AI exploded this year as technical advances greatly enhanced the fidelity of art that AI systems could create. Controversial as systems like Stable Diffusion and OpenAI’s DALL-E 2 are, platforms including DeviantArt and Canva have adopted them to power creative tools, personalize branding and even ideate new products.

But the tech at the heart of these systems is capable of far more than generating art. Called diffusion, it’s being used by some intrepid research groups to produce music, synthesize DNA sequences and even discover new drugs.

So what is diffusion, exactly, and why is it such a massive leap over the previous state of the art? As the year winds down, it’s worth taking a look at diffusion’s origins and how it advanced over time to become the influential force that it is today. Diffusion’s story isn’t over — refinements on the techniques arrive with each passing month — but the last year or two especially brought remarkable progress.

The birth of diffusion

You might recall the trend of deepfaking apps several years ago: apps that inserted people’s portraits into existing images and videos to create realistic-looking substitutions of the original subjects in that target content. Using AI, the apps would “insert” a person’s face (or in some cases, their whole body) into a scene, often convincingly enough to fool someone at first glance.

Most of these apps relied on an AI technology called generative adversarial networks, or GANs for short. GANs consist of two parts: a generator that produces synthetic examples (e.g. images) from random data and a discriminator that attempts to distinguish between the synthetic examples and real examples from a training dataset. (Typical GAN training datasets consist of hundreds to millions of examples of things the GAN is expected to eventually capture.) Both the generator and discriminator improve in their respective abilities until the discriminator is unable to tell the real examples from the synthesized examples with better than the 50% accuracy expected of chance.
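
To make that stopping point concrete, here is a minimal sketch in Python. It is not a GAN: there is no training, and the "discriminator" is a hypothetical one-line threshold rule. It only illustrates why accuracy falls to roughly 50% once the generator's samples follow the same distribution as the real ones. The distributions, sample count and threshold are all assumptions chosen for illustration.

```python
import random

def discriminator_accuracy(real, fake, threshold):
    """Accuracy of a toy discriminator that calls a sample 'real' when it exceeds threshold."""
    correct = sum(v > threshold for v in real) + sum(v <= threshold for v in fake)
    return correct / (len(real) + len(fake))

rng = random.Random(0)
n = 20000
real = [rng.gauss(2.0, 1.0) for _ in range(n)]
early_fake = [rng.gauss(0.0, 1.0) for _ in range(n)]  # generator still far from the data
late_fake = [rng.gauss(2.0, 1.0) for _ in range(n)]   # generator now matches the data

early_accuracy = discriminator_accuracy(real, early_fake, threshold=1.0)
late_accuracy = discriminator_accuracy(real, late_fake, threshold=1.0)
print(early_accuracy, late_accuracy)
```

Early on, the threshold separates real from synthetic samples most of the time. Once the two distributions match, no threshold can do better than a coin flip.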

Sand sculptures of Harry Potter and Hogwarts, generated by Stable Diffusion. Image Credits: Stability AI

Top-performing GANs can create, for example, snapshots of fictional apartment buildings. StyleGAN, a system Nvidia developed a few years back, can generate high-resolution head shots of fictional people by learning attributes like facial pose, freckles and hair. Beyond image generation, GANs have been applied to 3D modeling and vector sketches, and have shown an aptitude for outputting video clips, speech and even looping instrument samples in songs.

In practice, though, GANs suffered from a number of shortcomings owing to their architecture. Training the generator and discriminator models simultaneously was inherently unstable; sometimes the generator “collapsed” and output lots of similar-looking samples. GANs also needed lots of data and compute power to train and run, which made them tough to scale.

Enter diffusion.

How diffusion works

Diffusion was inspired by physics, where the term describes the process by which something moves from a region of higher concentration to one of lower concentration, like a sugar cube dissolving in coffee. The sugar is initially concentrated in one spot in the liquid, but gradually becomes evenly distributed.

Diffusion systems borrow from diffusion in non-equilibrium thermodynamics specifically, where the process increases the entropy, or randomness, of the system over time. Consider a gas: it’ll eventually spread out to fill an entire space evenly through random motion. Similarly, data like images can be transformed into a structureless distribution of pure noise by randomly adding noise.

Diffusion systems slowly destroy the structure of data by adding noise until there’s nothing left but noise.
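
The noising process can be sketched in a few lines of Python. This is a rough illustration under assumptions the article does not state: the noise is Gaussian, it is added over 1,000 steps, and its strength follows a linear schedule from 0.0001 to 0.02, choices that are common in the research literature. The function names and the four-value "image" are hypothetical.

```python
import math
import random

def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule: the variance of the noise added at each step."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def signal_fraction(betas):
    """Running product of (1 - beta): the share of the original signal left after each step."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out

def add_noise(x0, alpha_bar_t, rng):
    """Jump directly to the noised version of x0 at the step with signal share alpha_bar_t."""
    keep = math.sqrt(alpha_bar_t)
    spread = math.sqrt(1.0 - alpha_bar_t)
    return [keep * v + spread * rng.gauss(0.0, 1.0) for v in x0]

rng = random.Random(0)
x0 = [1.0, -1.0, 1.0, -1.0]  # toy "image" of four pixel values
alpha_bars = signal_fraction(linear_betas(1000))
for t in (0, 499, 999):
    noisy = add_noise(x0, alpha_bars[t], rng)
    print(t, round(alpha_bars[t], 4), [round(v, 2) for v in noisy])
```

At the first step nearly all of the original signal survives. By the last step the surviving share is close to zero, and the output is indistinguishable from random noise.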

In physics, diffusion is spontaneous and irreversible — sugar diffused in coffee can’t be restored to cube form. But diffusion systems in machine learning aim to learn a sort of “reverse diffusion” process to restore the destroyed data, gaining the ability to recover the data from noise.
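
Why recovery is possible at all can be shown with a small sketch. It assumes Gaussian noise mixed in at a known strength, and it cheats: where a real system would use a trained neural network to estimate the noise, this example hands over the true noise as an "oracle". The function names are hypothetical.

```python
import math
import random

def noise_sample(x0, alpha_bar_t, rng):
    """Forward step: return the noised data and the noise that was mixed in."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    keep, spread = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    xt = [keep * v + spread * e for v, e in zip(x0, eps)]
    return xt, eps

def estimate_original(xt, predicted_eps, alpha_bar_t):
    """Invert the forward step, given a prediction of the noise."""
    keep, spread = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [(v - spread * e) / keep for v, e in zip(xt, predicted_eps)]

rng = random.Random(1)
x0 = [0.5, -0.25, 0.75]
xt, eps = noise_sample(x0, alpha_bar_t=0.1, rng=rng)
# Oracle: the true noise stands in for what a trained network would predict.
recovered = estimate_original(xt, eps, alpha_bar_t=0.1)
print([round(v, 6) for v in recovered])
```

With a perfect noise estimate the original values come back exactly. A trained model's estimate is imperfect, which is one reason practical systems remove noise gradually over many steps.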

Diffusion systems have been around for nearly a decade. But a relatively recent innovation from OpenAI called CLIP (short for “Contrastive Language-Image Pre-Training”) made them much more practical in everyday applications. CLIP classifies data — for example, images — to “score” each step of the diffusion process based on how likely it is to be classified under a given text prompt (e.g. “a sketch of a dog in a flowery lawn”).

At the start, the data has a very low CLIP-given score, because it’s mostly noise. But as the diffusion system reconstructs data from the noise, it slowly comes closer to matching the prompt. A useful analogy is uncarved marble — like a master sculptor telling a novice where to carve, CLIP guides the diffusion system toward an image that gives a higher score.
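
The guiding idea can be illustrated with a toy loop. Everything here is a stand-in: `toy_score` is a hypothetical function that simply measures closeness to a fixed target vector, whereas real CLIP compares neural-network embeddings of an image and a text prompt, and the denoising network is left out entirely. The sketch only shows a sample starting as noise and being nudged, step by step, toward a higher score.

```python
import random

TARGET = [0.8, -0.3, 0.5]  # hypothetical stand-in for "what the prompt describes"

def toy_score(x):
    """Higher is better: negative squared distance to TARGET (not real CLIP)."""
    return -sum((a - b) ** 2 for a, b in zip(x, TARGET))

def score_gradient(x):
    """Direction in which toy_score increases fastest."""
    return [-2.0 * (a - b) for a, b in zip(x, TARGET)]

def guided_sampling(num_steps=50, guidance_scale=0.05, seed=0):
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in TARGET]  # start from pure noise
    scores = [toy_score(x)]
    for step in range(num_steps):
        noise_level = 0.1 * (1.0 - (step + 1) / num_steps)  # shrinks to zero
        grad = score_gradient(x)
        x = [v + guidance_scale * g + noise_level * rng.gauss(0.0, 1.0)
             for v, g in zip(x, grad)]
        scores.append(toy_score(x))
    return x, scores

sample, scores = guided_sampling()
print(round(scores[0], 3), round(scores[-1], 3))
```

The first score is low because the sample is random. The last is close to zero, the best this toy score allows.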

OpenAI introduced CLIP alongside the image-generating system DALL-E. Since then, it’s made its way into DALL-E’s successor, DALL-E 2, as well as open source alternatives like Stable Diffusion.

What can diffusion do?

So what can CLIP-guided diffusion models do? Well, as alluded to earlier, they’re quite good at generating art — from photorealistic art to sketches, drawings and paintings in the style of practically any artist. In fact, there’s evidence suggesting that they problematically regurgitate some of their training data.

But the models’ talent — controversial as it might be — doesn’t end there.

Researchers have also experimented with using guided diffusion models to compose new music. Harmonai, an organization with financial backing from Stability AI, the London-based startup behind Stable Diffusion, released a diffusion-based model that can output clips of music by training on hundreds of hours of existing songs. More recently, developers Seth Forsgren and Hayk Martiros created a hobby project dubbed Riffusion that uses a diffusion model cleverly trained on spectrograms — visual representations — of audio to generate ditties.
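
For readers unfamiliar with spectrograms, the sketch below turns a short synthetic audio signal into one. It is not Riffusion's pipeline. It uses a deliberately naive Fourier transform from the Python standard library, and the sample rate, frame size and test tones are assumptions chosen so the result is easy to read.

```python
import cmath
import math

def frame_magnitudes(frame):
    """Strength of each non-negative frequency bin in one frame (naive DFT)."""
    n = len(frame)
    return [abs(sum(frame[k] * cmath.exp(-2j * math.pi * b * k / n) for k in range(n)))
            for b in range(n // 2 + 1)]

def spectrogram(samples, frame_size=64, hop=32):
    """One row per time frame, one column per frequency bin."""
    return [frame_magnitudes(samples[i:i + frame_size])
            for i in range(0, len(samples) - frame_size + 1, hop)]

def tone(freq, count, sample_rate):
    """A pure sine tone of the given frequency."""
    return [math.sin(2 * math.pi * freq * k / sample_rate) for k in range(count)]

sample_rate = 8000
# A 1,000 Hz tone followed by a 2,000 Hz tone, 256 samples each.
audio = tone(1000, 256, sample_rate) + tone(2000, 256, sample_rate)
spec = spectrogram(audio)
loudest_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
print(loudest_bins)
```

Each frame becomes a row of numbers, so the whole clip becomes a grid that can be treated like an image. With these settings each bin spans 125 Hz, so the loudest bin moves from 8 to 16 as the tone changes from 1,000 Hz to 2,000 Hz.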

Beyond the music realm, several labs are attempting to apply diffusion tech to biomedicine in the hopes of uncovering novel disease treatments. Startup Generate Biomedicines and a University of Washington team trained diffusion-based models to produce designs for proteins with specific properties and functions, as MIT Tech Review reported earlier this month.

The models work in different ways. Generate Biomedicines’ model adds noise by unraveling the amino acid chains that make up a protein and then puts random chains together to form a new protein, guided by constraints specified by the researchers. The University of Washington model, on the other hand, starts with a scrambled structure and uses information about how the pieces of a protein should fit together provided by a separate AI system trained to predict protein structure.

They’ve already achieved some success. The model designed by the University of Washington group was able to find a protein that can attach to the parathyroid hormone — the hormone that controls calcium levels in the blood — better than existing drugs.

Meanwhile, over at OpenBioML, a Stability AI-backed effort to bring machine learning-based approaches to biochemistry, researchers have developed a system called DNA-Diffusion to generate cell-type-specific regulatory DNA sequences — segments of nucleic acid molecules that influence the expression of specific genes within an organism. DNA-Diffusion will — if all goes according to plan — generate regulatory DNA sequences from text instructions like “A sequence that will activate a gene to its maximum expression level in cell type X” and “A sequence that activates a gene in liver and heart, but not in brain.”

What might the future hold for diffusion models? The sky may well be the limit. Already, researchers have applied them to generating videos, compressing images and synthesizing speech. That’s not to suggest diffusion won’t eventually be replaced by a more efficient, more performant machine learning technique, as GANs were by diffusion. But it’s the architecture du jour for a reason; diffusion is nothing if not versatile.

A brief history of diffusion, the tech at the heart of modern image-generating AI by Kyle Wiggers originally published on TechCrunch
