Google DeepMind's new AI tech will generate soundtracks for videos

Google's DeepMind artificial intelligence laboratory is working on a new technology that can generate soundtracks, and even dialogue, to go along with videos. The lab has shared its progress on the video-to-audio (V2A) project, which can be paired with Google Veo and other video creation tools like OpenAI's Sora. In its blog post, the DeepMind team explains that the system can understand raw pixels and combine that information with text prompts to create sound effects for what's happening onscreen. The tool can also be used to make soundtracks for traditional footage, such as silent films and any other video without sound.

DeepMind's researchers trained the technology on video, audio and AI-generated annotations that contain detailed descriptions of sounds and transcripts of dialogue. They said that by doing so, the technology learned to associate specific sounds with visual scenes. As TechCrunch notes, DeepMind's team isn't the first to release an AI tool that can generate sound effects (ElevenLabs released one recently, as well), and it won't be the last. "Our research stands out from existing video-to-audio solutions because it can understand raw pixels and adding a text prompt is optional," the team writes.

While the text prompt is optional, it can be used to shape and refine the final product so that it's as accurate and as realistic as possible. You can enter positive prompts to steer the output towards the sounds you want, for instance, or negative prompts to steer it away from the sounds you don't want. In one sample, the team used the prompt: "Cinematic, thriller, horror film, music, tension, ambience, footsteps on concrete."

The researchers admit that they're still trying to address their V2A technology's existing limitations, like the drop in the output's audio quality that can happen if there are distortions in the source video. They're also still working on improving lip synchronization for generated dialogue. In addition, they vow to put the technology through "rigorous safety assessments and testing" before releasing it to the world.

This article originally appeared on Engadget at https://www.engadget.com/google-deepminds-new-ai-tech-will-generate-soundtracks-for-videos-113100908.html?src=rss
Created Jun 18, 2024, 1:10:10 PM