Microsoft's AI tool can turn photos into realistic videos of people talking and singing

Microsoft Research Asia has unveiled an experimental AI tool called VASA-1 that can take a still image of a person (or a drawing of one) and an existing audio file and create a lifelike talking face from them in real time. It generates facial expressions and head motions for the still image, along with lip movements that match the speech or song in the audio. The researchers uploaded a ton of examples on the project page, and the results look good enough that they could fool people into thinking they're real.

While the lip and head motions in the examples can still look a bit robotic and out of sync upon closer inspection, it's clear that the technology could be misused to quickly and easily create deepfake videos of real people. The researchers themselves are aware of that potential and have decided not to release "an online demo, API, product, additional implementation details, or any related offerings" until they're sure that their technology "will be used responsibly and in accordance with proper regulations." They didn't, however, say whether they're planning to implement specific safeguards to prevent bad actors from using it for nefarious purposes, such as creating deepfake porn or running misinformation campaigns.

The researchers believe their technology has a ton of benefits despite its potential for misuse. They said it can be used to enhance educational equity, as well as to improve accessibility for those with communication challenges, perhaps by giving them access to an avatar that can communicate for them. It can also provide companionship and therapeutic support for those who need it, they said, suggesting that VASA-1 could be used in programs that offer access to AI characters people can talk to.

According to the paper published with the announcement, VASA-1 was trained on the VoxCeleb2 dataset, which contains "over 1 million utterances for 6,112 celebrities" that were extracted from YouTube videos. Even though the tool was trained on real faces, it also works on artistic images like the Mona Lisa, which the researchers amusingly combined with an audio file of Anne Hathaway's viral rendition of Lil Wayne's Paparazzi. It's delightful enough to be worth a watch, even if you doubt what good a technology like this can do.

This article originally appeared on Engadget at https://www.engadget.com/microsofts-ai-tool-can-turn-photos-into-realistic-videos-of-people-talking-and-singing-070052240.html?src=rss

Created 1y ago | Apr 20, 2024, 08:40:16
