The Quest for the Perfect Lip-Synching Robot


Their mouths may move convincingly, but they’re far from realistic – yet


If we are ever going to communicate effectively with robots, they need to improve their lip syncing.

Complex mouth movements are essential for human connection, especially in noisy environments, where we look at a speaker’s mouth up to half the time.

That makes such movements a key feature for robots we can chat with comfortably, but researchers have long struggled to create robots whose lips can skillfully sync with audio. Robots face mechanical constraints that limit the range and speed of lip movements, for example, and their mouths tend to lag behind commands.


To overcome this obstacle, researchers at Columbia University in New York leveraged artificial intelligence models inspired by the human brain, known as neural networks, enabling a humanoid robot to perform fluid mouth movements synchronized with speech.

KARAOKE ROBOT: The lip-syncing tool outperformed other systems designed for similar uses and even enabled realistic discussions in multiple languages. Video by Yuhang Hu.

“The ability to form complex lip shapes… improves overall speech synchronization with more detail, providing more realistic interactions that mitigate some of the risks of the uncanny valley effect,” the researchers write in a new Science Robotics paper.


The team designed a human-like robot face with soft silicone “skin.” Magnetic connectors give it 10 degrees of freedom, enabling a wide range of lip movements.

To train the models powering this robot, the team provided them with recordings of their robot performing various lip movements, such as those associated with rounded vowels. Then, they incorporated AI-generated videos of “ideal” lip movements for certain sentences into their models.

The system allows a robot’s lips to form the shapes associated with 24 consonants and 16 vowels, the researchers reported in the paper.


Read more: “Deepfake Luke Skywalker should scare us”

Using these “ideal” AI videos as a baseline, the researchers compared their new system to existing techniques for shaping a robot’s lip movements. Among all the methods, theirs lagged least behind the mouth movements in the AI videos. The robot was also able to convincingly pronounce 10 different languages with varying phonetic structures, including Korean, French, and Arabic, and it even did a bit of karaoke.

There’s still plenty of room for improvement, the researchers acknowledged, including incorporating more training data and adding more physical degrees of freedom. In the future, they believe their tool could be used in education and in the care of older adults suffering from cognitive decline, as it could help us connect with robots “on a human level.”


But they also warn that the increased emotional connection with robots could “be exploited to gain the trust of unsuspecting users, particularly children and the elderly,” and that designers should implement protective measures against these risks.

“The ability to create physical machines capable of connecting with humans on an emotional level is maturing rapidly,” the authors write. “The robots presented here are still far from natural, but one step closer to crossing the uncanny valley.”



Main image: Yuhang Hu
