Press release

Video game technology helps paralysed woman speak again

Edinburgh-based Speech Graphics and researchers at UC San Francisco and UC Berkeley create the world’s first brain-computer interface that synthesises speech and facial expression from brain signals, opening a way to restore natural communication for those who cannot speak.
The same software that's used to drive facial animation in games such as The Last of Us Part II and Hogwarts Legacy turns brain signals into a talking digital avatar.


[Edinburgh, 23rd August 2023] In a groundbreaking research study, Speech Graphics, a pioneer of AI-driven facial animation, has collaborated with researchers at UC San Francisco and UC Berkeley to help a paralysed woman in the US communicate using a digital avatar controlled via a brain-computer interface (BCI).


The researchers were able to decode the woman's brain signals into three forms of communication: text, synthetic voice, and facial animation on a digital avatar, including lip sync and emotional expressions. This is the first time that facial animation has been synthesised from brain signals. A paper detailing the breakthrough is due to appear in the August issue of the science journal Nature.


The team was led by Edward Chang, MD, chair of neurological surgery at UCSF, who has spent a decade working on brain-computer interfaces. They implanted a paper-thin rectangle of 253 electrodes onto the surface of the woman's brain, over areas his team has found to be critical for speech. The electrodes intercepted the brain signals that, if not for the stroke, would have gone to muscles in her tongue, jaw, larynx, and face. A cable, plugged into a port fixed to her head, connected the electrodes to a bank of computers, allowing AI algorithms to be trained over several weeks to recognise the brain activity associated with a vocabulary of over 1,000 words. Thanks to the AI, the woman could 'write' text, as well as 'speak' using a synthesised voice based on recordings of her real voice from before she was paralysed.
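
As a purely illustrative technical aside, the decoding step described above can be pictured as a sequence model that turns per-electrode activity into word probabilities. The Python sketch below is a hedged stand-in: the electrode count of 253 comes from the study, but the network design, feature format and 1,024-word vocabulary size are assumptions, not the team's actual system.

# Hypothetical decoder sketch (Python/PyTorch); architecture and sizes are
# illustrative assumptions, not the researchers' actual model.
import torch
import torch.nn as nn

class NeuralSpeechDecoder(nn.Module):
    def __init__(self, n_electrodes=253, hidden=256, vocab_size=1024):
        super().__init__()
        self.rnn = nn.GRU(n_electrodes, hidden, num_layers=2, batch_first=True)
        self.to_vocab = nn.Linear(hidden, vocab_size)

    def forward(self, x):
        # x: (batch, time, n_electrodes), one feature value per electrode per frame
        out, _ = self.rnn(x)
        return self.to_vocab(out)  # per-frame logits over the word vocabulary

# Toy usage: 200 frames of simulated electrode features
decoder = NeuralSpeechDecoder()
features = torch.randn(1, 200, 253)
print(decoder(features).shape)  # torch.Size([1, 200, 1024])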


The researchers also worked with Michael Berger, the CTO and co-founder of Speech Graphics, to decode this brain activity into facial movements. Speech Graphics' AI-based facial animation technology - more commonly used to create realistic facial animation in video games including Halo Infinite, Hogwarts Legacy and The Last of Us Part II - simulates muscle contractions over time, including speech articulations and nonverbal activity. The process is normally driven by audio input: the software analyses the audio and reverse-engineers the complex muscle movements of the face, tongue and jaw that should have occurred while producing that sound. In one approach, the team used the subject's synthesised voice, in place of her actual voice, as input to the Speech Graphics system to drive the muscles. The company's real-time software then converted the muscle actions into 3D animation in a video game engine. The result was a realistic avatar of the subject that, driven by her efforts to communicate, accurately pronounced words in sync with the synthesised voice.
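
For readers curious how audio-driven animation of this kind works in principle, the minimal Python sketch below turns an audio signal into per-muscle activation curves. The function names, the energy-based feature and the three muscle names are illustrative assumptions only; this is not Speech Graphics' actual software or API.

# Hypothetical audio-to-muscle pipeline sketch; all names and the simple
# energy feature are stand-ins for illustration.
import numpy as np

def analyse_audio(waveform, sample_rate, frame_rate=60):
    """Segment audio into frames and extract one acoustic feature per frame."""
    hop = int(sample_rate / frame_rate)
    frames = [waveform[i:i + hop] for i in range(0, len(waveform) - hop, hop)]
    # Stand-in feature: per-frame energy; a real system would use richer features.
    return np.array([np.sqrt(np.mean(f ** 2)) for f in frames])

def features_to_muscles(features, muscle_names=("jaw_open", "lip_round", "lip_press")):
    """Map acoustic features to per-muscle activation curves in [0, 1]."""
    norm = features / (features.max() + 1e-8)
    # Stand-in mapping: each muscle gets a scaled copy of the energy envelope.
    return {name: np.clip(norm * w, 0.0, 1.0)
            for name, w in zip(muscle_names, (1.0, 0.6, 0.4))}

# Toy usage: one second of noise standing in for synthesised speech audio
audio = np.random.randn(16000) * 0.1
curves = features_to_muscles(analyse_audio(audio, 16000))
print({name: round(float(curve.mean()), 3) for name, curve in curves.items()})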


However, in a second, even more groundbreaking approach, the signals from the subject's brain were mapped directly onto the simulated muscles, allowing them to serve as an analogue of her non-functioning muscles. She could also make the avatar express specific emotions and move individual muscles.
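
This second approach can be pictured as a learned mapping that regresses muscle activations directly from each frame of neural features, which a renderer then applies to the avatar's rig. The sketch below is again a hypothetical illustration: the linear mapping, the muscle count and all names are assumptions, not the study's implementation.

# Hypothetical direct neural-to-muscle mapping sketch; the linear map and
# sizes are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)
N_ELECTRODES, N_MUSCLES = 253, 30

# Assume a mapping learned offline from paired (neural, muscle) training data.
W = rng.normal(scale=0.05, size=(N_MUSCLES, N_ELECTRODES))

def neural_to_muscles(neural_frame):
    """Map one frame of electrode features to per-muscle activations in [0, 1]."""
    return np.clip(W @ neural_frame, 0.0, 1.0)

# Toy usage: stream 60 frames (one second at 60 fps) of simulated features
for _ in range(60):
    activations = neural_to_muscles(rng.normal(size=N_ELECTRODES))
    # Each frame's activations would drive the avatar's facial rig in the engine.
print(activations[:5])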


“Creating a digital avatar that can speak, emote and articulate in real-time, connected directly to the subject’s brain, shows the potential for AI-driven faces well beyond video games,” said Michael Berger, CTO and co-founder of Speech Graphics. “When we speak, it’s a complex combination of audio and visual cues that helps us express how we feel and what we have to say. Restoring voice alone is impressive, but facial communication is so intrinsic to being human, and it restores a sense of embodiment and control to the patient who has lost that. I hope that the work we’ve done in conjunction with Professor Chang can go on to help many more people.”


“We’re making up for the connections between the brain and vocal tract that have been severed by the stroke,” said Kaylo Littlejohn, a graduate student working with Chang and Gopala Anumanchipalli, PhD, a professor of electrical engineering and computer sciences at UC Berkeley. “When the subject first used this system to speak and move the avatar’s face in tandem, I knew that this was going to be something that would have a real impact.”


“Our goal is to restore a full, embodied way of communicating, which is really the most natural way for us to talk with others,” said Professor Chang, who is chair of neurological surgery at UCSF and a member of the UCSF Weill Institute for Neuroscience. “These advancements bring us much closer to making this a real solution for patients.”


The team hopes the work will lead to an FDA-approved system that enables speech from brain signals in the near future.


- Ends -

 
About Speech Graphics


Speech Graphics delivers pioneering AI-driven facial animation technology to the entertainment industry, working with clients such as Warner Brothers, Epic Games, Techland, Crystal Dynamics, Xbox Game Studios, Naughty Dog and more.


The company's core technology is based on more than 20 years of scientific research in linguistics, biomechanics, psychology, machine learning, and computer graphics under the leadership of founders Michael Berger and Gregor Hofer. The business has won several awards for its speech-driven animation technology, including the 2022 TIGA Award for Best Tools, Technology & Innovation. The software produces high-quality facial animation from audio alone, with no need for motion capture. With offices in Edinburgh, San Francisco, Budapest and Singapore, Speech Graphics is a trusted global partner in audio-driven facial animation, used by 90% of AAA video game publishers.


Speech Graphics also has an enterprise offering, Rapport, which creates more natural interactions between people and machines. Find out more at www.rapport.cloud

Follow Speech Graphics
www.speech-graphics.com | @speechgraphics | YouTube.com/Speech Graphics | LinkedIn/Speech Graphics

About Michael Berger
Michael is a linguist, speech scientist and software engineer who has been a researcher and inventor in the field of speech animation since 1995. He helped develop the first-ever automated talking head and co-authored a book on the subject, later devising novel AI-based techniques for the realistic synthesis of facial dynamics from audio. He founded a speech animation research business in the US and co-founded Speech Graphics Ltd in Edinburgh, Scotland, in 2010, where he serves as Chief Technology Officer and leads the company's research, engineering and product development.

LinkedIn/Michael Berger

About UCSF
The University of California, San Francisco (UCSF) is exclusively focused on the health sciences and is dedicated to promoting health worldwide through advanced biomedical research, graduate-level education in the life sciences and health professions, and excellence in patient care. UCSF Health, which serves as UCSF’s primary academic medical center, includes top-ranked specialty hospitals and other clinical programs, and has affiliations throughout the Bay Area. UCSF School of Medicine also has a regional campus in Fresno. Learn more at ucsf.edu or see our Fact Sheet.

Follow UCSF
ucsf.edu | Facebook.com/ucsf | Twitter.com/ucsf | YouTube.com/ucsf
 
Authors: Sean L. Metzger, Kaylo T. Littlejohn, Alexander B. Silva, David A. Moses, Margaret P. Seaton, Ran Wang, Maximilian E. Dougherty, Jessie R. Liu, Peter Wu, Michael A. Berger, Inga Zhuravleva, Adelyn Tu-Chan, Karunesh Ganguly, Gopala K. Anumanchipalli, Edward F. Chang.

Funding: This research was supported by the National Institutes of Health (NINDS 5U01DC018671, T32GM007618), the National Science Foundation, and philanthropy.
###