Just a few days ago, I wrote about the National University of Singapore’s AI model, which is based on Stable Diffusion and allows brain wave activity to be translated into images and videos.
But the same mechanism can be used to interpret brain waves as speech in patients who have lost the ability due to brain damage.
This is what makes generative AI so valuable: its ability to recognize patterns from hundreds, thousands, or even millions of data points, and translate them into the desired output.
It’s a lot like learning a language, but it’s much faster and conveys far more than verbal communication alone.
In one such case, a woman named Ann regained her ability to communicate, with speech and facial expressions delivered by an AI avatar, via a brain implant and an AI model that translates her brain activity.
The system — developed by researchers at the University of California, San Francisco, and the University of California, Berkeley — allows her to speak at a rate of about 62 words per minute, which is about 40 percent of normal speech, with a (current) error rate of 23.8 percent:
Ann, a Canadian math teacher, suffered a stroke in 2005 when she was just 30. At the time, she had been married for only two years and had a 13-month-old daughter and an eight-year-old stepson.
“Overnight, everything was taken from me,” she recalled.
After several years of treatment, fine motor skills in her face were restored, allowing her to laugh or cry, and to write slowly on the screen using precise movements of her head.
“Locked-in syndrome, or LIS, is exactly what it sounds like,” she wrote. “You are fully aware and all five senses work, but you are locked inside a body where no muscles work. I learned to breathe on my own again, I now have full movement in my neck, I laugh again, I can cry and read, and as the years go by my smile has returned.” Over time she also regained the ability to wink and say a few words.
A brain implant that interprets her brain activity in real time and synthesizes speech through a digital avatar, one that conveys not only words but also facial expressions, is a huge improvement in her ability to communicate with people and to use devices like smartphones and computers independently.
“When I was in rehab, the speech therapist didn’t know what to do with me,” she said. “Being part of this study has given me a sense of purpose. I feel like I’m contributing to society. It’s like I got a job again. It’s amazing that I lived this long. This study allowed me to truly live while I’m still alive!”
Translating electricity into words and movements
The decoding process required Ann to receive an implant: a paper-thin sheet of 253 electrodes, placed over an area of her brain identified as important for speech.
Unlike the NUS experiment, where subjects were asked to visualize certain objects and the AI interpreted them as images and video, here the person has to actually try to speak, as they normally would.
It is not about imagining the words in your head, but about sending signals to the speech muscles, even if they no longer respond because of the injury. The electrodes capture this activity and send it to an AI model trained on that person’s own signals.
This training works like any other machine learning process: the input is repeated until the algorithm begins to recognize patterns and associate them with the desired output.
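The training loop described above can be sketched in a toy form. Everything below is a simplified stand-in: the shapes, the synthetic data, and the nearest-centroid classifier (which replaces the study’s actual deep learning model) are all assumptions for illustration only.

```python
# Toy sketch of the training idea: the participant attempts known speech
# while electrode activity is recorded, and a model learns to map neural
# feature vectors to the phoneme being attempted. All data here is
# synthetic; the classifier is a stand-in for the real deep network.
import numpy as np

rng = np.random.default_rng(0)

N_ELECTRODES = 253   # channels in the implanted electrode sheet
N_PHONEMES = 39      # English phoneme classes used as labels

# Synthetic training set: each repetition yields one feature vector
# (e.g. per-electrode activity) labeled with the attempted phoneme.
X_train = rng.normal(size=(2000, N_ELECTRODES))
y_train = rng.integers(0, N_PHONEMES, size=2000)
# Give each phoneme class a distinct mean so the toy problem is learnable.
class_means = rng.normal(scale=3.0, size=(N_PHONEMES, N_ELECTRODES))
X_train += class_means[y_train]

# "Training": average the examples of each phoneme into a centroid.
centroids = np.stack([X_train[y_train == p].mean(axis=0)
                      for p in range(N_PHONEMES)])

def decode_phoneme(features: np.ndarray) -> int:
    """Return the phoneme class whose centroid is nearest to the features."""
    dists = np.linalg.norm(centroids - features, axis=1)
    return int(np.argmin(dists))

# A fresh attempt at phoneme 7 should decode back to class 7.
sample = class_means[7] + rng.normal(size=N_ELECTRODES)
print(decode_phoneme(sample))
```

The repetition in the real study plays the same role as the synthetic examples here: more labeled attempts per phoneme make each class’s neural signature easier to separate from the others.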
In Ann’s case, the researchers used 1,024 commonly used phrases, broken down into 39 phonemes, the sounds that make up all words in the English language.
According to the researchers, focusing on phonemes rather than specific words increased accuracy and sped up the learning process threefold: a few dozen sound classes are far easier to learn through repeated training than the thousands of words in a dictionary.
“My brain just laughs when it hears my synthesized voice,” Anne wrote in response to one question. “It’s like hearing an old friend.”
To add some body language to the interaction, the team used another AI-powered tool developed by Speech Graphics, a company known for its character animation in video games such as The Last of Us Part II, Hogwarts Legacy, High on Life, and The Callisto Protocol. The researchers then synchronized the signals from Ann’s brain with corresponding facial movements and expressions to produce a lively, three-dimensional avatar that moved and spoke for her.
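The animation step can be pictured as mapping each decoded speech sound to a “viseme,” the mouth shape shown while that sound is spoken, which an animation engine then blends into facial motion. The mapping below is a simplified, invented fragment; Speech Graphics’ actual system is far more detailed than this.

```python
# Hedged sketch of the avatar-animation step: each speech sound maps to a
# viseme (mouth-shape keyframe). This tiny table is hypothetical.
VISEMES = {
    "HH": "open",    # breathy open mouth
    "AW": "round",   # rounded lips
    "M": "closed",   # lips pressed together
    "F": "teeth",    # lower lip against upper teeth
}

def animate(phonemes):
    """Return the viseme keyframe sequence for a phoneme stream,
    falling back to a neutral pose for unmapped sounds."""
    return [VISEMES.get(p, "neutral") for p in phonemes]

print(animate(["HH", "AW", "M"]))
```

Driving the same keyframes from brain signals, rather than from recorded audio, is what lets the avatar’s face move in sync with Ann’s attempted speech.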
“My daughter was one year old when I was injured, and she didn’t seem to know Ann…she had no idea what Ann looked like.”
Bringing people back to life
With so much of our lives now digital, giving people who have lost the ability to communicate a smooth, highly accurate solution that connects directly to their brains and requires no muscle movement literally brings them back to life. With computers and internet access, they can do so much more.
One day, they could barely type, using very sensitive hand tools or aids that recognized micro-movements to write their messages slowly (like the late Stephen Hawking, for example). The next, they can take part in accurate, real-time conversations.
“Giving people like Ann the ability to freely control their computers and phones using this technology will have profound implications for their independence and social interactions,” said co-first author David Moses, PhD, assistant professor of neurosurgery.
Like every innovation, it still requires refinement before it can become a widespread solution for people with disabilities that affect the brain. But the leap is already so large that the technology is genuinely useful even now.
It’s not a vague promise that may take a decade to reach even limited deployment; we could see it gaining worldwide traction in just a few short years. For thousands, if not millions, of patients around the world, it could be truly life-changing.
Featured image credit: University of California, San Francisco via YouTube