Jan 30, 2019
A team of researchers at Columbia University has developed a speech brain-computer interface system that translates brain signals into intelligible, recognizable speech. By monitoring someone’s brain activity, the system can reconstruct the words a person hears with unprecedented clarity. The breakthrough, reported in the journal Scientific Reports, could lead to new ways for computers to communicate directly with the brain, and lays the groundwork for helping people who cannot speak.
The advance marks a critical step toward brain-computer interface systems that hold immense promise for those with limited or no ability to speak. Image credit: Kai Kalhh.
“Our voices help connect us to our friends, family and the world around us, which is why losing the power of one’s voice due to injury or disease is so devastating. With today’s study, we have a potential way to restore that power. We’ve shown that, with the right technology, these people’s thoughts could be decoded and understood by any listener,” said senior author Dr. Nima Mesgarani, principal investigator in the Mortimer B. Zuckerman Mind Brain Behavior Institute at Columbia University.
Early efforts to decode brain signals by Dr. Mesgarani and colleagues focused on simple computer models that analyzed spectrograms, which are visual representations of sound frequencies.
But because this approach had failed to produce anything resembling intelligible speech, the team turned instead to a vocoder, a computer algorithm that can synthesize speech after being trained on recordings of people talking.
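To illustrate what the earlier models worked with, here is a minimal sketch of computing a spectrogram from an audio signal. This is not the authors' code; the test tone, frame length, and hop size are arbitrary choices for illustration, and real pipelines typically use dedicated signal-processing libraries.

```python
import numpy as np

fs = 16000                        # sample rate in Hz
t = np.arange(fs) / fs            # one second of audio
audio = np.sin(2 * np.pi * 440 * t)  # a 440 Hz tone as a stand-in for speech

# Slice the signal into overlapping windowed frames and take the FFT
# magnitude of each frame: the resulting (time x frequency) array is a
# spectrogram, a visual representation of sound frequencies over time.
frame_len, hop = 256, 128
n_frames = 1 + (len(audio) - frame_len) // hop
frames = np.stack([audio[i * hop : i * hop + frame_len] * np.hanning(frame_len)
                   for i in range(n_frames)])
spec = np.abs(np.fft.rfft(frames, axis=1))   # shape: (n_frames, frame_len//2 + 1)
freqs = np.fft.rfftfreq(frame_len, d=1 / fs)

# The strongest frequency bin should sit near the 440 Hz tone.
peak_hz = freqs[spec.mean(axis=0).argmax()]
```

The frequency resolution here is fs / frame_len = 62.5 Hz, so the detected peak lands in the bin nearest 440 Hz rather than exactly on it.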
“This is the same technology used by Amazon Echo and Apple Siri to give verbal responses to our questions,” Dr. Mesgarani said.
“We asked epilepsy patients already undergoing brain surgery to listen to sentences spoken by different people, while we measured patterns of brain activity. These neural patterns trained the vocoder.”
Next, the researchers asked those same patients to listen to speakers reciting digits from 0 to 9, while recording brain signals that could then be run through the vocoder.
The sound produced by the vocoder in response to those signals was analyzed and cleaned up by neural networks, a type of artificial intelligence that mimics the structure of neurons in the biological brain. The end result was a robotic-sounding voice reciting a sequence of numbers.
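The core idea of the decoding step — learning a mapping from measured neural activity back to features of the heard audio — can be sketched with a toy least-squares decoder on simulated data. This is an assumption for illustration only: the study used a vocoder and deep neural networks, not this linear model, and the data below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy simulation: pretend neural recordings are linearly related to the
# audio features (e.g. spectrogram bins) of the heard speech, plus noise.
n_samples, n_electrodes, n_freq_bins = 500, 32, 16
true_map = rng.normal(size=(n_electrodes, n_freq_bins))
neural = rng.normal(size=(n_samples, n_electrodes))
audio_features = neural @ true_map + 0.1 * rng.normal(size=(n_samples, n_freq_bins))

# Fit a least-squares decoder that reconstructs audio features from
# neural activity, then measure how well the reconstruction correlates
# with the original features.
decoder, *_ = np.linalg.lstsq(neural, audio_features, rcond=None)
reconstructed = neural @ decoder
corr = np.corrcoef(reconstructed.ravel(), audio_features.ravel())[0, 1]
```

Because the simulated relationship really is linear and the noise is small, the linear decoder recovers the audio features almost perfectly; real neural data are far noisier and nonlinear, which is why the study needed the more powerful vocoder-plus-neural-network approach.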
To test the accuracy of the reconstruction, the scientists asked individuals to listen to the recording and report what they heard.
“We found that people could understand and repeat the sounds about 75% of the time, which is well above and beyond any previous attempts,” Dr. Mesgarani said.
“The improvement in intelligibility was especially evident when comparing the new recordings to the earlier, spectrogram-based attempts. The sensitive vocoder and powerful neural networks represented the sounds the patients had originally listened to with surprising accuracy.”
The study authors next plan to test more complicated words and sentences, and they want to run the same tests on brain signals recorded when a person speaks or imagines speaking.
Ultimately, they hope their system could be part of an implant, similar to those worn by some epilepsy patients, that translates the wearer’s thoughts directly into words.
Hassan Akbari et al. 2019. Towards reconstructing intelligible speech from the human auditory cortex. Scientific Reports 9, article number: 874; doi: 10.1038/s41598-018-37359-z
Thanks to: http://www.sci-news.com