03 November 2022

A neural interface on the way to a record

Neuroimplants convert brain waves into words

Cloud4Y blog, Habr

Author of the original: Edward Chang

"Do you want some water?" — a question appears on the screen.

Three dots blink, and then the words pop up one by one:

"No, I'm not thirsty."

These words were generated by the brain activity of a man who has not spoken for more than 15 years. A stroke damaged the connection between his brain and the rest of his body.

This man has tried to "talk" in a variety of ways. Most recently, he used a pointer attached to his baseball cap to "type" words on a touchscreen, a method that works but is far too slow. He wanted something faster, so he agreed to take part in the clinical trial of a research group at the University of California, San Francisco. The group's goal is to turn the technology under test into an everyday tool that other people who have lost the ability to speak can use.

In a pilot study, scientists placed a thin flexible electrode array on the surface of a volunteer's brain. The electrodes recorded neural signals and sent them to a speech decoder, which translated the signals into words. This was the first time that a paralyzed person who could not speak used neurotechnology to transmit whole words, not just letters.

How neuroprostheses work

[Image: The first version of the brain-computer interface gave the volunteer a vocabulary of 50 frequently used words]

Neuroprosthetics has come a long way over the past two decades. Hearing implants have advanced the furthest: their designs interface with the cochlear nerve of the inner ear or stimulate the auditory brain stem directly. Research is also under way on retinal and brain implants to restore vision, and on giving people with prosthetic hands a sense of touch. All of these sensory prostheses take information from the outside world and convert it into electrical signals sent to the information-processing centers of the brain.

The opposite type of neuroprosthesis records the electrical activity of the brain and converts it into signals that control something in the outside world, such as a robotic arm, a video-game controller, or a cursor on a computer screen. That last form of control has allowed paralyzed people to type words letter by letter, with autocomplete used to speed up the process.

To type using brain signals, the implant is usually placed in the motor cortex, the part of the brain that controls movement. The user then imagines certain physical actions to move a cursor across a virtual keyboard. Another approach, introduced in a 2021 paper (Willett et al., "High-Performance Brain-to-Text Communication via Handwriting," Nature), had the user imagine writing letters by hand on a sheet of paper. This generated signals in the motor cortex that were converted into text, and the approach set a new speed record, allowing the volunteer to write about 18 words per minute.

The University of California research team chose a more ambitious approach. Instead of reading the user's intention to move a cursor or a pen, they set out to decode the intended movements of the muscles that control the larynx (the voice box), the tongue, and the lips.

Muscles involved in speech

Speech is one of the behaviors that distinguishes humans from other species.

Many animals make sounds, but only humans combine a set of sounds in countless different ways to represent the world around them. Speech is also an extremely complex motor act; some experts believe it is the most complex motor action a person performs. It is the result of a modulated air flow passing through the vocal tract. With every sound we shape the breath, creating audible vibrations in the vocal folds of the larynx and changing the shape of the lips, jaw, and tongue.

Many muscles of the vocal tract are quite unlike joint-based muscles such as those of the arms and legs, which can move in only a few prescribed ways. The muscle that controls the lips, for example, is a sphincter, while the muscles of the tongue are governed more by hydraulics: the tongue consists of a fixed volume of muscle tissue, so moving one part of it changes its shape elsewhere. The physics governing the movements of such muscles is completely different from the physics of a biceps or a hamstring.

Because so many muscles are involved, and each of them has so many degrees of freedom, there is essentially an infinite number of possible configurations. But when people speak, it turns out that they use a relatively small set of basic movements (which vary somewhat across languages). For example, when native English speakers pronounce the "d" sound, they put the tongue behind the teeth; when they pronounce the "k" sound, the back of the tongue rises to touch the roof of the mouth. Few people are aware of the precise, complex, and coordinated muscle movements needed to pronounce even the simplest word.

The research team studies the parts of the motor cortex that send movement commands to the muscles of the face, throat, mouth, and tongue. These brain areas handle several tasks: they control the muscle movements that produce speech, as well as the movements of those same muscles during swallowing, smiling, and kissing.

Studying the neural activity of these regions requires both spatial resolution on the millimeter scale and temporal resolution on the millisecond scale. Historically, noninvasive imaging systems could provide either one or the other, but not both at the same time. At the beginning of the study, scientists found very little data on how brain activity patterns are associated with even the simplest components of speech: phonemes and syllables.

Volunteers helped. At the University of California Epilepsy Center, patients preparing for surgery have electrodes surgically placed on the surface of the brain. The electrodes stay in place for several days so that clinicians can map the areas involved in their seizures. During these days of enforced inactivity, many patients volunteer for neurological research experiments that make use of the electrode recordings. With the patients' permission, the researchers studied their patterns of neural activity while they spoke words.

The hardware used in this case is called electrocorticography (ECoG). The electrodes in an ECoG system do not penetrate the brain but lie on its surface. An array can contain several hundred electrodes, each of which records the activity of thousands of neurons. So far, arrays with 256 channels have been used.
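
As a rough illustration of what such recordings look like as data, the sketch below (in Python) represents a 256-channel ECoG recording as a channels-by-samples array and cuts out a window around a speech attempt. The sampling rate, recording duration, and window length are illustrative assumptions, not parameters reported in the article.

# Minimal sketch of how a 256-channel ECoG recording might be represented.
# The 1 kHz sampling rate, 10-second duration, and window size are assumptions
# made for illustration only.
import numpy as np

n_channels = 256          # array size mentioned in the text
sampling_rate_hz = 1000   # assumed sampling rate
duration_s = 10           # assumed recording length

# Surface recordings: one voltage trace per electrode (channels x samples).
ecog = np.random.randn(n_channels, sampling_rate_hz * duration_s)

# To analyze activity around a speech attempt, cut a window around its onset.
onset_s = 4.0
start = int((onset_s - 0.5) * sampling_rate_hz)
stop = int((onset_s + 1.0) * sampling_rate_hz)
window = ecog[:, start:stop]
print(window.shape)  # (256, 1500): all channels, 1.5 seconds around the event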

The researchers first looked for patterns of cortical activity as people pronounced simple syllables. Volunteers were asked to pronounce specific sounds and words while their neural activity was recorded and their tongue and mouth movements were tracked. Sometimes the volunteers wore colored face paint and a computer-vision system was used to extract their kinematic gestures; in other cases an ultrasound machine placed under the jaw was used to image the moving tongue.

[Image: A flexible electrode array is laid on the patient's brain and picks up signals from the motor cortex. The array records the movement commands intended for the patient's vocal tract. A port fixed to the skull routes the wires to a computer system that decodes the brain signals and translates them into the words the patient wants to say; the answers then appear on a display screen.]

These systems were used to match neural patterns to the movements of the vocal tract. At first there were many open questions about the neural code. One possibility was that neural activity encoded instructions for particular muscles, and the brain essentially switched those muscles on and off, as if pressing keys on a keyboard. Another idea was that the code encoded the speed of muscle contractions. Yet another was that neural activity corresponded to coordinated patterns of muscle contractions used to produce a particular sound. (For example, to pronounce the "aaa" sound, the tongue and jaw must drop.) It turned out that there is a map of representations controlling different parts of the vocal tract, and that different areas of the brain work together so that a person can speak fluently.

The role of AI in modern neurotechnology

The neural activity and speech kinematics collected during the study are fed into a neural network, and a machine-learning algorithm then looks for patterns linking the two data sets. One can establish a link between neural activity and the speech that was produced, and then use that model to generate computer speech or text. But it was not possible to train such a network for paralyzed people, because half of the data was missing: there were neural patterns, but no information about the corresponding muscle movements.
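
As a rough sketch of how such a link can be established, the example below fits a linear (ridge) regression from neural features to tracked articulator positions. The shapes, the synthetic data, and the choice of scikit-learn's Ridge model are assumptions made for illustration; the article does not specify the actual model used at this stage.

# Hypothetical sketch: relate ECoG features to measured vocal-tract kinematics.
# The shapes, synthetic data, and ridge-regression model are illustrative
# assumptions, not the study's actual pipeline.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_samples = 5000      # time points with simultaneous neural and kinematic data
n_channels = 256      # ECoG channels, as mentioned in the text
n_articulators = 6    # e.g. lip, jaw, and tongue tracking points (assumed)

X = rng.standard_normal((n_samples, n_channels))      # neural features
Y = rng.standard_normal((n_samples, n_articulators))  # articulator positions

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2,
                                                    random_state=0)
model = Ridge(alpha=1.0).fit(X_train, Y_train)
print("held-out R^2:", model.score(X_test, Y_test))  # near zero: the data is random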

A more practical way to use machine learning was to split the task into two stages. First, a decoder translates signals from the brain into intended movements of the muscles of the vocal tract; then it translates those intended movements into synthesized speech or text.

The researchers call this a biomimetic approach because it mirrors the biological process: in the human body, neural activity is directly responsible for the movements of the vocal tract and only indirectly for the sounds produced. The great advantage of the approach comes in training the second-stage decoder, which converts muscle movements into sounds. Because the relationship between vocal-tract movements and sound is fairly universal, that decoder could be trained on large data sets collected from people who were not paralyzed.
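
Below is a toy sketch of this two-stage idea. Both stages are stand-ins (a random linear readout and a stubbed word model); the real system uses trained models whose details are not given in the article.

# Toy sketch of the two-stage ("biomimetic") decoding pipeline described above.
# Both stages are placeholders; a real system would use trained neural networks.
import numpy as np

def decode_articulation(neural_window):
    # Stage 1: map a window of neural activity (channels x time) to intended
    # vocal-tract movements (articulators x time). Placeholder: a fixed random
    # linear readout stands in for a trained decoder.
    rng = np.random.default_rng(0)
    readout = rng.standard_normal((6, neural_window.shape[0]))  # 6 articulators (assumed)
    return readout @ neural_window

def articulation_to_text(articulator_traces):
    # Stage 2: map intended movements to words. In the real approach this stage
    # can be trained on recordings from people who can speak; here it is a stub.
    return "No, I'm not thirsty."  # placeholder output echoing the article's example

neural_window = np.random.randn(256, 1500)      # 256 ECoG channels, 1.5-second window
movements = decode_articulation(neural_window)  # stage 1: brain signals -> movements
print(articulation_to_text(movements))          # stage 2: movements -> text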

Clinical trials of speech neuroprosthesis

The next big task was to bring the technology to people who could really benefit from it.

The National Institutes of Health (NIH) is funding a pilot trial that began in 2021. Two paralyzed volunteers already have implanted ECoG arrays, and more are expected in the near future. The main goal is to improve their communication, and performance is measured in words per minute. An average adult typing on a standard keyboard manages about 40 words per minute, while the fastest typists exceed 80.

Using a speech system can improve results. Human speech is much faster than typing: a native English speaker can easily pronounce 150 words per minute. The aim of the study is to help paralyzed people communicate at a rate of 100 words per minute. 

The implantation procedure is standard. The surgeon removes a small portion of the skull and gently lays a flexible ECoG array on the surface of the cerebral cortex. A small port is then fixed to the skull bone and exits through an opening in the scalp. For now, this port connects to external wires that carry the data from the electrodes, but the plan is to make the system wireless in the future.

The researchers considered using penetrating microelectrodes, which record from small populations of neurons and can therefore provide more detailed information about neural activity. But penetrating microelectrodes are not yet as reliable and safe as ECoG, and to convert neural signals into clear commands they usually require daily recalibration. Setup speed and reliability of operation are key factors in whether a neural device is practical, which is why stability is the priority when building a system for long-term use.

A study of how a volunteer's neural signals vary over time showed that the decoder works better when it draws on data patterns collected over several sessions and several days.

The scientists asked their first volunteer to try two different approaches. He started with a list of 50 everyday words, such as "hungry," "thirsty," "please," "help," and "computer." Over 48 sessions he was sometimes asked simply to imagine saying each word from the list, and sometimes asked to actually try to say it. It turned out that attempts to speak produced clearer brain signals, and these were enough to train the decoding algorithm. The volunteer could then use the words from the list to compose sentences, for example: "No, I don't want to drink."
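
To make the decoding step concrete, here is a hypothetical sketch that classifies attempted-speech trials into one of the vocabulary words with a simple logistic-regression classifier on synthetic features. The feature layout and the classifier are assumptions for illustration; the trial's actual decoder and its language model are more sophisticated.

# Hypothetical sketch: classify attempted-speech trials into vocabulary words.
# The synthetic features and logistic-regression classifier are illustrative
# assumptions; they are not the trial's actual decoding algorithm.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

vocabulary = ["hungry", "thirsty", "please", "help", "computer"]  # 5 of the 50 words
n_trials_per_word = 40
n_features = 256  # e.g. one averaged activity value per ECoG channel (assumed)

# Synthetic stand-in data: each word gets a slightly shifted mean pattern.
X = np.vstack([rng.standard_normal((n_trials_per_word, n_features)) + 0.3 * i
               for i in range(len(vocabulary))])
y = np.repeat(np.arange(len(vocabulary)), n_trials_per_word)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0, stratify=y)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
print("decoded word for one trial:", vocabulary[clf.predict(X_test[:1])[0]])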

The researchers are now trying to expand the vocabulary. Doing so will require further improvements to the current algorithms and interfaces, which are likely to come in the months and years ahead. Now that the proof of principle is established, the goal is optimization: making the system faster, more accurate and, most importantly, safer and more reliable.

The biggest breakthroughs will probably come from a better understanding of these brain systems and of how paralysis changes their activity. The researchers have already found that the neural patterns of a paralyzed person who cannot send commands to the muscles of the vocal tract are very different from those of an epilepsy patient who can. There is still a great deal to learn, but the researchers are confident they will be able to give their patients back their lost voices.

Portal "Eternal youth" http://vechnayamolodost.ru

