Ryan. Alex. Siri. These are names of robotic voices that are often programmed on electronic tablets. They also become default identities for people with speech disorders who rely on technology to communicate.
Now some speech scientists are developing customized voices to reflect the broader diversity of the people who use them. To do it, they are tapping into a vast network of volunteers who are donating their voices to share with people who cannot speak.
The effort to build an international “Human Voicebank” has attracted more than 17,000 volunteers from 110 countries, including Priyanka Pandya, a 16-year-old from Columbia, Md., who plans to spend her winter vacation recording a string of sentences into her laptop.
“To be able to give somebody the gift of voice,” said the junior at Glenelg Country School. “I think that’s really, really powerful.”
Her voice could be used to help one of more than 2 million Americans who have severe speech disorders and need help to communicate, a dilemma captured in the acclaimed new television series “Speechless,” whose main character, a teenager with cerebral palsy, relies on technology and a personal aide to act as his “voice” at school.
“Everyone has a voice,” said Rupal Patel, founder of VocaliD, the Belmont, Mass.-based start-up that launched the voice bank. “Even people who are speechless have sounds that are unique to them.”
Her company designs personalized synthetic voices by recording the unique, if limited, sounds of the user, and then blending them with a larger sample – usually six to 10 hours of recordings – from a voice donor, matched by age, gender and region.
The company is developing voices now for its first 100 customers. But researchers have been honing the technology for many years.
Tim Bunnell, head of the Speech Research Laboratory at the Alfred I. duPont Hospital for Children in Wilmington, Delaware, has engineered voices for more than 1,000 people with degenerative diseases such as amyotrophic lateral sclerosis, or ALS. Those people were usually able to record their own voices before they lost the ability to talk. He said it’s much more challenging to extrapolate what a voice should sound like for someone who can only make a few vowel sounds.
The scientists in Bunnell’s lab take the sounds and analyze them for vocal qualities, such as pitch and timbre, and record them as a batch of numbers. Then they map out the recordings from a voice donor. They merge the voices by modifying the donor voice to reflect the qualities of the user’s voice.
Bunnell said they work in particular to match the “vowel quality,” because the “color” of someone’s voice is primarily conveyed through the vowels. It’s a process of trial and error, tweaking the numbers until the result sounds right.
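As a rough illustration of what “modifying the donor voice to reflect the qualities of the user’s voice” could look like once each voice is reduced to a batch of numbers, here is a minimal Python sketch. It is not the method used by Bunnell’s lab or VocaliD. The `VoiceProfile` class, the `blend` function and the `weight` parameter are hypothetical, and formant frequencies are assumed here as a stand-in for vowel quality.

```python
from dataclasses import dataclass


@dataclass
class VoiceProfile:
    """Hypothetical numeric summary of a voice (an assumption, not a real format)."""
    pitch_hz: float        # average pitch of the voice, in hertz
    vowel_formants: dict   # vowel -> (F1, F2) in hertz, a proxy for vowel quality


def blend(donor, user, weight=0.7):
    """Shift the donor's numbers toward the user's.

    weight=0 keeps the donor unchanged; weight=1 copies the user's values.
    Vowels the user could not produce keep the donor's values.
    """
    pitch = donor.pitch_hz + weight * (user.pitch_hz - donor.pitch_hz)
    formants = {}
    for vowel, (d1, d2) in donor.vowel_formants.items():
        if vowel in user.vowel_formants:
            u1, u2 = user.vowel_formants[vowel]
            formants[vowel] = (d1 + weight * (u1 - d1), d2 + weight * (u2 - d2))
        else:
            formants[vowel] = (d1, d2)
    return VoiceProfile(pitch, formants)
```

In this sketch, `weight` is the number that would be adjusted by ear during the trial-and-error stage: a user who can make only one vowel sound still pulls that vowel, and the overall pitch, toward his or her own voice, while the donor supplies everything else.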
Patel, who is also a speech technology professor on leave from Northeastern University, worked with Bunnell as a researcher. When she decided to bring the technology out of the lab, she developed her own method for blending voices and turned to crowdsourcing to gain access to enough voices to reflect the diversity of potential customers, which could include a middle-aged man from Alabama or a little girl from Ireland.
Since she announced the voicebank two years ago, the response has been overwhelming, she said, with donors ages 6 to 91 logging on from all over the world. People donate for many reasons: Some are practicing English or working toward community service requirements. Some have throat cancer or degenerative diseases such as ALS and could eventually become voice recipients.
Some donors said they are undergoing gender transition and want to preserve their voice before it changes. Many are simply interested in a new way to volunteer.
Priyanka Pandya has gone on mission trips to South America, and she started a service organization at her school. But the aspiring biomedical engineer said she was inspired to help someone speak and also fascinated by the technology and the ability to “intertwine two voices.”
She began recording her voice last summer – using a pair of headphones and speaking into her MacBook Air as she read the prompts on her computer screen. Then she recruited family friends and members of her Girl Scout troop to do the same.
The emerging technology has challenges. Patel said she is continuing to streamline the process to build voices cost-effectively and efficiently. The $1,249 price tag puts a customized voice out of reach for many people. And for donors, the process is lengthy: It can take six to 10 hours to record the 3,500 phrases required to complete a voice sample. Only about 10 percent of the donors have made it all the way through.
Also: People’s voices change. The company is looking for donors who are willing to record their voices, and then record them again a few years later, as they get older. But some of the first customers say they are happy with the results.
John A. Gregoire, of Windham, Maine, was one of the first to receive a customized voice last December from VocaliD, then in a pilot phase. The voice came eight years after his diagnosis of ALS, and more than six years after his voice had become unintelligible to everyone except his wife and youngest son.
Since then, his wife, Linda, said she got used to hearing him speak with “Ryan,” the American-sounding male voice programmed into his iPad. When they heard his customized voice, they “reacted simultaneously,” Linda said. “It was him.”
The couple is raising money now to build a sound studio to encourage people in their community to make high-quality recordings of their voices to donate to others.
“Having a distinctive voice is like getting something back that was stolen,” John said.
(c) 2016, The Washington Post · Michael Alison Chandler