Embedded-Computing.com

Sound check: Testing uno, zwei, trois...
Q & A with Arie Deutsch, Voxonic

by Unknown, 10.09.06

Editor’s note: As the world becomes more and more connected, the need to communicate in more than one language intensifies. New York-based Voxonic, Inc. has developed patent-pending software that seamlessly translates speech into foreign languages, replicating the speaker’s actual voice. With this technology, moviegoers in France could watch Darth Vader confess, “Luc, je suis votre père,” and teenagers in Japan could hear Eminem rap in Japanese. Considering the implications this has for the consumer electronics market, we asked Arie Deutsch, Voxonic cofounder and president of entertainment, to share how Voxonic is deploying this technology. – Jennifer Hesse

ECD: How does this software work?

DEUTSCH: Voxonic is a technology that has the ability to recreate voices in any language. So what we do is take a 10-minute voice sample of the specific voice we want to recreate, called the target voice. That 10-minute sample, recorded in a vocal booth in a pristine sound environment, captures the tones they make, the way they pronounce words, and all the sounds they make within the language they speak.

After we have that, we bring in the source speaker, who will say the words we want to hear in the target voice. The source says the exact same words the target spoke during the 10 minutes. From that, our system builds a voiceprint of the target and lays it on top of the source’s words.
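
The interview does not describe Voxonic's actual algorithm, so the following is only a toy sketch of the general idea of learning from parallel recordings: when a source and a target speaker say the same words, paired feature frames can be used to fit a mapping from one voice to the other. Everything here is an assumption made for illustration: the function names `fit_affine_map` and `convert` are hypothetical, the "features" are made-up numbers, and a per-dimension straight-line fit stands in for whatever a production system would use.

```python
from statistics import mean


def fit_affine_map(source_frames, target_frames):
    """Fit y = a*x + b for each feature dimension from paired frames.

    Both arguments are lists of equal-length feature vectors, assumed
    (for this sketch) to be already time-aligned frame for frame.
    """
    if len(source_frames) != len(target_frames):
        raise ValueError("parallel recordings must be aligned frame for frame")
    params = []
    for d in range(len(source_frames[0])):
        xs = [frame[d] for frame in source_frames]
        ys = [frame[d] for frame in target_frames]
        mx, my = mean(xs), mean(ys)
        var_x = sum((x - mx) ** 2 for x in xs)
        cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        # Least-squares slope; fall back to a pure shift if x never varies.
        a = cov_xy / var_x if var_x else 1.0
        b = my - a * mx
        params.append((a, b))
    return params


def convert(frames, params):
    """Apply the fitted per-dimension mapping to new source frames."""
    return [[a * x + b for x, (a, b) in zip(frame, params)] for frame in frames]


# Made-up two-dimensional "features" for the same three frames of speech.
source = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
target = [[3.0, 4.0], [5.0, 9.0], [7.0, 14.0]]

mapping = fit_affine_map(source, target)
print(convert([[4.0, 40.0]], mapping))  # prints [[9.0, 19.0]]
```

The design choice worth noting is the reliance on parallel data: because both speakers say the same words, each source frame has a known target counterpart, which is what makes a simple fit possible. Real speech features would also need time alignment between the two recordings, which this sketch assumes has already been done.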

ECD: I noticed on your website that your program can work jointly with a text-to-speech (TTS) engine. What’s that and how does it work?

DEUTSCH: Voxonic is in the process of developing our own manual-markup TTS system where you can type words into the computer and they will come out in a desired person’s voice and not a generic computer voice. We can put a voice sample into our system and create short utterances that don’t have that much emotion in them, like voicemail prompts, short messages – things that do not have active involvement, like greetings. TTS users will be able to take the sample from the target voice, type their text into the computer, and hear it come out in the target’s voice.

ECD: Explain the history of how this software was developed.

DEUTSCH: In 1998, Fred Deutsch, CEO and founder of the company, decided to create a technology that would let foreign-language versions of movies keep the voice of the original actor. For example, Tom Cruise in “Mission Impossible,” or Harrison Ford in his movies … any of the famous stars. He thought it would be valuable for actors to speak a foreign language in their own voices. So he went and found engineers to develop this technology, now called Voxonic. We spent about five years researching and developing the software and commercialized it in late 2004.

We have software developers constantly working on upgrades; in our opinion, that work will never be finished. We will always be improving, always trying to get the next thing, whether that be TTS or another form of instantaneous translation. Whatever it is, we want to stay on top of the industry and keep making the software faster and more reliable.

ECD: How does this improve on any technology that was already available?

DEUTSCH: We take dubbing to another level. We do all the same things dubbing does, in terms of casting the voice and having actors say the foreign lines. But unlike dubbing, we take that voice sample and put it through the Voxonic software, and boom! It comes out in the desired voice.

ECD: What types of applications is Voxonic being used in right now?

DEUTSCH: Basically the applications are limitless. We’re working with the music industry to take American rappers, translate their lyrics, and transform their voices into a foreign language to give them more exposure in other parts of the world.

Also, corporate communications is a big application for us. We just did a piece with Berlitz, which is a language company. They used Voxonic to do a 15-minute electronic press kit explaining to all their clients around the world what they do and how they can help them more effectively. We took one voice and turned that voice into five different languages. Voxonic is appealing to the world because it embraces the fact that there are so many people out there who work for companies in different territories.

ECD: What about cell phone greetings and that kind of thing?

DEUTSCH: Absolutely, we can do customized voicemail greetings for people. They can have a celebrity answer their phone and say their name. It’s important to note, though, that Voxonic doesn’t do anything without the rights to the voice.

ECD: How do you think this technology will change the consumer electronics market, especially considering the wide range of application possibilities?

DEUTSCH: I think it will brand the voice further. A lot of endorsement is done with commercials and pictures. Now, with Voxonic, the spokesman can be a worldwide spokesman. Take Tiger Woods, who is a big Nike endorser. He doesn’t speak French, so if they wanted to promote Tiger in France right now, they would have to use print. Using Voxonic, we could take that endorsement and put Tiger’s voice into French.

In terms of consumer products, take video games. Video games are shipped with a set amount of audio in them. Voxonic can allow audio to be constantly updated without having to call the voice back into the studio. So with a video game like Madden ’07, Madden only has to actually go in there once and say all the lines in the game. With a 10-minute sample, Voxonic can create all the same things and more, so the audio never gets stale.

ECD: What other functions or features are your developers working on?

DEUTSCH: We are in R&D doing songs right now and are working on other applications that are more automated-response solutions, such as a smart phone. I can’t be very specific about it, but basically, it would be an interactive phone that would talk back to you in a desired person’s voice.

The biggest thing we are working on now is our TTS engine, which really will make everything go quicker and bring this product to the masses. Kids at home may be able to use it with instant messaging, so when they’re IMing a child in another country, they could type their message in English and it would come out in their own voice.

ECD: Are there any other significant projects you’re working on now or major stars that have released records using this technology?

DEUTSCH: We don’t disclose that type of information as far as celebrity names, but we are working in the music industry as we speak. We’re also talking to studios about doing movie projects and talking about doing celebrity endorsements and stuff like that.

Arie Deutsch is cofounder and president of the entertainment division at Voxonic, Inc., where he oversees all new business projects for the company. Prior to Voxonic, he worked as a music agent for artists such as James Brown, Jagged Edge, Doug E. Fresh, and Rob Base. A George Washington University alumnus, Arie continues to develop new applications for Voxonic and meets with his music and entertainment contacts to secure deals for the company. For more information, visit http://www.voxonic.com/.