Baidu's Text-To-Speech Tech Works 'Perfectly' With Many Accents

Deep Voice, a text-to-speech system developed by Baidu, can "perfectly" replicate a range of accents. The latest version of the software, known as Deep Voice 2, has learned from "hundreds of unique voices" and needed less than 30 minutes of data from each.

Without any direct guidance, the AI-based system managed to identify similarities between the voices and teach itself to imitate accents.

"Deep Voice 2 can learn from hundreds of voices and imitate them perfectly" - Baidu

It adapts to different accents by first building an initial model of human speech from the similarities it finds between voices. Deep Voice 2 then adjusts that model for each individual voice as needed.
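The idea of a shared model plus small per-voice adjustments can be pictured with a toy sketch. This is only an illustration under loud assumptions, not Baidu's architecture: it treats each voice as a handful of made-up pitch readings, uses the overall average as the "shared model", and stores one offset per speaker as the "tweak". The function names, speaker labels, and numbers are all hypothetical.

```python
from statistics import mean


def fit_shared_model(samples):
    """Toy 'shared model': the average over every speaker's readings.

    `samples` maps a speaker label to a list of floats (hypothetical
    pitch values in Hz). A real system would learn far richer structure.
    """
    return mean(v for values in samples.values() for v in values)


def fit_speaker_offsets(samples, shared):
    """Toy per-speaker 'tweak': how far each speaker sits from the shared value."""
    return {speaker: mean(values) - shared for speaker, values in samples.items()}


def predict(shared, offsets, speaker):
    """Shared value plus the speaker's offset; unknown speakers get the shared value."""
    return shared + offsets.get(speaker, 0.0)


# Made-up data for two hypothetical speakers.
samples = {
    "speaker_a": [110.0, 120.0],
    "speaker_b": [200.0, 210.0],
}

shared = fit_shared_model(samples)
offsets = fit_speaker_offsets(samples, shared)

print(predict(shared, offsets, "speaker_a"))  # 115.0
print(predict(shared, offsets, "speaker_b"))  # 205.0
print(predict(shared, offsets, "new_voice"))  # 160.0 (falls back to the shared value)
```

The point of the split is that most of what is learned lives in the shared part, so each additional voice only needs enough data to estimate its own small adjustment.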

You can find examples of the technology working here.
