Google’s DeepMind, which has been developing intelligent computers, has created a way for AI to mimic human speech.
DeepMind, which was acquired by Google in 2014, developed an AI called WaveNet that mimics human speech by learning to create individual sound waves, according to its blog post. The company ran blind tests with human listeners in U.S. English and Mandarin Chinese, and found that WaveNet sounded more natural than previous or other technologies.
Current speech programs rely on recordings from a single human speaker and reuse those recordings to let the program speak, which is why they don’t sound very natural. WaveNet is different because it doesn’t necessarily rely on humans to record every single word. WaveNet is a neural network, a type of AI designed to mimic parts of how the human brain functions; however, it requires large data sets.
According to the blog post, the audio signal has to be sampled 16,000 times per second or more, and for each sample the model has to predict what the sound wave should look like based on other samples, which is challenging.
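To give a rough sense of what building audio sample by sample involves, here is a minimal Python sketch. It is not WaveNet: the names `predict_next` and `generate` are hypothetical, and a fixed two-sample rule that happens to reproduce a pure tone stands in for the trained neural network. Only the 16,000 samples-per-second rate comes from the blog post as reported above.

```python
import math

SAMPLE_RATE = 16_000  # samples per second, the figure quoted from the blog post


def predict_next(history, freq_hz=440.0):
    """Toy stand-in for a learned model: predict the next sample from the last two.

    A pure sine wave satisfies x[t] = 2*cos(w)*x[t-1] - x[t-2], so this fixed
    rule plays the role a trained network would play in a real system.
    """
    w = 2.0 * math.pi * freq_hz / SAMPLE_RATE
    return 2.0 * math.cos(w) * history[-1] - history[-2]


def generate(duration_s, freq_hz=440.0):
    """Build a waveform one sample at a time, each from the samples before it."""
    w = 2.0 * math.pi * freq_hz / SAMPLE_RATE
    samples = [0.0, math.sin(w)]  # two seed samples to start the recurrence
    total = int(duration_s * SAMPLE_RATE)
    while len(samples) < total:
        samples.append(predict_next(samples, freq_hz))
    return samples


audio = generate(1.0)
print(len(audio))  # 16000 samples for one second of audio
```

Even this toy version has to run its prediction step thousands of times to produce a single second of sound, which hints at why doing the same with a large neural network is demanding.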
This is an interesting development in AI speech. Many companies have developed speech-capable AI, including Microsoft with Cortana, Apple with Siri, and Amazon with Alexa, so an AI capable of producing more natural-sounding speech would greatly benefit users. People are also interacting with AI more and more, whether through Siri, Cortana, Google Now, or Alexa.