Generating speech with varied rhythms and pauses makes it sound more human-like, according to an evaluation of an artificial intelligence trained on speech taken from YouTube and podcasts.
Most artificial intelligence text-to-speech systems are trained on data sets of acted speech, which can make the output sound stilted and one-dimensional. More natural speech often displays a variety of rhythms and patterns to convey different meanings and emotions.
Now, Alexander Rudnicky at Carnegie Mellon University in Pittsburgh, Pennsylvania, …
Source: www.newscientist.com