Fifty-five years after the assassination of John F Kennedy, engineers at Edinburgh-based company CereProc have used new technology to recreate his voice, allowing us to hear the speech the president was due to give on that fatal day in Dallas.
After his unexpected and tragic death at the age of 46, on November 22nd 1963, the text of Kennedy’s intended speech was preserved for posterity and was used by researchers to create this 22-minute-long audio clip.
Alan Kelly, Executive Creative Director at ROTHCO, who came up with the concept of JFK Unsilenced, said: “I was watching a documentary about the president’s Dallas trip and I had never really thought about where he was on his way to when he was shot. I hadn’t heard about the speech and I didn’t know it existed.”
“It had obviously been written in advance, but I don’t think it had registered with many people. I looked it up online and was blown away by how prescient it is to today. By bringing his voice back to life to deliver this speech, the message is even more powerful.”
Recreating Kennedy’s Lost Speech
Commissioned by ROTHCO, the project was completed in partnership with CereProc and The Times newspaper. CereProc spent eight weeks painstakingly recreating the 2,590 words of Kennedy’s undelivered speech, which he had planned to give at the Dallas Trade Mart that day. Using machine learning and AI techniques, the team were able to map Kennedy’s unique cadence and voice in detail. By breaking his existing speeches and recordings down into small pieces of sound, the team could then stitch those pieces back together to create an entirely new speech.
Chris Pidcock, Co-founder and Chief Voice Engineer at CereProc, explained: “There are only 40/45 phonemes in English so once you’ve got that set you can generate any word in the English language. The problem is that it would not sound natural because one sound merges into the sound next to it so they’re not really independent. You really need the sounds in the context of every other sound and that makes the database big.”
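To give a rough sense of why context makes the database big, the arithmetic can be sketched in a few lines of Python. This is an illustration only: the article does not say which unit inventory CereProc uses, so the "diphone" (a sound plus one neighbour) and "triphone" (a sound plus both neighbours) counts below are assumptions, and they are upper bounds, since not every combination actually occurs in English.

```python
def context_unit_counts(num_phonemes: int) -> dict[str, int]:
    """Upper bounds on distinct units for a given phoneme inventory size."""
    return {
        "phonemes": num_phonemes,        # each sound in isolation
        "diphones": num_phonemes ** 2,   # each sound paired with one neighbour
        "triphones": num_phonemes ** 3,  # each sound with a left and right neighbour
    }


# The 40 and 45 figures come from the quote above.
for inventory_size in (40, 45):
    print(inventory_size, context_unit_counts(inventory_size))
```

Even before accounting for pitch, stress or recording quality, a set of 40 to 45 sounds grows to tens of thousands of possible context-dependent units.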
“Trying to harmonise the environment and manipulate the audio so that it ran together was quite difficult. Getting to that point is pretty challenging based on the variable audio quality, as well as the speech itself having different qualities and noise levels. One of the things we needed to do is get a very accurate transcription of the audio so that things like ‘umms’ and ‘ehs’ could be labelled and we could make sure the phonetic pieces we got were correct.”
“If you label them incorrectly you might pick the wrong piece and the whole sentence will sound wrong. We’ve been working on machine-learning models with deep neural networks to help predict the way that his intonation works so that we could build a more accurate predictor of where his speech would go and where he’d put his emphasis. We used that to improve the way that we could generate the speech output for Kennedy. That was a new thing that we had been testing but we hadn’t used in a real project before.”
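The idea of picking labelled pieces and stitching them together can be sketched as a toy version of unit selection, a general technique in concatenative speech synthesis. This is not CereProc's implementation: the `Unit` fields, the pitch-only `join_cost` and the sample data are all hypothetical, chosen to show why a correct label matters and how the smoothest sequence of pieces might be chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    label: str    # phonetic label assigned during transcription
    pitch: float  # hypothetical average pitch of the piece, in Hz
    clip_id: str  # hypothetical identifier of the source recording


def join_cost(left: Unit, right: Unit) -> float:
    """Hypothetical cost of an audible seam: here, just the pitch jump."""
    return abs(left.pitch - right.pitch)


def select_units(targets: list[str], database: list[Unit]) -> list[Unit]:
    """Pick one unit per target label so that the total join cost is minimal."""
    if not targets:
        return []
    candidates = []
    for label in targets:
        matches = [unit for unit in database if unit.label == label]
        if not matches:
            # A mislabelled or missing piece leaves nothing usable to pick.
            raise ValueError(f"no recorded unit labelled {label!r}")
        candidates.append(matches)

    # costs[j]: cheapest total join cost of any path ending at candidates[i][j]
    costs = [0.0] * len(candidates[0])
    back_pointers = []
    for i in range(1, len(candidates)):
        new_costs, pointers = [], []
        for unit in candidates[i]:
            options = [
                costs[k] + join_cost(previous, unit)
                for k, previous in enumerate(candidates[i - 1])
            ]
            best = min(range(len(options)), key=options.__getitem__)
            new_costs.append(options[best])
            pointers.append(best)
        costs = new_costs
        back_pointers.append(pointers)

    # Walk the back pointers from the cheapest final unit to recover the path.
    j = min(range(len(costs)), key=costs.__getitem__)
    path = [j]
    for pointers in reversed(back_pointers):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j] for i, j in enumerate(path)]


# Made-up database: two recordings of "h", two of "e", one "l", two of "o".
demo_database = [
    Unit("h", 110.0, "a"), Unit("h", 150.0, "b"),
    Unit("e", 148.0, "b"), Unit("e", 100.0, "c"),
    Unit("l", 145.0, "b"),
    Unit("o", 112.0, "a"), Unit("o", 140.0, "d"),
]
print([unit.clip_id for unit in select_units(["h", "e", "l", "o"], demo_database)])
```

In a real system the cost would also reflect factors such as noise level and predicted intonation, which is where the neural network models Pidcock describes would come in.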
Practical Applications for the Future
This model could soon be used in CereProc’s cloning tool to help people who are losing their voice due to illnesses such as motor neurone disease. The tool will allow individuals who are at risk of losing their ability to speak to clone their voice, requiring only three to four hours of data to run clearly. Rather than relying on synthetic computer-generated voices, they can maintain more of their own unique personality.