If you’ve ever watched a Star Wars show on Disney Plus, there’s a good chance you’re familiar with the work of tech startup Respeecher, whether you realize it or not. The Ukrainian company’s AI-powered voice cloning platform provided Mark Hamill’s de-aged vocal performance in both The Mandalorian and The Book of Boba Fett, as well as for an as-yet-unidentified character in Obi-Wan Kenobi. Lucasfilm has asked Respeecher to keep the name of that character a secret for now — and with so many franchise veterans returning for the series, there’s certainly no shortage of potential candidates.
The Hamden Journal spoke with Respeecher CEO Alex Serdiuk to better understand a process that to many fans no doubt borders on sacrilege: using technology to create entirely bespoke performances for one (or possibly even two) of the Star Wars saga’s most iconic characters. From the outset, Serdiuk emphasizes the human element behind the platform itself. “With our technology and our services, we can create a digital copy of a particular voice and enable another person to speak in this voice,” he explains. “And therefore we enable [studios] to scale voices, to age voices, and even resurrect voices for some projects.”
So, far from the mental image conjured up by terms like “artificial intelligence” and “voice cloning” (that of a sound engineer running lines of dialogue through a computer algorithm that then spits out audio files), Respeecher’s work on Star Wars is surprisingly performance-driven. While Darth Vader himself may be more machine now than man, if the Ukrainian startup is supplying his voice (and remember, we said “if”), the essence of the character’s voice is still very much flesh and blood.
“There is no AI yet, and I don’t believe it would exist, that would let us use it just on a turnkey basis to create the performance we wish to create. […] We need another human voice [to provide] input because that human voice gives all the inflections, the accent, the style of speech and pace that AI is not good at creating,” Serdiuk insists. He adds: “Our system requires a performance on input, so it can be made by the same person that is being de-aged for example, or someone else. […] It takes all the performance, all the acting from what we call a ‘source voice,’ and then we do the conversion.”
What’s more, Respeecher’s pipeline allows actors like Hamill to record different takes just like they would on an actual set, which the company’s experts can later adjust at their end based on notes from showrunners like Jon Favreau or directors like Deborah Chow.
“With studio projects and films they might record thousands of takes for each line, and that means that we would need to convert all those [into the younger voice], send them back, and maybe send different versions because we used to train different models with different setups,” he says. “And sometimes we also need to meet creative expectations so they can just direct us, ‘Can you try to make [a line reading] sound a bit more like that?’ and we would work to make it sound a bit more like they ask.”
So, is the ultimate aim to recreate a traditional performance through nontraditional means — in the case of The Mandalorian or The Book of Boba Fett, as though Hamill’s lines were somehow transmitted directly from the set of 1983’s Star Wars: Return of the Jedi? “The goal is to make it sound like it was recorded yesterday in the studio by the target voice themselves,” Serdiuk agrees.
Of course, there’s still the risk with voice cloning technology like Respeecher’s that the synthesized performance it produces will sound artificial, even if viewers aren’t entirely sure why. Serdiuk even admits that Hamill’s de-aged voice sounds “way, way better” in The Book of Boba Fett than it did in The Mandalorian, thanks to small yet significant improvements in how Respeecher’s AI model was “trained” to emulate the actor’s vocals. At the same time, the CEO is also quick to point out that while many fans complained about the visual effects used to portray the 20-something Luke Skywalker in The Mandalorian finale, few realized the Jedi Master’s voice was also synthetic until Lucasfilm spilled the beans in a Disney Gallery “making of” special several months later.
Serdiuk mentions that Hamill’s convincing de-aged vocals in The Mandalorian and The Book of Boba Fett were all the more impressive considering the quality of the legacy assets Respeecher received from Lucasfilm. “That [data] was quite old, so we had something from tapes, we have some old ADR recording, something from a video game,” Serdiuk recalls. “And the thing is you need to get this data trained in your model to enable it to produce the output quality that would fit into a modern production. In many projects which involve aging or resurrecting [performers’ voices], this might be the main challenge because lack of data and quality of data introduce additional blockers in terms of making it sound good.”
The Respeecher CEO maintains that overcoming these data-related hurdles has been worth it, now that industry heavyweights like Lucasfilm have embraced the work they do. “[We] started with the idea of building a synthetic speech [platform] on the level where it would go through sound engineers and Hollywood studios and land in big productions. So, when they accept our sound, when they say something good about the sound we were able to produce — and that’s a very complicated and heavy technical challenge, to make synthesized speech on the level where it would be indistinguishable from a real recording — in such cases, it really encourages us and helps us grow,” he says.
I put it to Serdiuk that the growing acceptance of voice cloning technology might mean that studios no longer call upon talented soundalikes to stand in for deceased actors. For example, wouldn’t Rogue One: A Star Wars Story, in which Guy Henry imitated the voice of the late Peter Cushing as Grand Moff Tarkin, be the kind of project that would automatically land on Respeecher’s desk now? Not necessarily, according to Serdiuk, who sees Respeecher’s voice cloning technology as one of several viable options at filmmakers’ disposal.
“There are always different visions of how things should happen in the industry and the fans have different thoughts about how they should happen. I wouldn’t say that [Respeecher] is very well suited to be in charge or judge [which approach is best],” he says. Serdiuk also made it clear that should Lucasfilm ever call upon Respeecher to re-create the voice of a dead actor, the company would only do so with the approval of that actor’s estate. And while Serdiuk’s response may put voice actors at ease for now, the fact that Lucasfilm is a repeat Respeecher customer suggests the startup has made serious inroads for voice cloning technology.
Indeed, Serdiuk already has a vision for Respeecher’s future that extends beyond de-aging actors’ voices, although he remains adamant that what the company has planned will expand filmmakers’ creative horizons, not shrink them. He talks about democratizing the technology so that smaller film and TV studios and video game developers can use it to stretch their budgets further. He also speaks enthusiastically about the groundbreaking health care applications of Respeecher’s platform — even citing one instance where the company is collaborating with a voice actor who has lost their voice to allow them to perform again.
Yet looking to the future doesn’t mean Serdiuk has lost sight of what it means for Respeecher to be part of a certain space opera set a long time ago in a galaxy far, far away. “It’s something special. I mean, you’re part of this story. We can fairly say that Star Wars broke quite a lot of ground for Hollywood, right? They’ve been disrupting the industry, from a technical standpoint, from the very beginning, and the way they do their movies is exceptional. So, it’s a big honor to be able to work with those people and learn from them.”