It’s a long-held fear amongst the workforce of the world that they will one day be replaced by machines. Self-service checkouts have long been replacing checkout workers, the rise of self-driving cars is threatening the transport industry, and even wars are now fought by remote drones in large parts of the world.
The creatives of the world historically assumed they were above this worry, but time is making fools of us too. AIs might not be able to write or speak with a human-like tone yet, but they can easily string some words together to make an understandable sentence. Even then, many believe the human touch is something that only a proper actor can provide, but that may not be the case for much longer.
This brings us to the world of Altered AI, which promises game developers tools that can “create compelling, professional voice performances.” Its library features roughly 20 professional actors, with hundreds of generic voices to populate game worlds.
All you have to do is submit a recording of the lines and how they should be said, and hey, presto! – a voice “performance” is created. In the same manner, you can take an actor’s recordings and manipulate the tone, type, and more to create an entirely different vocal performance. The website shows us the results, where a male actor has their lines completely transformed to a feminine-sounding voice.
At this point, the ways in which this can be abused to cut actors out of the acting process should be clear.
“I never believed in human replacement,” Altered AI CEO and former Google employee Ioannis Agiomyrgiannakis tells GLHF.
This isn’t just a test either – it’s already been used in major titles. The Ascent is a cyberpunk twin-stick shooter that uses several AI voices. Even in the triple-A space, Ninja Theory – developer of Senua’s Saga: Hellblade 2 – is using this technology, although the details on how remain secret for now.
Like many potentially unseemly practices in game development, most studios aren’t keen to advertise their use of these services. It’s extremely common for NDAs to be put in place that prevent Altered AI from touting their partnerships. Most studios are usually all too happy to sing about the new and innovative tech their games use, so their desire to hide this one is telling.
“What we are making are tools that allow people to do performance by themselves,” Agiomyrgiannakis explains. “People in the gaming industry use us for prototyping. When you have a dialogue, you have a level of imagination. But when you take the dialogue to the voice actors, it comes back and doesn’t sound as dynamic as you wanted it to. So there’s a gap between how the writer imagines the dialogue, and how the dialogue plays out. We provide an intermediate step where they can prototype the dialogue and have a checkpoint before they hit the studio.”
This sounds reasonable in theory, but it doesn’t make much sense in practice. Would you have an amateur record Keanu Reeves’ scenes for him first so he can see how it should be done? Of course not – you hire an actor for what they bring to the role and how they bring your dialogue to life in a way unique to them.
“Maybe you’re familiar with line reads – saying the line the way you think you want it to sound so the actor can copy it,” says Sarah Elmaleh, who plays Gears 5’s Lizzie Carmine. “Line reads are usually an unfortunate last-ditch failure of direction, and ‘copying’ usually sounds dead on its feet. When you hire an actor, you get so much more than a sound. Great dialogue doesn’t just come from the mouth, it comes first from the heart. Mouth-to-mouth is good for CPR, bad for acting.”
“Actors hook into the environment, relationships, history, and intention, and you can hear all of those things in their delivery. You can either help guide them to those things or you can force them to reverse-engineer backward from the ‘sound’ of your read. The read does not work without some or all of those things authentically in place. Some of the most wonderful moments in a session are when an actor surprises you and maybe even themselves with a deeply motivated choice you never expected.”
Red Dead Redemption 2’s Roger Clark describes the pushback against automation and AI as a futile endeavor. He likens it to the story of King Canute attempting to stop the tide from coming in.
“I feel AI is a viable solution for some, but I’d be lying if there weren’t some concerns as to how it may impact actors and their capacity to work,” Clark says. “I feel that humanity can not be digitized. We are all experts on being human and can sniff out imitations with impressive speed and accuracy. I am interested in what AI can do, but its capacity for imitating real people is alarming, and I know we have all speculated the potential damage that could do – legally, financially, and reputably.
“I think the true test will be when AI can seamlessly ‘work’ with real actors and the audience won’t be able to tell the difference. When AI can play off of a real person and adapt and react truthfully as another actor could, then we’re in trouble. My favorite thing about performance is when an actor surprises me but remains loyal to the character, situation, and to basic human nature. People are complex, fascinating creatures and will always have the edge when it comes to reactive adaptability. When two or more actors are working together and have a genuine connection, it’s magical. Human audiences can smell authenticity a mile away — at least I sure hope so, otherwise I guess it’s time to go back to bartending. I just hope they don’t have a machine doing that as well.”
Unfortunately, Mr. Clark, bartending is no safer – robots are already serving customers across the globe.
He touches on a good point, though, as two actors bouncing off each other in a recording booth is something no current AI can replicate. Take Firewatch, for example. This story about a man who works in the solitude of nature to escape trauma became the critical darling that it did in huge part due to the collaboration between the game’s two main voice actors as they spoke over an in-game walkie-talkie.
“A human performance is at the heart of great games,” says Cissy Jones, who plays Firewatch’s Delilah. “A synthetic performance is soulless by definition, thus losing the art, the collaboration, the creative spark that comes from people working together to create the narrative and emotional immersion that gamers and broader audiences deserve. Working on Firewatch was a collaborative process every time we got in the booth, reworking lines and making each other laugh to figure out what worked – together. The reason people hire voice actors is because we bring the unexpected. We make words on a page come to life. That’s the magic.”
One way Altered does things is by text-to-speech, taking inputted text and churning out an audio file for it. This can be useful for prototyping, as traditionally someone at the studio would take time and effort recording these lines simply as placeholders. It’s a good piece of time-saving tech, but even removing humans from this step can cause studios to miss big opportunities. Going back to Ninja Theory and the original Hellblade, it was only after video editor Melina Juergens recorded placeholder lines for the game that the team realized how perfect she was for the role. Juergens’ performance has since become award-winning, making it rather ironic that Ninja Theory may be using this technology for the sequel.
During a presentation, we were shown a voice sample that Caitlyn Oenbrick Rainey – Altered AI’s in-house actor – had recorded earlier, which was then altered to sound like a 40-year-old man from the Deep South. She’s also able to convert a line into a whisper or a shout. The tech can even translate a line into a different language while retaining the original accent. It’s impressive stuff that focuses on the prosody of the performance while transferring pitch, dynamics, and energy.
This shows how Altered AI can be used beyond prototyping. This is where we see how the technology can, will, and maybe already has progressed into replacing actors. Background characters are an obvious place to use it, which would cut costs significantly for studios. The section in credits for “additional voices” may soon be a thing of the past.
In one suggested use case, main characters can be played by actual actors, while crowds and other additional characters are populated with AI. Agiomyrgiannakis claims this could lead to more work for actors, not less. He says he’s already been approached by developers who originally planned to use text bubbles for dialogue, and instead opted to use AI. And, of course, someone has to feed the AI. As I mentioned earlier in the piece, roughly 20 professional actors have already signed their voices over to the company.
“We hide them,” Agiomyrgiannakis says. “We’ve hidden them so hard that we don’t even know. We never got exposed to their names.”
If you take a look at the site, you’ll quickly see he’s not joking; these identities are so well protected you’d think them secret agents. Everyone in the listings is just named “Dennis” or “Rod” with generic avatars. Once again, it’s very telling how many of these actors, some of whom I’m told are award-winning, are afraid of the backlash from being publicly involved with Altered AI.
“Bloggers didn’t kill the newspapers,” Agiomyrgiannakis says. “YouTubers didn’t kill the TV. People just consume more nowadays.”
While this is certainly true, it’s not an apples-to-apples comparison. These background roles and “additional voices” credits are how many actors get their start in the industry. Sure, YouTube hasn’t killed the TV industry, but YouTube isn’t actively replacing jobs in TV either. It’s especially important now that more and more studios are looking to Hollywood for their big roles, so dedicated actors for games have less room to break through.
“AI works for minor game performances,” The Expanse’s Elias Toufexis explains, “but it still doesn’t work for real performance. Go watch the Boba Fett episode with Luke Skywalker in it – his whole vocal performance is AI and it’s terrible. If they need it for ‘grenade!’ and ‘get down!’ in Call of Duty-type games, it’s fine. It’s going to hurt a bunch of new voice actors, though, because that’s a window in for a lot of us.”
Horizon Zero Dawn actor Ashly Burch agrees. “I completely understand the desire for affordable VO for indie developers,” she says. “What I think a lot of people don’t know is that SAG-AFTRA (the American actors’ union) has a low-budget agreement to address this issue. It’s specifically designed so indie developers can get access to quality VO without breaking the bank.
“Artistically, you’re never going to get a truly dynamic and compelling performance from an AI. A few combat barks? Maybe. But if you’re looking for something human and nuanced and alive, AI isn’t going to cut it. Low-budget or smaller titles are where a lot of new VO folks get their start. If devs transition to AI, an entire entry point for young artists is being squeezed out.”
The team at Altered AI see things differently. Agiomyrgiannakis says he would be concerned “if the volume of NPCs was fixed,” but he points out that with fewer actors to pay, small developers can create more densely populated worlds. He argues that the same portion of a game’s budget will be spent on actors, but with the small roles provided by AIs, the paid actors will be able “to record 10 times more.”
In Agiomyrgiannakis’s vision of the future, actors who are starting out won’t be replaced. Instead, they will work for him. “Who’s going to drive the voices?” he asks. “We need actors to drive the voices.”
Altered AI doesn’t scrape from the internet as the AI art apps do – it gets new performances plugged into it, which you can then synthesize and change. But since there hasn’t been an open dialogue between the unions, the actors, and the tech creators, no one knows what these jobs will look like.
They may not provide the same benefits as an “additional voices” credit does today. In one possible future, these young actors working for AI companies could be completely anonymized and altered between recording and usage. With no credit, they have no potential for getting bigger roles, leading them to be stuck in the AI business in perpetuity. All of those major roles will just go to people who are big names in the industry. I like Nolan North and Troy Baker as much as the next guy, but we need more voices and more variety in games.
One running theme among the actors I spoke to is the idea that AI software can never capture the heart of a real human performance. I asked Agiomyrgiannakis if he thinks his AI can capture the raw emotive power of something from The Last of Us, where Joel is hugging his dying daughter.
“I believe so,” Agiomyrgiannakis replies. “If you check out our videos, you can see, for example, a video called Lincoln.”
I’ve watched the video in question, and to put it bluntly, I’m not convinced. I doubt this technology will be stealing major roles anytime soon. The video shows a scene from Lincoln intercut with the AI’s best Daniel Day-Lewis impression, and it feels like someone’s first attempt at voice acting. At 1:15 it completely mispronounces the word “blood” and all of the emotive power from the passionate shouting in the original is completely lost.
It’s not like the effects of this are still on the horizon either. Actors are concerned about the effect this tech is already having on the industry. No matter how big a game’s budget or how much money a publisher is willing to spend on getting streamers to play a game, costs will be cut at almost every opportunity. Clauses are already popping up in actors’ contracts that allow companies to use their work in whatever way they wish, forever, with no compensation for the talent. This is one of the many ways in which the games industry is woefully behind Hollywood.
“I can understand wanting to make things cheaper and easier for people when they may not have the budget, but actors have always worked with companies to find fair practices,” Marvel’s Spider-Man actor Yuri Lowenthal explains. “Underestimating the actor’s contribution can lead to exploitation, and could be avoided by starting a conversation with actors so we can make it work for everyone. As of now, I don’t think anyone from these AI companies has reached out to us as a whole, to see if we can agree on what might be fair use and fair compensation for the use of our voices, our performances.
“There is no morally sound financial shortcut here. I’ve, of late, started to catch very vague clauses in actors’ contracts that allow companies to use our performances for whatever they want in perpetuity, and maybe already have done so in order to develop this technology. In fact, I know an actor who does a lot of performance capture and voice work and she has seen her very specific movement show up in games she never even worked on, which means her data sets were either repurposed for other projects she never signed off on or, even worse, sold to other companies without her knowledge. This is a scary precedent that has already been set, and I want to start a conversation with AI companies about how we could protect actors, and again, the ecosystem of storytelling.”
The concerns go beyond financial, though. Altered AI could open actors up to having their voice or likeness attached to something they’re morally opposed to, and may not have agreed to if they knew about it. Developers have already shown that they’re not above such behavior, so it’s a legitimate problem that needs to be addressed sooner rather than later.
“Too many companies are asking actors to sign horrible contracts with zero input on the final product for an often-crappy one-time buyout,” Cissy Jones explains. “I definitely understand how tech like this can be intriguing for indie games, but if we have no guardrails as actors, our voices could end up being used for offensive materials or inappropriate casting.
“We’ve seen companies that slip in clauses that give them the rights to use recordings from a non-AI session that will be – or has already been – used to create a synthetic voice. We are finding all manner of hidden clauses, buried details, and snuck-in verbiage, enough to make your head spin. It’s incredibly disheartening because very few companies – if any – are asking for the actors’ input into what is fair.
“Everyone – union, nonunion, and everything in between – needs to be aware of what is at stake and what their rights should be, and they need to be aware of what their contracts actually say. The broader implications of this technology are frightening. It’s a path to misuse and deepfakes. There need to be protections and guardrails for all of us to prevent abuses. I think there could be a right way for this to be done but we all need a seat at the table.”
When I spoke to Benjamin Byron Davis – the actor who plays Dutch in Red Dead Redemption – about the tech, he told me a story about when he used to have a Facebook account. He had a professional headshot done for his portfolio and he ran it through a filter to make it look like a painting, then used it as his Facebook profile photo. One of his friends commented on the photo, saying it made them feel sad.
“What made him sad about it was that there was no need for the painter,” Davis says. “And in not having a need for the painter, you don’t have the need for all the experience that is required to develop as a painter. And so now we have an image that looks very much like a beautiful painting, but it is done by an algorithm or a machine. I can’t pretend to know exactly how any of these filters work. And certainly, there is artistry in creating these filters. But yeah, there is something sorrowful in being able to create this outcome without requiring the development of the tools that make training, so essential to our human experience. And from there, you then have to ask the question of what that then does to the audience’s palate. Do we cease to value art itself, and what an artist can do, and what an artist does?”
Davis and his brothers are a clear example of how unique voice talent can be. Davis has brothers who are built like him and sound similar to him, but they can’t provide a performance like he can. By the same token, he can’t play basketball as well as his brother Alex or talk about law like his lawyer brother, Joshua. Those kinds of skills have to be taught and trained over a lifetime.
“Now, again, this technology is coming whether anybody likes it or not,” he says. “There is no standing against the tide on this, it is coming. But I do think it’s important to be clear-eyed about what we may be losing.”
This kind of technology doesn’t just affect the games industry, and Altered AI is far from the only game in town. Once upon a time, I would’ve had to spend hours transcribing all of these recorded interviews myself before I even started putting this piece together, but now I can shove them into Otter AI and have a good transcription in no time. It’s not perfect, but skimming through to correct the AI’s mistakes is still a lot quicker than typing it all out myself. AI does raise genuine concerns for artists, but it can be a huge benefit to speeding up workflow, ideation, and photo bashing. While it’s exploitable, there’s nothing wrong with using AI as a base on which you create something truly transformative.
If this tech continues to expand and develop, then it may one day be a huge benefit to actors and creatives from all industries. However, they need to be a part of the process and discussion for that to happen. Actors and the unions are as crucial a cog in the machine as anyone, but the way developers are currently using this tech makes it seem like they want to cut them out entirely.
Sarah Elmaleh remembers Benedict Cumberbatch playing the dragon Smaug in The Hobbit movies and hearing his voice processed in real time, and points out how this tech isn’t the same. “That’s an actor wearing a costume,” she says. “When you imitate an actor, that’s an invocation – generally there’s an instinctive dual perception on the part of the audience of both the artist present and the artist in reference. But when you purposefully and successfully elide that duality, when you wear actors as costumes, that feels more like… a skinsuit? An uncanny and dishonest simulacrum with none of the original’s expressiveness, none of the maturity, spontaneity, and particularity of their artistic choices moment-to-moment. To me, it’s Cronenbergian.”
I reached out to SAG-AFTRA to get a closing comment and a spokesperson came back with the following:
SAG-AFTRA has contracts for video games and all other forms of voiceover that let producers and developers at nearly every budget level hire professional voice actors like those interviewed for this article. As technology continues to change the entertainment and media landscape, we will continue to create contracts that are fair and protective for performers and responsive to the needs of the companies that wish to employ them. We are also adding or negotiating language into our existing contracts that provides critical protection from misuse or unauthorized use of member’s voice or image through technology.
Protection of a performer’s digital self is a critical issue for SAG-AFTRA and our members. These new technologies offer exciting new opportunities but can also pose potential threats to performers’ livelihoods. It is crucial that performers control exploitation of their digital self, be properly compensated for its use, and be able to provide informed consent.
It is critical that performers who work with these companies understand what they are agreeing to, including whether any ethics policy or safeguard could be changed in the future, and ensure they are protected. For example, Altered AI is based in London — US-based performers need to know how it will impact them in the event of a dispute.
Among the key provisions we include in AI-related contracts are: safe storage of a performer’s recordings and data; usage limitations and the right to consent — or not — to uses of the performer’s digital double; transparency around how the digital double will be used; and appropriate compensation.
Most importantly, a SAG-AFTRA contract puts the power and expertise of the union behind the performer, both in negotiating and enforcing contracts. We have lawyers and staff who focus on digital and AI technology. We know that change is coming. SAG-AFTRA is committed to keeping our members safe from unauthorized or improper use of their voice, image or performance, regardless of the technology employed. The best way for a performer to venture into this new world is armed with a union contract.
King Canute might not have been able to push back the tide, but he never tried asking his advisors to negotiate with it.
Written by Ryan Woodrow on behalf of GLHF.