For the second piece in the column ‘Not just translation’ by Studio Moretto Group, I would like to take a brief look at the world of dubbing, given the uproar caused by recent news about artificial intelligence and the voices of celebrities, actors and voice actors.
Dubbing is an audiovisual language service in which the original audio is replaced or modified to make audiovisual content accessible in target countries where other languages are spoken.
The umbrella term dubbing is used to refer to all services that fall within this category, although, in reality, they can involve very different techniques and areas of use.
Before getting to the heart of the matter, it is therefore best to explain these differences, so that it is clearer which services experts expect to be more affected by the introduction of new technologies and which less so.
Specifically, dubbing refers to the following main services:
- Lip-sync dubbing, which can be based either on a script provided by the client or on the translation and adaptation of the original content. It is recommended for broadcasting (films, TV series and adverts).
- Voice-over, which in turn is an umbrella term encompassing various services, including:
- UN-style Voice-over, where the voice actor is heard over the original audio, which remains audible in the background. This technique is used for news reports, speeches, interviews and certain specific TV programmes.
- Voice-off, where the original audio is no longer audible; it is recommended for podcasts, radio adverts and documentaries.
- Lektoring, a niche service especially popular in Poland, where a single narrator reads all the translated dialogue without conveying emotion.
Having made these important distinctions, we can now address the elephant in the room: artificial intelligence in the world of audiovisual services, and in particular dubbing.
Now that it has marched into the world of language services, AI is increasingly being used in the niche area that includes dubbing, which it was at first thought would escape the flood of new technologies.
In the collective imagination, after all, it is difficult for an algorithm to replicate the many facets of the human voice in terms of expression, intonation, cadence and pronunciation; and indeed this is still the case.
What most frightened professionals in the industry, however, was the ability shown by some software to replicate the voices of original-language actors and voice actors with unexpected accuracy, albeit with obvious shortcomings compared to human dubbing.
For instance, Christian Iansante, the long-standing Italian voice actor for the character Rick in the series ‘Rick and Morty’, complained that he had seen clips from the series that he had never dubbed, in which his character’s voice had been replicated using AI and added to the Italian version. The same was noted by David Chevalier, the voice of the other lead character in the same programme.
Is human activity in the world of language services thus destined to succumb in the future and be replaced by software and algorithms, or is this just a momentary transition period, a passing fad generated by the current hype?
In my humble opinion, neither hypothesis (the worst-case and best-case scenarios for industry professionals, respectively) is correct.
On the one hand, although it is now reasonably certain that, as technology advances, AI will become an important tool in various fields and sectors, we need to define how such tools will be used: as a means to support human activity and accelerate processes, or as a direct replacement for the service provided by humans?
For example, we can look at the impact of the e-book on the publishing sector: initially publicised as a substitute for the paper book, it was later recognised as a useful tool to support the entire industry following the 2008 crisis. It was in fact adopted by publishers to expand their audience by acquiring new users attracted by this innovative medium, while, at the same time, the majority of loyal customers stuck to paper. This allowed the industry to stay afloat during this difficult period, following which there were even signs of slight growth.
The same thing could happen today with the world of language services: for instance, AI could make dubbing and subtitling accessible to users who — for reasons of cost or time — have so far refrained from buying these services for their communication.
On the other hand, it is important to understand, from an economic point of view, whether the competitive advantage of AI could lead to concrete savings in the medium to long term compared with the cost of ‘human-made’ services, or whether such savings would be merely apparent and short-lived, with prices eventually realigning to the levels currently charged for the quality of service provided by humans.
Without these savings, the whole matter would come back to the notion of quality, which we have deliberately not yet addressed.
In fact, these arguments only make sense if we take it for granted that the quality AI will be able to offer in the future can really reach the standard currently assured by humans… and that is far from a given. Indeed, at the moment, human dubbing remains of much higher quality than the state of the art in AI and, as the famous voice actor Luca Ward says, “identification and empathy cannot be reproduced artificially, simply because machines don’t have a heart.” According to his prediction, therefore, this goal is unlikely to be achieved in the future.
Personally, I rather agree with Ward, and this, in my view, remains true in all areas where AI and similar technologies are, and will be, widely used.
Undoubtedly, these tools, which are already very useful today, will help even more in the future to increase production and reduce the time it takes to deliver standard services.
For example, they will help speed up the creation of voice-overs, especially for simple, short and replicable content, such as tutorials and video courses.
Other services, especially in entertainment, where the acting and human component is characterised by strong emotionality (such as dubbing for large film productions), will always require a human performer, who may then rely on the support of artificial intelligence, especially in post-production.
The question, therefore, shifts from the notion of competition to one of mutual benefit: for some specific low-profile services, AI offers undoubted advantages in terms of time and money; human translation, on the other hand, ensures consistently high quality. However, there is more: the main advantages of human translation will always remain effective communication and depth of emotion.
So is there really any need to choose one or the other, when we can instead make the most of their respective strengths and open these services up to a greater number of users?
Do you agree with my analysis or, on the contrary, do you believe it is more likely that the scenarios described above — or even further hypotheses that I have not considered — will occur?
Let’s talk about it on LinkedIn @Luca Moretto