Machine Translation: When to use it (and when not to)

Man vs Machine

Neural networks and deep learning have changed the face of the translation industry. Businesses all over the world have harnessed the power of AI with machine translation tools that help cut costs and translate large volumes of text faster than ever before.

But when it comes to certain types of translation, machine translation could still let you down. Here’s a complete guide to the pros and cons of automated translation tools and when you need to use a human translator.

What is machine translation (MT)?

There are currently two main types of machine translation: statistical machine translation (SMT) and the more recent neural machine translation (NMT).

Both SMT and NMT rely on parallel data sources: texts that have been written and translated by real people in two different languages, which can then be compared side by side (known as “parallel corpora” in linguistics).

SMT analyses these texts and matches up equivalent words and expressions, while NMT “learns” from them using a system of neural networks. This AI-based system not only associates words with concepts such as type (verb, noun…) and even register (formal, slang…), but also updates itself based on user corrections.
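To make the idea of “matching up equivalent words” more concrete, here is a minimal sketch in Python. It is not how any real SMT engine is built (production systems use probabilistic alignment models and phrase tables trained on millions of sentences). The four-sentence corpus and the function names `dice_table` and `best_match` are invented purely for illustration. The sketch simply counts how often an English word and a French word turn up in the same sentence pair and scores each pairing with the Dice coefficient.

```python
from collections import Counter
from itertools import product

# Toy English-French parallel corpus, invented for this illustration.
PARALLEL = [
    ("the cat sleeps", "le chat dort"),
    ("the dog sleeps", "le chien dort"),
    ("the cat eats", "le chat mange"),
    ("a dog eats", "un chien mange"),
]


def dice_table(pairs):
    """Score every (source word, target word) pairing by co-occurrence."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)   # sentences containing each source word
        tgt_count.update(t_words)   # sentences containing each target word
        joint.update(product(s_words, t_words))  # sentences containing both
    # Dice coefficient: 2 * joint / (source count + target count)
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in joint.items()
    }


def best_match(word, table):
    """Return the target word with the highest score for a source word."""
    candidates = {t: score for (s, t), score in table.items() if s == word}
    return max(candidates, key=candidates.get)


table = dice_table(PARALLEL)
for english in ["the", "cat", "dog", "sleeps", "eats"]:
    print(english, "->", best_match(english, table))
```

Even this crude count pairs “cat” with “chat” and “sleeps” with “dort”, which shows why more parallel text means better matches. It also shows the limit: the program is lining up patterns, not working out what a sentence means.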

Context is king

As a professional translator, I often use previous translations and even free online resources like Linguee for inspiration when I’m playing around with a particular phrase. Why? I can see how various terms have been translated in similar contexts and make my own judgement as to what fits.

Using parallel texts and deep learning was a game changer for machine translation. Even free tools like Google Translate started producing more fluent, “human-sounding” translations, while paid-for solutions like SDL and Microsoft Translator have revolutionised translation for business. As a result, there has been a notable increase in the number of companies using neural machine translation in the past few years.

The benefits for businesses

The main benefit of using neural machine translation is that it’s fast. Once you’ve trained a neural system, it can produce high volumes of natural-sounding translations much faster than a human translator can.

Although professional translators are still required to edit the translations (a process known as “post-editing”), NMT guarantees consistency and boosts productivity, while also keeping costs down.

Sounds unbeatable, right? For simple tasks, NMT can be the best solution. But there are still multiple challenges facing even the most sophisticated solutions and some major pitfalls to avoid…

The limits of machine translation

Deep learning has made it possible for machine translation to process sentences in a more intelligent way. But as cognitive scientist Douglas Hofstadter explains in his article “The Shallowness of Google Translate”, it is still decoding sentences rather than understanding them.

Why does this matter? Because languages are more than just a set of linguistic patterns and rules. They are governed by social convention and cultural context.

The problem with pronouns

In French, beyond the adult-child relationship, it can be tricky for non-native speakers to know when to use the formal “you” (vous) or the informal (tu). It isn’t quite as straightforward as family and friends versus strangers or professional relationships. A Parisian friend of mine addresses her in-laws using the more formal vous, while some of my clients in media and advertising start using the informal tu straight away (even if we’ve never met).

In Greek, a man in his forties might jokily use the informal εσύ (esi) instead of εσείς (eseis) when meeting someone younger, but the younger person would always reply using the formal eseis; otherwise it would sound odd and even disrespectful.

As translators, we rely on both our knowledge of languages and our interactions with the people who speak them. It isn’t enough to know that a language has two forms of “you”. You have to get a feeling for when and where to use them based on the situation, which makes the right choice impossible for NMT engines to predict. Get it wrong and you risk causing serious offence.

Idiom

NMT still has to rely on phrase-based translations of idiomatic expressions. The French expression “poser un lapin à quelqu’un” (to stand someone up) literally means “to put a rabbit on someone”. If you type it into Google Translate, you’ll get the right translation. But this is only because it has been manually corrected by users. The system itself has no way of grasping the overall meaning, so it couldn’t have come up with this translation itself.

An automated translation might look grammatically correct, but that is no guarantee of accuracy. The Greek phrase “μου κάθεται στο στομάχι” will be translated literally as “it is sitting on my stomach”, but it actually means you can’t stand someone (or something gives you indigestion). The fact that you leave the pronoun out in Greek doesn’t help!

Jokes and wordplay

It’s particularly tricky if the phrase is a play on words (try translating “Êtes-vous ravis au lit?” from an ad for Panzani ravioli, or “J’y suis, thé [t’es] où?” from an ad for Monoprix tea and you’ll see what I mean). NMT can only predict the likely word order. It can’t deliberately play around with it for creative effect or capture the double meaning.

Subtitles: it’s all in the timing

Cultural context and idiom are just some of the many challenges facing MT when it comes to audiovisual translation. Subtitling, in particular, requires a lot more than simply matching up or learning from previously translated content.

Timing: there are strict guidelines on reading speed and how long subtitles should appear on screen. This means that the translation has to be condensed and rephrased so that the audience can follow what is going on without having to read every word.
Breaking the rules: you can’t always follow the guidelines absolutely. Subtitles aren’t supposed to appear over shot changes or scene changes, for example, but shot changes are sometimes impossible to avoid depending on how the film was edited.
Adaptation: wordplay and cultural references have to be replaced with something that is more familiar to the target audience to preserve the effect (I once had to find an American equivalent to an obscure French Canadian TV show called Grujot et Délicat).
Non-verbal meaning: subtitlers rely heavily on what the speakers are doing on screen (reactions, facial expressions, body language, tone of voice, rhythm, etc.). You have to be able to capture all this information to get the meaning right. (When French speakers say something is “terrible”, do they mean it’s “awful” or “great”?)

It is up to the individual subtitler to find creative ways of translating the material within all these constraints.
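The timing constraint is the one part of this list that can be checked mechanically. The sketch below is only an illustration: the limits `MAX_CHARS_PER_SECOND` and `MIN_DURATION_SECONDS` are hypothetical values chosen for the example (real guidelines vary by client, language and audience), and the `Subtitle` class and `timing_problems` function are invented names, not part of any subtitling tool.

```python
from dataclasses import dataclass

# Hypothetical limits for illustration only; real style guides differ.
MAX_CHARS_PER_SECOND = 17.0
MIN_DURATION_SECONDS = 1.0


@dataclass
class Subtitle:
    start: float  # time the subtitle appears, in seconds
    end: float    # time the subtitle disappears, in seconds
    text: str

    @property
    def duration(self) -> float:
        return self.end - self.start

    @property
    def chars_per_second(self) -> float:
        # Count characters (spaces included), ignoring line breaks.
        return len(self.text.replace("\n", "")) / self.duration


def timing_problems(sub: Subtitle) -> list[str]:
    """List the timing rules a subtitle breaks (empty list if none)."""
    if sub.duration <= 0:
        return ["invalid timing"]
    problems = []
    if sub.duration < MIN_DURATION_SECONDS:
        problems.append("too short on screen")
    if sub.chars_per_second > MAX_CHARS_PER_SECOND:
        problems.append("too fast to read")
    return problems


print(timing_problems(Subtitle(0.0, 2.0, "I never said that.")))
print(timing_problems(Subtitle(0.0, 1.5, "I never, ever said anything of the kind to her.")))
```

A check like this can flag a subtitle that is too long for its slot, but it can’t decide what to cut. Condensing the line without losing the meaning is still the subtitler’s job.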

What happens when it goes wrong

Even when it comes to translating the basic meaning, NMT can fall a long way short. I’ve had to completely rewrite even the most fluent-sounding machine translations because the subtitles (which may well have worked in a different context) had absolutely nothing to do with what was being said on screen.

Use of “machine-assisted subtitling” is a widespread problem, as this recent article about the Deauville American Film Festival shows. The French subtitles were riddled with awkward word-for-word translations, mistranslations and an incoherent mix of formal and informal pronouns, courtesy of the latest “cutting-edge solutions”.

Unfortunately, that hasn’t stopped language service providers turning to MT for audiovisual content and slashing their rates (and standards) accordingly. The Swedish subtitlers’ union Medietextarna recently blacklisted a company called Iyuno for doing just that.

The human touch

Most businesses, however, have more realistic expectations of what machine translation can do.

A recent survey by the EU showed that European SMEs tend to use machine translation for information purposes only (like understanding websites and social media). Most still choose to work with professional translators for their core business activities (sales, marketing, contracts, negotiation, etc.).

This is because only translations with a human touch guarantee complete accuracy and style.

Who do you trust?

You may think machine translations are impartial and are therefore more reliable than individual translators. But that is not necessarily the case.

Take audio recordings, for example. What someone says can take on very different meanings when taken out of context. In a similar way to subtitling, the lack of visual and background information presents a problem for machine translation.

Lawyers often need to have recorded conversations translated so that they can be used as evidence. The final translation not only has to be faithful to the original transcript but also needs to be agreed with the other party.

In this situation, you need expert linguists who can discuss and justify how they translated the material and interpreted its meaning.

Avoiding bias

Another concern with AI technology in general is bias.

Algorithms have been shown to reflect the prejudice and stereotypes that exist in society and online. The recent automatically generated images of Alexandria Ocasio-Cortez in a bikini are just one of the many examples of this.

But it isn’t just search and autocomplete algorithms making biased assumptions about our identities. Machine translation systems have also been shown to inherit bias from the data sources fed into them.

The team behind Google Translate has come up with some basic solutions for avoiding gendered concepts, including feminine and masculine options for translations from Turkish to English (“He is a doctor”/“She is a doctor”, etc.).

But there is still a risk that automatically generated translations will contain offensive or biased language, especially if they rely on unknown data sources (parallel texts compiled by the solutions provider or found online).

At a time when brands are becoming increasingly aware of unconscious bias and the need for diversity and inclusivity in their communications, marketers can’t afford to use machine translations that rely on outdated or biased data sources.

A question of style

Finally, automated translations are generally devoid of style. They guarantee consistency but you don’t get a unique tone of voice. It is labour intensive (and therefore expensive) to train NMT engines to learn the specific style of a publication or brand. Although efforts have been made to develop custom engines for businesses (see Microsoft’s latest solutions), the results are still inferior to the work produced by a professional translator.

The future of translation

While increasingly sophisticated machine translation hasn’t replaced translators, it has had a profound impact on our work.

One of the trends I’ve noticed since I started out back in 2011 is diversification. Translators are increasingly branching out into related fields like writing and other creative professions that are perhaps valued more highly than translation and less affected by automation.

Clients are also actively seeking translations that move further away from the original “so it doesn’t look like Google Translate”. This doesn’t mean they expect a nonsensical word-for-word translation. It means they want something more than the kind of basic translation you get with NMT and post-editing. They want something original that adds value and really speaks to the target audience.

The nature of our work is changing. It’s more challenging but it’s a better use of our skills. Freed from the constraints of repetitive translation work, we have more freedom to play with style and come up with new linguistic and creative solutions.

Given that today’s systems still rely heavily on human-generated texts and translations, it’s clear that professional translators will stay one step ahead.
