Machine translation services, including Bing Translator and Google Translate, have been found to produce unintentionally sexist translations. The problem? It’s all about the algorithms.

A case study conducted as part of the Gendered Innovations project (launched at Stanford University in 2009) has revealed that some gender-neutral phrases put into machine translators become gender-biased in translation. This includes assuming that people holding certain professional positions are female and others male.

Is gender bias unavoidable?

The study acknowledges that although machine translation is improving all the time, some errors will require non-incremental solutions to fix, as they are rooted in “fundamental technical challenges”. This is because systems including Google and Systran “massively overuse” masculine nouns and verb forms, regardless of whether the subject being referred to in the text is male or female.

Londa Schiebinger, who runs the project, gives the example of translating the phrase “a defendant was sentenced” from English to German. While the gender of the subject is not specified in the original phrase, when put through Google Translate “a defendant” becomes “ein Angeklagter”, the masculine form of the word.

However, the masculine version is not used every time. When “a nurse” was translated from English to German, the result was “eine Krankenschwester”, which is the feminine form. This assignment of the masculine to some words and the feminine to others is what has given rise to the claim that these machine translations are sexist, however unintentionally.

The author noted that the gender bias problem mainly arises when translating from or into English. This is because English is only marginally gender-inflected, while many other languages, particularly others in the Indo-European family, are strongly gender-inflected.

Schiebinger tested machine translation by running an interview she gave to the Spanish newspaper El País through both systems. On two occasions in the Google Translate output and three in the Systran output, she was referred to as male.

A matter of algorithms

The case study acknowledges that a person listening or reading can use context to determine the gender of the subject. A machine translator cannot do this, so it assigns gender based on how frequently a particular word has referred to each gender in the documents it has been trained on. So, if “a nurse” most commonly refers to a woman, the feminine version of the word becomes the default in translation.
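To make that concrete, here is a minimal sketch in Python of the frequency-based approach described above. It is not Google’s or Systran’s actual code; the counts, word lists and function names are all invented purely for illustration.

```python
# Hypothetical counts of how often each English noun was paired with a
# masculine vs. feminine translation in a parallel training corpus.
TRAINING_COUNTS = {
    "nurse": {"feminine": 9120, "masculine": 880},
    "defendant": {"masculine": 7410, "feminine": 2590},
}

# The German surface forms for each (noun, gender) pair.
GERMAN_FORMS = {
    ("nurse", "feminine"): "eine Krankenschwester",
    ("nurse", "masculine"): "ein Krankenpfleger",
    ("defendant", "masculine"): "ein Angeklagter",
    ("defendant", "feminine"): "eine Angeklagte",
}

def translate_noun(noun: str) -> str:
    """Choose the German form whose gender was most frequent in training."""
    counts = TRAINING_COUNTS[noun]
    gender = max(counts, key=counts.get)  # the most frequent gender wins
    return GERMAN_FORMS[(noun, gender)]

print(translate_noun("nurse"))      # eine Krankenschwester
print(translate_noun("defendant"))  # ein Angeklagter
```

As the sketch shows, whichever gender dominated the training data always wins, which is exactly why the output mirrors the stereotypes in those documents.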

Co.Labs spoke with engineers working on Google Translate, who confirmed that statistical patterns are used to allow the tool to determine which gender is being referred to. Should the text include the word “dice”, which is Spanish for “says”, the algorithm will not only assess how frequently the word has historically referred to a male or female speaker, but also weigh the other words in the inputted text. How often those words refer to males or females is also taken into account when deciding whether to translate “dice” as “he says” or “she says”. Unfortunately, this means the current algorithms rely heavily on stereotypes.
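The context-weighting idea the engineers describe can be sketched as a naive-Bayes-style combination of evidence from the surrounding words. Again, this is an illustrative assumption rather than Google’s actual algorithm, and every probability below is invented.

```python
import math

# Hypothetical P(word appears | speaker is male/female), estimated
# from past documents. All numbers are made up for illustration.
CONTEXT_STATS = {
    "enfermera": {"male": 0.10, "female": 0.90},  # "nurse"
    "profesora": {"male": 0.05, "female": 0.95},  # "teacher (f.)"
    "ingeniero": {"male": 0.85, "female": 0.15},  # "engineer (m.)"
}

PRIOR = {"male": 0.5, "female": 0.5}  # assume no prior preference

def render_dice(context_words: list[str]) -> str:
    """Pick 'he says' or 'she says' from the context words' statistics."""
    scores = {}
    for gender, prior in PRIOR.items():
        log_score = math.log(prior)
        for word in context_words:
            if word in CONTEXT_STATS:
                # Each context word multiplies in its own evidence
                # (added in log space to avoid underflow).
                log_score += math.log(CONTEXT_STATS[word][gender])
        scores[gender] = log_score
    best = max(scores, key=scores.get)
    return "he says" if best == "male" else "she says"

print(render_dice(["enfermera", "profesora"]))  # she says
print(render_dice(["ingeniero"]))               # he says
```

Because every piece of evidence is itself a frequency drawn from past usage, the combined decision can only ever be as unbiased as the documents it was learned from.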

Despite the machine translator getting it wrong, the text Schiebinger ran through Google Translate contained clues the machine could have used to determine her gender. For instance, her first name is recognisably a woman’s name in English, and the article also used feminine descriptive words. If a machine translator worked like the human mind, it would have picked up on these clues.

This is not the first time the unintentional gender bias of machine translation has been spotted. In January 2012, The Jane Dough website revealed that when the phrase “men are men and men should clean the house” was entered into Google Translate to be translated from English to Spanish, the tool suggested: “Did you mean: men are men and women should clean the house.” The discovery was made by an unnamed Google Translate user, but Jane Dough tested the claim and got the same result.*

The human touch

Of course, in some cases it is acceptable to use a particular gender in English. For instance, cats are often referred to as “she” while dogs are frequently “he”. Meanwhile, babies are often referred to as “she” when the gender is not specified. However, when it comes to professions, getting the gender wrong is not only inaccurate but could even cause offence.

For this reason, it is best to have a human translator at least proofread your completed document. If the text you want translated is for publication, it is advisable to have the whole thing translated by an expert, who will understand the context in a way a machine cannot. They can also bring their own knowledge of the subject the text covers to the translation.

However, even if the translation is only being used for reference purposes, it is useful to have a linguist check through it so you can be sure you have understood it correctly. Mixing up the gender of a speaker or subject could cause problems, even if you only need the information for research.

The technology behind machine translation is no doubt worthy of admiration. After all, it returns a translation within a matter of seconds. However, it’s the human touch that ensures mistakes like accidental sexism aren’t made, so always make sure your finished translation has benefitted from it.

*When Language Insight tested it, no such suggestion was made, which likely means Google has fixed the error since it was reported.