Google Translates Gender

Feb 13, 2012 by

[Thanks to David Pesetsky for inspiring this post]

The topics of Google Translate and of translating gender have been discussed in this blog before. I have argued that Google Translate neither understands language (as many journalists claim it does) nor provides adequate translations. And gender is one area of language where even human translators can either fail abysmally or show extreme creativity. So what happens when Google Translate is faced with the task of translating gender?

Let’s take, for example, some simple sentences in English, a language that has no grammatical gender system, and attempt to translate them into Russian, a language that has a three-way grammatical gender system. Its three genders are masculine, feminine, and neuter, but only the first two will be of interest to us here. The verbs in the English sentences are in the past tense; Russian past tense verbs, which are historically participles, agree with the subject in gender (and number). Since the actions in those sentences presuppose a human agent, we expect them to apply either to males or females, and the agreement in Russian to be either masculine or feminine. But which gender would Google Translate pick in each case?
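To make the agreement pattern concrete, here is a minimal sketch in Python of how one might read the gender off a regular Russian past tense form by its ending. The function name past_tense_gender is hypothetical, and the rule is only a heuristic: it assumes the input really is a past tense verb, and irregular masculine forms without the final -л (such as мог) fall through to "unknown".

```python
def past_tense_gender(verb: str) -> str:
    """Guess the agreement shown by a regular Russian past tense form.

    Heuristic only: assumes `verb` is a past tense form. Regular forms end
    in -л (masculine), -ла (feminine), -ло (neuter) or -ли (plural).
    """
    # Drop a reflexive suffix first, so that "-лась" is read as "-ла".
    for suffix in ("ся", "сь"):
        if verb.endswith(suffix):
            verb = verb[: -len(suffix)]
            break
    if verb.endswith("ла"):
        return "feminine"
    if verb.endswith("ло"):
        return "neuter"
    if verb.endswith("ли"):
        return "plural"
    if verb.endswith("л"):
        return "masculine"
    return "unknown"


if __name__ == "__main__":
    for form in ("был", "была", "пошел", "пошла"):
        print(form, "->", past_tense_gender(form))
```

A check of this kind is all that is needed to classify the translations discussed here as masculine or feminine.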

What this experiment, conducted by David Pesetsky of MIT and illustrated in the image above, reveals is that using statistical methods rather than linguistics (which is what Google Translate does) uncovers some interesting sociological stereotypes. In each pair, the first sentence gets translated with a masculine verb, and the second sentence gets a feminine verb! For example, “I worked in the factory” is translated using a masculine verb, while “I worked in the school” shows up with a feminine verb. Similarly, “I built a house” appears in the masculine, whereas “I sewed a dress” appears in the feminine. Such gender stereotypes concern not just the most typical occupations, but also preferred hobbies, typical house chores, even food consumption patterns. The sentence “I studied mathematics” gets a masculine verb, whereas “I studied art” is translated with a feminine verb. Similarly, Google Translate views ‘cleaning the garage’ as a mostly male activity and ‘cleaning the house’ as a female activity. Female partners who own dogs may disagree that ‘wash the dog’ is something done predominantly by men, but would you expect Google Translate to choose a masculine verb when translating “I washed the children”? And can you guess which form appears in the Google Translate “translations” of “I ate meat” and “I ate a salad”?

If Google Translate’s choices of verb forms are to be taken seriously, men are workaholics, while women like to spend their time on more light-hearted activities; hence, “I loved to work” is translated in the masculine and “I loved to dance” in the feminine. Men have more gravitas: they ‘laugh’, while women ‘giggle’.

But perhaps most revealingly, the sentence “I was stupid” is translated with a masculine verb (and adjective), whereas “I was ugly” gets a feminine translation. Is it because men worry more about their smarts, and women about their looks? If the men and women I know are anything to go by, the answer is “hardly”. But then stereotypes don’t have to match reality too closely, do they?
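For readers who want to see how purely statistical choices could produce such a pattern, here is a toy sketch in Python. It is not Google Translate’s actual algorithm, which is not public: the corpus, its counts, and the function name choose_gender are all invented for illustration. It simply picks whichever gender was seen most often with a given sentence and falls back on the masculine when the sentence was never seen, a default that one commenter below proposes as a possible explanation.

```python
from collections import Counter

# Hypothetical toy "parallel corpus": each entry pairs an English sentence
# with the gender of the verb in its Russian translation. These pairs are
# invented for illustration; they are NOT real Google Translate data.
TOY_CORPUS = [
    ("I worked in the factory", "masculine"),
    ("I worked in the factory", "masculine"),
    ("I worked in the factory", "feminine"),
    ("I worked in the school", "feminine"),
    ("I worked in the school", "feminine"),
    ("I worked in the school", "masculine"),
]


def choose_gender(sentence, corpus, default="masculine"):
    """Return the gender most often paired with `sentence` in `corpus`.

    If the sentence never occurs, return `default` (masculine here, an
    assumption modelling the fallback suggested in the comments).
    """
    counts = Counter(gender for text, gender in corpus if text == sentence)
    if not counts:
        return default
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    for sentence in ("I worked in the factory",
                     "I worked in the school",
                     "I went to the gynaecologist"):
        print(sentence, "->", choose_gender(sentence, TOY_CORPUS))
```

Nothing in this sketch knows anything about men, women, factories or schools; whatever stereotype comes out was already present in the counts that went in.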

 

 



  • Anonymous

    Not unexpected, considering it’s based solely on previous translations and writings, with no human input to make it more gender-neutral.

    Note, however, that even when creating rule-based systems, you often have to make some choice or another, and since masculine is more frequent in real-world text, that’s used as the default, because it has a higher chance of not being edited away in post-editing. Of course, you can try to use anaphora detection, and that sometimes helps, but when you give it a bunch of fake sentences as above, you’ve got nothing to go on. The difference with a rule-based system is that it would typically make all those sentences masculine (unless the creators tagged verbs or nouns with agent gender-bias). Translation ain’t easy, and without context, it’s pretty much pointless too.

    • http://www.pereltsvaig.com Asya Pereltsvaig

      Yes, this was precisely my point: that GT reveals (not “creates”) gender stereotypes. In a context-less experiment like this, as you note correctly, it is impossible to settle on a correct gender, but I wonder how gender is handled in more realistic text, where a human interpreter will select the correct gender… But would GT?

      • David Pesetsky

        The answer is no. Enter on the left side of the screen: “I am a man. My name is Vanya. I have a hairy chest. I studied art.” In Russian, “I studied art” still comes out feminine.

        Actually it’s more fun than that.  Enter “When I was a boy, I studied art”, and now it comes out masculine.  

        So you’re probably thinking, “Aha, with a more natural sentence, GT gets it right at last!”.  

        Alas no.  Type “When I was a girl, I studied art” — and the verb “studied” is still masculine, which is definitely not what we want.  (No idea why, by the way.)

        • http://www.pereltsvaig.com Asya Pereltsvaig

          Thank you for continuing with the experiment, David, and for sharing the results. It is really funny at times what GT comes up with, isn’t it? BTW, did you see my other posts on other such “experiments”?

        • Ivan Derzhanski

          You don’t even have to put words at the beginning.  Try this:

          , I studied art.,I studied art.. I studied art..I studied art.

          The first three are feminine; the fourth is masculine.  But:

          . I studied fashion., I studied fashion.,I studied fashion..I studied fashion.

          Now only the first is feminine; all the rest are masculine.

          • Ivan Derzhanski

            Oops. These were meant to be separate lines (no idea why they were reformatted). So “. I studied art”; “, I studied art”; “,I studied art”; “.I studied art”; and ditto with “fashion” instead of “art”.

          • http://www.pereltsvaig.com Asya Pereltsvaig

            This is most fascinating, Ivan! Thank you for sharing this!

        • Ivan Derzhanski

          Now “I studied art” is indeed feminine when translated into Russian (as well as Ukrainian, Belorussian and Croat), but “I have studied art” and “I had studied art” are masculine.  Contrariwise, “I washed the dog” is masculine, but “I have washed the dog” is feminine.

          Google Translate is bright enough to render “I am handsome” in the feminine and “I am beautiful” in the feminine.  Of course, “I am pretty” is translated as “Я довольно” (I hope Cyrillic doesn’t get garbled here).

          • http://www.pereltsvaig.com Asya Pereltsvaig

            Thank you for carrying on these experiments, Ivan! There’s just so much that doesn’t make any sense with these GT translations, that one wonders why on earth?…

            BTW, did you mean to say that “I am handsome” is translated in the masculine?

          • Ivan Derzhanski

            Er, yes, I must’ve meant that, but I shouldn’t have meant it.  Here goes: “I am handsome.” and “I am beautiful.” are both feminine (“Я красивая.”), whereas “I was handsome.” and “I was beautiful.” are differentiated by gender (“Я был красив.” and “Я была красивой.” respectively).

            I’m including the full stops at the end of the sentences, because they often make a difference, sometimes flipping the genders: “I am handsome” is “Я красивый” without a full stop, but “Я красивая.” with one.  Likewise, David’s example “I was ugly” is “Я был уродливый” without a full stop, but “Я была некрасивой.” with one.

            This is what I call pseudo-random behaviour.

          • http://www.pereltsvaig.com Asya Pereltsvaig

            I wonder if the GT people are even aware of these issues…

          • Ivan Derzhanski
          • http://www.pereltsvaig.com Asya Pereltsvaig

            This is amazing!

  • http://blog.terminologiaetc.it/ Licia

    I was intrigued by this experiment and I was curious to see if similar stereotypes might be revealed also for Italian, but both Google Translate and Bing Translator only seemed to produce ‘masculine’ results.
    In Italian, only the participle of intransitive verbs agrees with the gender of the subject, so I had to tweak most of the sample sentences, choosing examples where the Italian translations might return past participles, but even “I have been busy washing the children” and “I went to the beauty salon” produced ‘masculine’ translations.
    I eventually found a sentence where GT (but not Bing) returned a feminine verb, “I went to the gynaecologist”, “sono andata dal ginecologo” (if GT reflected real-life trends, in a few years’ time also the doctor’s gender would be feminine).

    • http://www.pereltsvaig.com Asya Pereltsvaig

      This is most fascinating, thank you for sharing it! I wonder why the difference between Russian translations (revealing stereotypes) and Italian translations (seemingly all with masculine default)? Any ideas?

    • Ivan Derzhanski

      Actually, these three sentences (“I have been busy washing the children”, “I went to the beauty salon”, “I went to the gynaecologist”) are all translated in the masculine into Russian. (The first one is translated incorrectly, as lit. ‘I have been busy with the children’s washing [machine]’, but never mind that.) Perhaps they are too long for the tell-gender vocabulary at the end to have an effect on the verb. On the other hand, in Polish the distribution of the genders comes out the same as in Italian.

      • http://www.pereltsvaig.com Asya Pereltsvaig

        I wonder why the difference between Russian and Polish?

        • Ivan Derzhanski

          Google Translate has no knowledge of any language, no knowledge of the world either, all it has is those enormous collections of supposedly parallel texts.  So its behaviour in any situation depends on what happens to occur in the texts, which may be anything.  Suppose it has “Sono andata dal ginecologo” and “Poszłam do ginekologa” readily available in its corpora, whereas the corresponding Russian sentence doesn’t appear there and has to be constructed (and in such a situation the masculine ends up being used as default simply because it’s more frequent in the corpus).

          What I’d like to know is why “I washed the dog.” (with a full stop) is Google-translated into Bulgarian as “Измих кучена.” when there is no such word form in the language as “кучена”.  You’d expect statistical machine translation to use words that are frequent; here it produces a word that doesn’t even exist!

          • http://www.pereltsvaig.com Asya Pereltsvaig

            Ah, the mysteries of Google Translate! The minute you think you have a handle on its inner workings, something absolutely ridiculous like the punctuation errors you’ve pointed out or the use of a non-existent form pops up and defies all expectations!

            Besides, I find it hard to believe that Russian texts would not have “I went to a gynecologist”!!!

          • Ivan Derzhanski

            Truth be told, I don’t find it very easy to believe either (even with the reservation that we’re talking of Russian texts with English parallels [in Google's possession]), but that’s the only explanation that I can come up with. Otherwise GT is a black box.

            Punctuation plays a part here as well: “I went to the gynaecologist” is “Я пошел к гинекологу” without a full stop and “Я пошел к врачу-гинекологу.” with one.  (So the full stop tells Google to make the sentence as complete as possible.)

            But if you want to make the verb feminine, there is a way. Simply put a comma (or one of several other punctuation marks) at the beginning of the sentence. “*I went to the gynaecologist” is “*Я пошла к гинекологу”.

          • http://www.pereltsvaig.com Asya Pereltsvaig

            Thank you for sharing these examples. They only support my original point that Google Translate relies on statistical matches that may or may not make any sense to a human translator… Like you say, GT is a black box.