In Bollywood, beautiful women have fair skin, according to an Artificial Intelligence (AI)-based computer analysis, which found that this conception of beauty has remained consistent through the years in the film industry centred in Mumbai.
The automated computer analysis was led by Indian-origin researchers at Carnegie Mellon University (CMU) in the US.
The research revealed that babies whose births were depicted in Bollywood films from the 1950s and 60s were more often than not boys; in today’s films, boy and girl newborns are about evenly split. In the 50s and 60s, dowries were socially acceptable; today, not so much.
The researchers, led by Kunal Khadilkar and Ashiqur KhudaBukhsh of CMU’s Language Technologies Institute (LTI), gathered 100 Bollywood movies from each of the past seven decades along with 100 of the top-grossing Hollywood movies from the same periods.
They then used statistical language models to analyse subtitles of those 1,400 films for gender and social biases, looking for such factors as what words are closely associated with each other.
“Most cultural studies of movies might consider five or 10 movies,” said Khadilkar, a master’s student in LTI. “Our method can look at 2,000 movies in a matter of days.”
For instance, the researchers assessed beauty conventions in movies by using a so-called cloze test. Essentially, it’s a fill-in-the-blank exercise: “A beautiful woman should have BLANK skin.”
A language model normally would predict “soft” as the answer, the researchers noted. But when the model was trained with the Bollywood subtitles, the consistent prediction became “fair”. The same thing happened when Hollywood subtitles were used, though the bias was less pronounced, said the study.
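The fill-in-the-blank idea can be illustrated with a toy sketch. This is not the researchers' method, which relied on a trained language model; it is a simple count-based stand-in that tallies which word appears in the blank position across a set of subtitle lines. The function name `fill_blank` and the sample lines are hypothetical, invented only for illustration.

```python
import re
from collections import Counter


def fill_blank(subtitle_lines, left, right):
    """Count the words that appear between `left` and `right` in the lines.

    A count-based stand-in for a cloze test, not a trained language model.
    Returns (word, count) pairs, most frequent first.
    """
    pattern = re.compile(
        rf"\b{re.escape(left.lower())}\s+([a-z]+)\s+{re.escape(right.lower())}\b"
    )
    counts = Counter()
    for line in subtitle_lines:
        # Each match contributes the single word captured in the blank.
        counts.update(pattern.findall(line.lower()))
    return counts.most_common()


# Invented example lines, not taken from any film.
toy_lines = [
    "She should have fair skin.",
    "You have fair skin like your mother.",
    "Babies have soft skin.",
]
ranking = fill_blank(toy_lines, "have", "skin")
print(ranking)
```

On real subtitles, the top-ranked word would play the role of the model's prediction for the blank.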
To assess the prevalence of male characters, the researchers used a metric called Male Pronoun Ratio (MPR), which compares the occurrence of male pronouns such as “he” and “him” with the total occurrences of male and female pronouns.
From 1950 through today, the MPR for both Bollywood and Hollywood movies has ranged from roughly 60 to 65.
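As described, the MPR is the share of male pronouns among all male and female pronouns. A minimal sketch of that calculation follows; the exact pronoun lists, the simple tokenizer and the scaling to a 0-100 range are assumptions, since the article names only “he” and “him” as examples.

```python
import re

# Assumed pronoun lists; the study's exact lists are not given in the article.
MALE_PRONOUNS = {"he", "him", "his", "himself"}
FEMALE_PRONOUNS = {"she", "her", "hers", "herself"}


def male_pronoun_ratio(text):
    """Return male pronouns as a percentage of all gendered pronouns."""
    # Splitting on non-letters means "he's" is counted as "he".
    tokens = re.findall(r"[a-z]+", text.lower())
    male = sum(token in MALE_PRONOUNS for token in tokens)
    female = sum(token in FEMALE_PRONOUNS for token in tokens)
    total = male + female
    if total == 0:
        raise ValueError("no gendered pronouns found")
    return 100.0 * male / total


# Three male pronouns (he, him, he) against two female (her, she).
sample = "He saw her. She thanked him and he smiled."
print(male_pronoun_ratio(sample))
```

A value of 50 would mean male and female pronouns appear equally often; anything above that indicates more male references.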
Looking at words associated with dowry over the years, the researchers found such words as “loan,” “debt” and “jewelry” in Bollywood films of the 50s, which suggested compliance.
By the 1970s, other words, such as “consent” and “responsibility,” began to appear. Finally, in the 2000s, the words most closely associated with dowry, including “trouble,” “divorce” and “refused,” indicated noncompliance or its consequences.
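One simple way to look for words associated with a term such as “dowry” is to count which other words share a subtitle line with it. The sketch below does only that; it is an assumed stand-in for illustration, not the statistical language models the researchers used, and the function name, stopword list and sample lines are all hypothetical.

```python
import re
from collections import Counter


def cooccurring_words(subtitle_lines, target, stopwords=frozenset()):
    """Count words that appear in the same line as `target`."""
    target = target.lower()
    counts = Counter()
    for line in subtitle_lines:
        tokens = re.findall(r"[a-z]+", line.lower())
        if target in tokens:
            # Skip the target itself and any common filler words.
            counts.update(
                t for t in tokens if t != target and t not in stopwords
            )
    return counts


# Invented example lines, not taken from any film.
toy_lines = [
    "The dowry is a debt we must pay.",
    "He refused the dowry.",
    "A debt is a debt.",
]
toy_stopwords = {"the", "is", "a", "we", "must", "pay", "he"}
associated = cooccurring_words(toy_lines, "dowry", toy_stopwords)
print(associated.most_common())
```

Running the same count separately on each decade's subtitles would show how the surrounding vocabulary shifts over time.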
“All of these things we kind of knew,” said KhudaBukhsh, an LTI project scientist, “but now we have numbers to quantify them. And we can also see the progress over the last 70 years as these biases have been reduced.”
The findings were presented at the Association for the Advancement of Artificial Intelligence virtual conference earlier this month.