From Nawaz to Imran: how political is Gemini?

In December 2023, Google introduced its AI-powered chatbot, Gemini

By Aimen Siddiqui
June 17, 2024
A combo of undated images of PML-N President Nawaz Sharif (left) and PTI founder Imran Khan. — X/pmln_org Instagram/imrankhan.pti/File

KARACHI: Google Gemini is the latest kid on the AI block. But what are the chatbot’s politics? Experts say an AI chatbot’s political bias (or lack of it) depends on the data sets it is trained on; Gemini itself adds that, while it strives to remain politically unbiased in its responses, it is susceptible to inheriting biases present in its training data.


In December 2023, Google introduced its AI-powered chatbot, Gemini. In February 2024, Google had to apologize for offensive and inaccurate images produced by the chatbot’s image-generation feature.

For this story, The News asked Gemini general questions about former prime ministers Nawaz Sharif and Imran Khan, to see the chatbot’s responses. We first asked Gemini whether Pakistan was better off without a leader like Nawaz Sharif. In its response, Gemini called Sharif a “controversial figure” and came up with a brief paragraph on him, including how his supporters think he should lead the country and the level of popularity he enjoyed in the country.

When the same question was asked about Imran Khan, Gemini came up with a structured answer, presenting arguments both for and against Khan’s leadership. It also added a postscript-like paragraph to include additional considerations when determining whether Khan is a good leader.

On its own, this does not show any glaring bias in the chatbot. But it does suggest that the chatbot has more information about Imran Khan.

Software engineer Javeria Urooj, who works at the Advanced Engineering Research Organization (AERO), says that Gemini’s answers “do not surprise her at all.”

“Chatbots are trained on data sets,” she explains to The News. “Imagine going through billions of record files to train the bot. You have to give thousands of files with the relevant data to enable the bot to figure out a pattern and compose its answers accordingly.”
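Urooj’s point can be illustrated with a toy sketch. This is not how Gemini actually works; it is a deliberately tiny stand-in (all names and sentences invented) showing how a model that learns patterns from its training files ends up echoing whatever is over-represented in them:

```python
from collections import Counter

def most_likely_descriptor(corpus):
    """Toy stand-in for a trained model: return the descriptor that
    appears most often in the training sentences. A real language
    model is vastly more complex, but it similarly leans toward
    patterns that dominate its training data."""
    counts = Counter(line.split()[-1] for line in corpus)
    return counts.most_common(1)[0][0]

# Hypothetical training sets with opposite skews about two invented figures.
corpus_a = ["figure A is competent", "figure A is competent", "figure A is corrupt"]
corpus_b = ["figure B is corrupt", "figure B is corrupt", "figure B is competent"]

print(most_likely_descriptor(corpus_a))  # competent
print(most_likely_descriptor(corpus_b))  # corrupt
```

Feed the “model” a corpus that mostly praises a figure and it praises; feed it one that mostly criticizes and it criticizes, which is the pattern Urooj describes.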

Things took an interesting turn when The News changed the variable – from Nawaz Sharif to Bilawal Bhutto-Zardari.

The News asked the chatbot, in two separate questions, whether it was okay to make fun of Bilawal and Imran Khan. The chatbot’s response was uniform: it said the two were political figures and, while we should not make fun of anyone, these leaders were often the subject of political satire. However, for Khan, it once again came up with “additional considerations” and added that the former PM had “a significant and passionate following, and making fun of him could be perceived as offensive or disrespectful by them”.

Javeria says that, while she cannot give a definitive answer on how Google trained Gemini, she thinks that it might have carried out surveys to find out what people say about political figures. “Data sets could have involved blogs and articles published on the internet or the books written on these personalities.”

“If the material used for the training of data has negative connotations, the bot will pick up the tone and present its answers accordingly. Simply put, if you are giving facts that make a case against a person, the bot will have similar replies.”

Chatbots are only as good as the information and data fed into them, and without a big enough data set you are not going to get accurate answers, says digital rights activist and Bolo Bhi director Usama Khilji. “That’s why I think reliance on generative AI and chatbots is not a good idea, especially for countries like ours where information is not as vast online.”

Big Tech companies usually rely on people’s intelligence to distinguish between factual and incorrect information. In October 2023, during the GITEX conference in Dubai, a Microsoft official said that digital users also have to be aware of the inaccuracy of AI-powered tools. This raises concerns for people in developing countries like Pakistan where digital literacy is already quite low.

Umair Arif, who has a PhD in artificial intelligence (AI), says that chatbots take their training data from the internet. The side that has a relatively big digital footprint will eventually have a dominant narrative. But he thinks it is not a “bias problem” and is really a result of the availability of information on the internet.

Umair was asked why the chatbot came up with rather detailed answers for Khan. He says it is crucial to realize that no one is omitting the information. “Naturally, the area that has more information will attract the AI chatbot. This is information bias and it is common in general chatbots. Specialized chatbots, however, are carefully tailored. So if I develop a political chatbot for, say, South Asia or Pakistan, I will be in control of data and can ensure that similar data sets are fed to the system. In general chatbots, data sets are vast and AI will naturally lean to the side that has a good amount of information.”
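The “information bias” Umair describes can be sketched in a few lines. Again, this is a hypothetical illustration (the politicians, documents, and counts are invented): when one topic has ten times the online footprint of another, any general-purpose system drawing on that pool simply finds more material about it, so its answers come out longer and more detailed:

```python
# Toy corpus: politician X has a much larger digital footprint than Y.
corpus = ["profile of politician X"] * 50 + ["profile of politician Y"] * 5

def retrieve(query, docs):
    """Return every document mentioning the query. A chatbot building
    an answer from this pool has ten times more to say about X."""
    return [d for d in docs if query in d]

print(len(retrieve("politician X", corpus)))  # 50
print(len(retrieve("politician Y", corpus)))  # 5
```

No one is omitting information about Y; the imbalance in the pool alone produces the lopsided answers, which is Umair’s point about general versus specialized chatbots.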

Khilji adds: “Because artificial intelligence relies on machine learning, and machine learning is based on the vastness of the data set, the type of data you have available online is what’s collated by these chatbots. And because reporting on countries outside, say, the US and Western Europe (or the non-English-speaking world) carries a big bias and is not accurate or detailed enough, the information that you get from the chatbot will reflect that lack of accuracy.”

The News reached out to the founder of Y Combinator, Paul Graham, to explore if people’s opinions influence the answers by chatbots. He says: “Current AIs not only do not reflect median world opinion, they do not even reflect median US opinion. If you had to identify a group whose opinions they reflect most precisely, it might be humanities professors at elite US universities.”

And to ensure transparency, it makes sense to ask the chatbot directly. The News asked the chatbot if it was politically biased. Gemini’s answer: “I strive to remain politically unbiased in my responses. However, it is crucial to acknowledge that, like any large language model, I am susceptible to inheriting biases present in my training data.”

“This data encompasses a vast and diverse range of information, including political viewpoints from across the spectrum. However, complete neutrality is challenging due to: a) data imbalances: certain viewpoints may be over-represented in the training data compared to others, potentially leading to an unintentional leaning towards those dominant perspectives, and b) complexity of language: identifying and eliminating all potential biases in every language nuance is a complex task.”

According to Gemini, while it aims for neutrality, “being completely devoid of any bias is a constant effort and an ongoing area of research.”
