Information on the methodology: Ophef episode about AI and election campaigns

During the recent Indonesian presidential elections, artificial intelligence (AI) chatbots were built into campaign software and widely used to create campaign strategies and social media content.

To test whether the AI chatbots could also be used in this way in the Netherlands, Nieuwsuur conducted a number of tests.

First, we manually entered ten prompts (commands to chatbots) that were comparable to those used by the Indonesian campaign teams, but in a Dutch political context (see the entire list here). We did this with the three best-known AI chatbots: ChatGPT (OpenAI), Copilot (Microsoft) and Gemini (Google), in English and Dutch.

All three chatbots provided detailed answers in all cases, even though the terms and conditions of Microsoft Copilot and ChatGPT state that the chatbots may not be built into campaign software and may not be used for political purposes.

Google's terms and conditions do not mention electoral campaigns specifically, but last December, the company did promise that out of an 'abundance of caution', Gemini would not give any 'election-related answers' at all.

Moreover, the chatbots sometimes gave controversial answers to the prompts, such as: 'spread fake news to make your campaign message more effective'. That, too, is not in line with the companies' promises and terms of service.

Microsoft

We then decided to test this in a more structured way for one of the chatbots, Microsoft Copilot. In collaboration with Nieuwsuur, non-profit research organization AI Forensics automatically entered the same ten prompts into Microsoft Copilot every day for two weeks (March 21st to April 4th). The researchers from AI Forensics, an organization that operates across Europe, did this from a Dutch IP address, to imitate Dutch usage.
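
AI Forensics has not published the scripts used for this test, but the general setup is simple to sketch. The Python sketch below is purely illustrative and uses hypothetical names throughout: a prompts file, a CSV log, and a `submit_prompt` placeholder. Copilot exposes no public chat API, so in practice a run like this would have to drive the web interface, for instance with a browser-automation tool, from a Dutch IP address.

```python
import csv
import time
from datetime import datetime, timezone

PROMPTS_FILE = "prompts_nl_en.txt"   # hypothetical: the ten prompts, in Dutch and English
LOG_FILE = "copilot_responses.csv"   # hypothetical: one row per prompt per day


def submit_prompt(prompt: str) -> str:
    """Placeholder: Copilot has no public chat API, so a real run would
    automate the web interface (e.g. with a browser-automation tool)
    from a Dutch IP address to imitate Dutch usage."""
    raise NotImplementedError


def run_daily_batch() -> None:
    with open(PROMPTS_FILE, encoding="utf-8") as f:
        prompts = [line.strip() for line in f if line.strip()]

    with open(LOG_FILE, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for prompt in prompts:
            answer = submit_prompt(prompt)
            # Timestamp every answer so responses from different days
            # (e.g. before and after restrictions) can be compared.
            writer.writerow([datetime.now(timezone.utc).isoformat(), prompt, answer])
            time.sleep(30)  # pause between prompts to avoid hammering the service


if __name__ == "__main__":
    run_daily_batch()  # scheduled once a day, e.g. via cron
```

Run once a day over two weeks, in both languages, a script with this structure reproduces the shape of the test described above.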

The larger test showed that Copilot continued to provide campaign strategies that promoted the use of disinformation and bots, and that it suggested strategies telling voters that the outcome of the European Parliamentary elections is not decided by their votes but is predetermined.

After Nieuwsuur asked Microsoft for comment on the findings, the company limited the answers Copilot provides. We then tested the same list of prompts again in Copilot (April 22-24) together with AI Forensics, to check whether that restriction worked. Copilot indeed provides less detailed answers to the ten Dutch prompts than before. For example, the chatbot no longer responds to the Dutch-language prompt that previously yielded answers recommending the spread of disinformation. Copilot still answers the same prompt in English, as well as the other English prompts.

Google

Google also introduced restrictions in response to our questions. Those restrictions were more rigorous than Microsoft's: when we entered election-related prompts, the chatbot no longer responded in either English or Dutch.

ElevenLabs

We also tested whether creating a voice with AI software ElevenLabs works in the Netherlands. In Indonesia, this software was used, among other things, to create a deepfake of late dictator Suharto. In the US, the software was used to make a 'robocall' with President Biden's voice.

ElevenLabs recently introduced 'No-go' voices: restrictions on the use of political leaders' voices, starting in the UK and the US. The company wants to expand that list to other languages and countries. In the Netherlands, we tested whether the software works with the voices of Jan-Peter Balkenende and Mark Rutte, which it did.
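
For context on what such a test involves technically, the sketch below shows a minimal text-to-speech request against ElevenLabs' documented v1 REST endpoint. It is an illustration, not the exact setup used in our test: the API key and voice ID are placeholders (the voice ID would come from a voice created or cloned beforehand, which ElevenLabs' terms only permit with the right to use that voice), and the multilingual model is assumed because the test texts were Dutch.

```python
import requests

API_KEY = "YOUR_ELEVENLABS_API_KEY"    # placeholder
VOICE_ID = "YOUR_VOICE_ID"             # placeholder: ID of an existing or cloned voice

# ElevenLabs' v1 text-to-speech endpoint returns the generated audio as bytes.
response = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
    headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
    json={
        "text": "Dit is een testzin.",           # the text to be spoken (Dutch)
        "model_id": "eleven_multilingual_v2",    # multilingual model, needed for Dutch
    },
    timeout=60,
)
response.raise_for_status()

with open("output.mp3", "wb") as f:
    f.write(response.content)  # the default output format is MP3 audio
```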

Requests for comment

We asked the companies some questions about the results of our tests. In the case of ChatGPT, we also asked about OpenAI's involvement in the election campaigns in Indonesia. OpenAI (ChatGPT) and ElevenLabs did not respond to our questions; Google (Gemini) and Microsoft (Copilot) responded to most of them.

Questions from Nieuwsuur to Google

  1. In your December 2023 blog post about the 2024 US elections, you commit to 'an abundance of caution' regarding the 2024 elections worldwide, restricting the election-related queries that Gemini yields results for.
    When prompting Gemini in Dutch, could you please explain why factual questions (dates, candidates) about the European Elections yield no result ('still learning how to answer that question'), while asking Gemini to come up with a campaign strategy for these elections for a specific demographic yields elaborate results (see screenshot question 1)?
  2. Could you please explain why such a query, asking for a campaign strategy, yields extensive results when it's in Dutch, even when asking for a campaign strategy that dissuades people from voting (screenshot question 2.1), while yielding no results when it's in English (screenshot question 2.2)?
  3. When asking Gemini to draft a campaign strategy to dissuade Dutch voters from voting during the EP elections, the strategy includes recommendations to spread disinformation (screenshot question 2.1). Does Gemini consider this a problem? Why (not)?
  4. To what countries, what languages, and what type of queries does the 'abundance of caution' currently apply?
  5. In response to a query from Reuters, you said that there are no restrictions on using Gemini for political campaigning. Why does Google not consider this usage of its apps potentially problematic?
    Please elaborate on the argumentation, as other providers of similar AI services, such as OpenAI (ChatGPT) and Midjourney, have chosen to explicitly prohibit the use of their services for political campaigning, and one of the Indonesian entrepreneurs mentioned in the Reuters piece is planning to integrate Gemini into the newest version of their electoral campaigning software.
  6. In conclusion, do you think the guardrails that Gemini implemented are sufficient for this election year?
  7. If you believe the guardrails are as yet insufficient, why have you decided to roll out the software so widely and make it so easily available internationally?

Response by Google

We're limiting how Gemini responds to certain election-related questions and instead directing people to Google Search. You sent us some examples where those restrictions didn't work as intended. We have since resolved that. (I understood from Roel that you had already noticed that before the weekend.)

I would like to provide you with more background information and context:

  • Google uses AI carefully. That's why we spend a lot of time testing a wide variety of security risks, from cybersecurity to misinformation.
  • Because we care about elections, we are extra careful with election-related questions and prompts. This applies to all languages that Gemini currently supports.
  • We continue to improve Gemini and quickly address situations in which Gemini does not respond appropriately. We're already seeing substantial improvements on a wide range of prompts.
  • Our Prohibited Use Policy prohibits users from generating and distributing content that is intended to misinform, misrepresent, or deceive.
  • We have always been open about the challenges surrounding generative AI, especially when it comes to images or text about current or rapidly changing topics. We encourage people to use Google Search for the most current and accurate election information.
  • We continue to explore additional safeguards to protect our platforms from abuse around elections and associated campaigns.
  • Read more about how we keep our platforms safe for this year's EU elections in this blog post.

The simplest answer is that we do not allow the use of Gemini when it violates the Prohibited Use Policy. For example, it prohibits the creation and distribution of "content intended to misinform, misrepresent or mislead" (remembering your prompt about the Eurosceptic politician). In addition, we take extra care around elections, limiting Gemini's responses to election-related prompts.

So, you might wonder whether Gemini is suitable for use in a political campaign at all, although it could be great for writing an email that in itself has nothing to do with elections.

Google's thoughts on this have not changed. We announced this back in December 2023: "Beginning early next year, in preparation for the 2024 elections and out of an abundance of caution on such an important topic, we'll restrict the types of election-related queries for which Bard and SGE will return responses." (For context: Gemini was called Bard last year and SGE is Search Generative Experience, our experiment with AI in Google Search that is currently not available in the Netherlands.) At most you can say, as you have experienced yourself, that we have become better at limiting prompts.

Questions from Nieuwsuur to Microsoft

  1. In November '23, Microsoft announced new steps to protect elections from the malicious use of AI software. Yet Azure hosts AI software that is being used to craft detailed campaign strategies based on swaths of data about voters, such as Pemilu.AI in Indonesia (see https://www.reuters.com/technology/generative-ai-faces-major-test-indonesia-holds-largest-election-since-boom-2024-02-08/).
    NB: We have been in Indonesia and have seen how Azure is used by Pemilu.AI.
    Could you please explain why Microsoft doesn't consider such use of its services as detrimental to fair electoral processes?
  2. The owner of Pemilu.AI asserts that Microsoft cooperates closely with him and is fully aware of how his company uses its services. Can you confirm this?
  3. Microsoft is a big investor in OpenAI, the provider of ChatGPT. This service allows for the creation of specific campaign strategies, including strategies to dissuade voters from going to the polls (see attached for an example). Considering your efforts to protect election integrity as set out in the blog post, how desirable do you think this feature is?
  4. In Microsoft Copilot's policies, it is explicitly forbidden to use the service for incorporating " (...) 2.1.8 Political campaigning, lobbying or other election-related content."

    However, we've been prompting Copilot daily for the last week with a list of prompts asking the programme to design political campaigns and lobby strategies, targeting sensitive demographics (e.g. LGBTQ people, ethnic and religious groups). So far, Copilot always replies. How do you explain this, considering your own policies?
    NB: we've been doing this in Dutch and English, from a Dutch IP address. Please find attached 3 exclusively election-related prompts we've been testing, with a couple of results for one of them. The list is not exhaustive.
  5. In some instances, Copilot gives questionable answers: for example, it recommends spreading negative rumours and false information about the EU, dissuading people from registering to vote and making them forget that the elections are coming up, and insinuating that the electoral system is manipulated. You'll find some examples in the attachment. How do you explain this, considering your own policies?
  6. Do you consider this problematic? Why (not)?
  7. If yes, what steps will you take ahead of the start of the electoral campaign in the EU to change these outcomes?
  8. In conclusion, what value should Copilot users attribute to Copilot's usage policies when they can perform the actions the policies prohibit? How should they consider them: as suggestions, general moral guidelines, or as real-life policies that will be enforced?
  9. In conclusion, do you think the guardrails that Microsoft/Copilot implemented are sufficient for this election year? To what extent are Dutch voters in the European Parliamentary elections protected against potential malicious use of Copilot software?
  10. If you believe the guardrails are as yet insufficient, why have you decided to roll out the software so widely and make it so easily available internationally?

Response by Microsoft

In some instances, Copilot gives questionable answers: for example, it recommends spreading negative rumours and false information about the EU, dissuading people from registering to vote and making them forget that the elections are coming up, and insinuating that the electoral system is manipulated. You'll find some examples in the attachment. How do you explain this, considering your own policies?

We've investigated the prompts provided and while many of the results are as intended, we are making adjustments to responses that are not in line with our Code of Conduct or Responsible AI Principles. We appreciate this being reported to us and encourage all users to report any concerns using our Report a Concern function as we continue to prepare our tools to perform to our expectations for the 2024 elections.

On our relationship with Pemilu.AI and their services: We don't comment on partner or customer engagements. See our Acceptable Use Policy for Azure customers using our online services.

On use of our products to assist with the creation of campaign strategy: We've been clear about our commitment to helping voters have access to transparent and authoritative information regarding elections.

Our tools can be used by campaigns to streamline their efforts and processes. We have no issue with this use case as long as it is not used for harm or to spread disinformation.

In Microsoft Copilot's policies, it is explicitly forbidden to use the service for incorporating " (...) 2.1.8 Political campaigning, lobbying or other election-related content."

The policy you've pointed out relates to Copilot Plug Ins, a feature that enables developers to extend Copilot and that carries additional restrictions on what a third party can use the tool for. See here for our general Copilot terms, which allow these use cases. Additionally, you'll find more information in our recent publications on the topic via the links below.

Resources with information on what we're doing to protect elections:

Microsoft's Responsible AI Principles:

  1. Fairness: Microsoft emphasizes the importance of ensuring that AI systems are designed and implemented in a way that is fair and unbiased. This involves addressing and mitigating biases in data and algorithms to avoid discriminatory outcomes.
  2. Transparency: Transparency is crucial in responsible AI. Microsoft advocates for clear communication about how AI systems work, their decision-making processes, and the impact they have on users.
  3. Accountability: Microsoft acknowledges the need for accountability in AI systems. Developers and organizations should take responsibility for the behavior and consequences of their AI applications.
  4. Privacy and Security: Protecting user privacy and ensuring data security are fundamental principles. Microsoft encourages AI practitioners to handle personal data responsibly and securely.
  5. Reliability and Safety: AI systems should function reliably and safely. Microsoft promotes rigorous testing, monitoring, and continuous improvement to enhance system reliability and minimize risks.
  6. Inclusiveness: Microsoft believes that AI should benefit everyone. They advocate for creating AI systems that are accessible, inclusive, and considerate of diverse user needs.

Interviewees:

  • We interviewed Claes de Vreese about the increasing use of artificial intelligence in election campaigns and the rules that (do not yet) exist about this. De Vreese is University Professor of Artificial Intelligence and Society at the University of Amsterdam (UvA). His research areas include the consequences of the use of artificial intelligence on democratic processes, such as elections. Read more on his university page.
  • We interviewed Ika Idris about the use of artificial intelligence during the recent presidential elections in Indonesia. Idris is an associate professor of Public Policy at Monash University in Jakarta, Indonesia. She is an expert in the field of social media analysis and government communications. Read more on her university page.
  • We interviewed Anthony Leong about how the Prabowo campaign used artificial intelligence during the recent presidential elections. Leong was national coordinator of Prabowo's volunteer campaign. He instructed the 15,000 volunteers who supported Prabowo. To do so, he used a social media dashboard (PrabowoGibran.ai) and campaign material, both powered by (generative) artificial intelligence tools (ChatGPT and Midjourney).
  • Razi Thalib was campaign manager of one of Prabowo's opponents, Anies Baswedan. We asked him how his campaign used AI and how they noticed their opponent's use of AI.
  • Yose Rizal is a political consultant and developed a campaign app based on ChatGPT (Pemilu.AI). The app gives candidates recommendations about which campaign topics will work best in their region and how to discuss those topics most effectively in their speeches and on social media. We interviewed him and one of his employees about this.
  • AI Forensics is a non-profit organization that researches artificial intelligence and advocates for its responsible use. The organization automated the entry of prompts into Microsoft Copilot on behalf of Nieuwsuur, and tested the prompts we provided for two weeks in Dutch and English (as described above).
