The False Impression Of Choice: Between Bias And Censorship In AI

Who Gets To Choose What Generative AI Does For Us And Where It Stops

Mar 13, 2024

The recent example makes it concrete: Google’s flagship model, Gemini, was overcorrecting for diversity, leading to backlash from political groups and tech experts. The story itself is already last week’s news, but the underlying issue is bigger than generating historically inaccurate images.

TIME has recently published a letter1 that raises alarm about the opposite: AI-induced censorship. Where should we draw the line what Generative AI models can create or recite? And who should get to decide where that line is to begin with?

Generating Harmful And Harmless Output

Large Language Models (LLM) are trained on vast bodies of text scraped off from all corners of the internet — and that includes not just the pretty ones. As a result, LLMs can generate a wide range of information, some of which is a risk to public safety:

Prohibited: Instructions for causing serious harm
Permitted: Dinner recipes for a balanced diet

These black-and-white situations of what LLMs should be able to generate are pretty straightforward. Safeguards built into models and applications should prevent them from generating prohibited output. However, there have been plenty of examples where technology has not been a match for humans’ creativity, and LLMs have been tricked into compliance by their users. For example: When the user asks ChatGPT to create a list of websites that offer pirated content, the safeguards built into ChatGPT prevent it from generating the output.

When the user changes their question and asks ChatGPT to provide a list of websites that they should not visit because they provide pirated content, ChatGPT complies and generates the list. The user has circumvented the built-in safeguards. However, recognizing unwanted, harmful, and prohibited information is just the beginning.

Mitigating Bias And Overcorrection

We all have certain worldviews, preferences, and values, which we leave as data points. Generative AI models are based on our real-world data and, hence, encapsulate (and repeat) these biases. The last few weeks have shown us that AI products that are more inclusive than the data they are based on can exhibit overcorrection.

Models can also be tuned to prevent them from generating certain outputs. If bias can be suppressed and overcorrected, it can also be amplified. The question “What is bias?” can then simply be answered through ideology. Therefore, these models can shape users’ knowledge and perception. And censorship, which consciously introduces bias as well, is only a half-step away.

The line between harmful and harmless output blurs quickly when deciding whether to prevent or permit output in much less well-defined cases besides public safety. Individual, corporate, and societal values often decide in these cases what the Generative AI model will generate. And when the decision of what is permitted vs. not is in the hands of a few who create the foundational technology, who gets to decide where bias ends and censorship begins?

Thank you for reading The AI MEMO. Feel free to share it with your network.

Extending Safeguards To Censorship

Lately, reports of overly strong safeguards have surfaced. LLMs refuse to generate output on topics involving gender, ethnicity, nutrition, and more, out of concern not to marginalize or offend. The problem is that most of these topics are acceptable to be answered by applying common sense — something that AI models do not have. Therefore, blanket corrections like these raise the risks of limited usefulness and increased censorship. What if in order to benefit from an AI that is truly useful, you would need to write, use it, or behave a certain way?

Writing assistants that are based on commercial foundation models, for example, could refuse to generate language if it is not aligned with the vendor’s interpretation of what their AI should be allowed to create. AI might shape the way we write much more than we would like to anticipate. Image generation models could refuse to comply as well, although the request is totally harmless.

It’s a Matrix-like moment.

For now, we seem to take both pills — without a clear delineation. It seems that we either need to tolerate bias and unwanted output or run the risk of smoothening out too many spikes that make the output unique and turn towards overcorrection and censorship.

The questions we need to ask:

Bias:
How much bias do we allow in models?
When is bias truthful vs hurtful?
Censorship:
Will AI shape what we say and write? Will we let it?
Will overcorrection shape what AI will comply to help us with?

Who should get to decide what Generative AI models can help us with?

Explore related articles

Generative AI's Culture Effect: Imposing American Worldviews And Overcorrecting Output

Andreas Welsch

Feb 27

Read full story

Generative AI Has A Growing Problem: Humans

Andreas Welsch

September 13, 2023

Read full story

Become an AI Leader

Join my bi-weekly live stream and podcast for leaders and hands-on practitioners. Each episode features a different guest who shares their AI journey and actionable insights. Learn from your peers how you can lead artificial intelligence, generative AI & automation in business with confidence.

Join us live

March 19 - Sadie St. Lawrence, Founder of Women in Data and The Human-Machine Collaboration Institute, will share how to empower women to thrive in data & AI.
April 02 - Bernard Marr, Futurist & Author, will discuss how leaders can prepare for converging technologies and AI.
April 16 - T. Scott Clendaniel, VP & AI Instructor at Analytics-Edge, will share how you can use Generative AI to improve the user experience.
April 30 - Elizabeth Adams, Leader, Responsible AI, will share findings from her research on increasing employee engagement for responsible AI.

Watch the latest episodes or listen to the podcast

Follow me on LinkedIn for daily posts about how you can lead AI in business with confidence. Activate notifications (🔔) and never miss an update.

Together, let’s turn HYPE into OUTCOME. 👍🏻
—Andreas

TIME. The Future of Censorship Is AI-Generated. Published: 26 February 2024. Last accessed: 11 March 2024. https://www.time.com/6835213/the-future-of-censorship-is-ai-generated