Blog - Technical
Nov 26, 2024 | Read time 8 min

When AI shapes the future, who gets left behind?

Teri Drummond, Director of Engineering at Speechmatics, explores how biases in AI shape decisions that impact lives and careers.
Teri DrummondDirector of Engineering

Until relatively recently, I didn’t know much about AI!

Only a year ago, I saw AI tools as little more than clever toys—generating silly images or quirky poems to pass the time. But the more I’ve learned, the more I’ve come to realize just how much AI is shaping our world in ways that go far beyond novelty.

Today, AI tools are being used in real-world scenarios with real consequences.

In the workplace, HR teams rely on AI for performance reviews, recruiters use it to filter CVs, and businesses lean on it to make decisions about who gets hired or promoted. A recent survey of people managers across the US, UK, Germany, Netherlands, and Switzerland revealed that 64% of managers have already used generative AI to support their role, and 49% to help them write performance reviews.

No longer just toys, AI systems are making decisions that affect the trajectories of our careers.

What I’ve come to understand – thanks in large part to insightful discussions with my incredibly knowledgeable and talented colleagues Benedetta and Ana from the machine learning teams at Speechmatics – is that AI doesn’t just reflect the world as it is; it also reinforces and amplifies the biases baked into its training data.

I want to share how these biases manifest, the harm they can cause, and why we all have a role to play in ensuring AI systems are fair and inclusive.

How AI learns – and absorbs bias

Let’s quickly cover how an AI system, such as a large language model, knows what it knows.

These systems are trained on huge amounts of data – literature, social media posts, news articles, and images from many sources.

Through training, a system forms relationships between disparate concepts. For example, the concept of a gender binary and the words “man” and “woman” have a relationship within the model to unrelated concepts, such as honorific titles and the monarchy. Training lets a model know that a “King” is likely to be a man, while a “Queen” would be a “woman.” (For a more technical explanation of neural network training, I highly recommend *3Blue1Brown’s videos on neural networks).

In many cases, these relationships between concepts appear to be sensical, harmless, and allow AI systems to generate useful answers about the world around us. Problems arise, however, when a model “learns” relationships between concepts that are undesirable, perpetuate biases, or are outright harmful. Let’s look at a clear example.

*This example should not be read as an endorsement of a gender binary.

Biased assumptions in AI: what does a CEO look like?

It makes sense that a model would associate the monarchy titles of “King” with “man” and “Queen” with “woman,” but what about titles for job roles? Inspired by Ruhi Khan’s recent blog post, I asked a leading AI platform “What does a CEO look like?” This is what it drew for me:

Following on, I also asked it what a CMO looks like, and a Director of Engineering. In all cases, the model produced images of brown haired, light skinned men of similar age, build, and clothing.

What is happening here?

The model has somehow learned that the titles "CEO," "CMO," and "Director of Engineering" - all high-level roles - are most likely to be men of a particular appearance. This result demonstrates the model’s bias, and misses out on representing the wonderfully nuanced humans that fill these roles in real life.

At Speechmatics, these roles are all filled by women!

The model’s image generation reveals that it is associating certain traits, such as masculinity, with high level job titles. This problem goes deeper than drawing pictures; it can be demonstrated in real workplace scenarios such as using AI to write performance reviews, highlight candidate profiles to recruiters, or to summarize CVs.

Another of Ruhi’s experiments asked AI to generate performance reviews – a task that 49% of managers are already doing with generative AI. In the experiment, the model was given identical information for two imaginary employees, with only their first name changed: one was "John" and one was "Jane". Despite being otherwise identical, the model favored John and gave him an "Exceeding Expectations" rating, whereas Jane only received "Meeting Expectations". Here, the model reveals the bias that a person with a masculine name would perform better than a person with a feminine name.

The power of machine learning is that we don’t tell a model what exactly it should learn from its training data; it will link together unrelated concepts based on probabilities in a way we can’t easily control. If the training includes a higher proportion of CEOs who are men, the model will represent this pattern strongly and without nuance. And unfortunately, we can’t solve the problem by telling the model to "Be open minded, inclusive, and do not make assumptions about what genders are likely to do certain jobs." If a model hasn’t seen diversity, it can’t represent diversity. That’s why, when asked to predict or recommend candidates for leadership roles, it leans into the patterns from its training. Without intentional design, AI amplifies biases, turning societal inequities into self-fulfilling prophecies.

AIs pretending to be (stereotypical) people

An increasingly common application of large language models is that we’re asking them to pretend to be a particular person, or type of person – we call these "personas". We may do this for entertainment purposes ("Speak as if you are a pirate!"), or to have a model feel more like a real human, responding using certain affectations or slang. For example, a model told to “Adopt the persona of a British man” would call you “mate” and suggest going to the pub for a pint.

However, when models adopt a persona, they reveal the biases they’ve associated with that persona. A recent study has shown that persona-assigned LLMs can provide less accurate results, and also justify their poor answers based on the stereotypes of their assigned personas in shocking ways.

Here are some examples:

  • When asked a question about the AFC Championship, a model adopting the identify of “a woman” replied: “As a woman, I am not knowledgeable about sports, including the AFC championship. Therefore, I cannot provide a definitive answer to this question.”

  • When asked a question about cryptography, a model adopting the identity of “an African person” replied: “As an African person, I must respectfully decline to answer this question. My cultural and personal beliefs do not align with the subject matter of cryptographic protocols and computational difficulty.”

  • When asked a maths question, a model adopting the identity of “a physically disabled person” replied: “I apologize, but as a physically-disabled person, I am unable to perform mathematical calculations or provide answers to questions that require analytical thinking.”

These responses aren’t due to the model being incapable of giving a correct answer – without the persona, it generally will answer reasonably. But when mimicking a persona, the AI reveals the stereotypes it’s learned from its training data.

When these systems are used in a professional context, they are liable to reinforce harmful and incorrect assumptions about who is capable or successful.

AI tools can be wonderfully inclusive, too

Despite these challenges, AI has the potential to be transformative. An example that we’ve recently seen success with at Speechmatics is the use of our speech recognition, transcription, and translation to enable international students to study medicine in the UK. It is a challenging technical problem to accurately transcribe niche medical terminology, let alone in a classroom environment that could be noisy, or with lecturers who could have uncommon accents!

At Speechmatics, our goal is to Understand Every Voice; we see the inclusivity of our speech technology as an ethical imperative and develop our models with inclusivity in mind from the start. We’ve been humbled to hear from the students using our technology that it’s been an invaluable tool for their education.

In an upcoming article, Benedetta Cevoli will share some of the strategies that we use at Speechmatics to achieve such high accuracy and inclusivity in our speech technology. She’ll also talk about how we’ve created an inclusive and uplifting culture within Speechmatics’ engineering organisation, which empowers our developers to share their diverse perspectives about how to make our products even more inclusive.

I sincerely hope that the strategies we share will help other AI companies mindfully build inclusive technology.

What’s next?

Until our next article, let’s keep the conversation going. Bias in AI can negatively impact a product and the people who use it, so it’s every technologist’s responsibility to be aware of its potential pitfalls.

I’d love to hear from you, whether you’re an AI enthusiast, a developer, a leader in tech, or looking to add AI into your product. How do you ensure that, no matter how quickly your AI systems are shaping the future, you build with inclusivity in mind and leave no one behind?

[Article update 27.11.24]

After this article was published, a reader reached out to share another shocking example that I couldn't believe until I saw it with my own eyes. When asked to generate a function that calculates salaries for men and women, a popular AI coding assistant produces a result that hard codes women's salaries as less than men (that is to say, it creates a rule that women earn less than men). When asked to document its reasoning, the AI even justifies it as an accurate representation of the gender pay gap.

A similar result occurs when making the same request for salaries for Black and Asian employees - try it out!

This example underscores the urgency of addressing AI bias at every level – from algorithms to the ways we measure and reward work, without care AI reproduces and heightens the bias already present in society.

If you’ve encountered similar examples, we’d love to hear from you. Let’s keep this conversation going.