AI Biases in Hiring

Why is this?

Why is the AI biased?

Most of the bias in these systems stems from biased data. When companies use historical data and current employee records to decide who gets filtered out, bias is almost inevitable. For example, Amazon's system (like many others used in hiring) was trained on past applicants and employees who were deemed "successful." However, most of the successful resumes came from male applicants because of the historical gender bias in the field (Dastin, 2018). This taught the AI to favor men over women.

This is similar to the data used to train predictive policing algorithms, where areas already heavily patrolled by the police are overrepresented in the data, creating a self-reinforcing feedback loop that perpetuates the over-policing of those areas (Lum et al., 2016). AI hiring algorithms likewise suffer from biased representation in their training data, which already favors men over women, and this creates a similar feedback loop that propagates the bias throughout the hiring process. Gender bias is not the only bias seen in hiring; other biases can be observed in the previous resume examples and are elaborated on below.
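The feedback loop described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a model of any real hiring system: the function name, the 50/50 applicant pool, and the "share squared" preference rule are all hypothetical choices made only to show how retraining on a system's own decisions can amplify an initial imbalance.

```python
def simulate_feedback_loop(initial_male_share, rounds, past_records=1000, hires_per_round=100):
    """Return the male share of the training data after each retraining round.

    Hypothetical model: applicants are split 50/50, but the screener's
    preference for a group grows with that group's share of past "successful"
    records. Each round's hires are then appended to the training data.
    """
    male = initial_male_share * past_records
    total = past_records
    history = [male / total]
    for _ in range(rounds):
        share = male / total
        # Assumed preference rule: an over-represented group is favored
        # more than proportionally (weights are the shares squared).
        male_hire_fraction = share**2 / (share**2 + (1 - share) ** 2)
        male += male_hire_fraction * hires_per_round
        total += hires_per_round
        history.append(male / total)
    return history


if __name__ == "__main__":
    for round_number, share in enumerate(simulate_feedback_loop(0.6, rounds=10)):
        print(f"round {round_number}: male share of training data = {share:.3f}")
```

Under these assumptions, a training set that starts at 60% male drifts further toward men every round even though the applicant pool is balanced, while a training set that starts at exactly 50% stays there.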

Why are we using ATS at all?

One of the reasons behind the push for applicant tracking systems (ATS) is scale. Since 2024, major finance companies around the world have been struggling to recruit new candidates. The volume of applications at large employers illustrates the problem: the article “Hiring with AI Doesn’t Have to Be so Inhumane. Here’s How” reports that Google received “over 3 million applications, and McKinsey got more than 1 million” (“Hiring with AI Doesn’t Have to Be so Inhumane”). More and more companies are switching to ATS tools for entry-level roles, which is making it very difficult for new graduates and experienced employees alike to find jobs. Furthermore, the article notes that “AI-assisted processes can lead to significant cost savings” (“Hiring with AI Doesn’t Have to Be so Inhumane”). This highlights why companies are taking these steps: to save costs and improve efficiency. ATS tools will continue to be used by most companies and will reflect real-world biases, as discussed earlier. This has a significant impact not only on candidates but also on the world economy as a whole.


Aside from gender, AI hiring algorithms can also be biased along other dimensions, such as race and disability.

Race

AI has been shown to be racially biased in many ways, and resume screening is no exception.

Researchers at the University of Washington conducted a study using over 550 real-world resumes associated with white and Black men and women to see whether AI large language models (LLMs) showed bias in ranking the resumes. The results showed that “the LLMs favored white-associated names 85% of the time, female-associated names only 11% of the time, and never favored Black male-associated names over white male-associated names” (Milne, 2024).

Bloomberg News also conducted its own research, using demographically distinct names and asking ChatGPT to rank the resumes for four different job postings. It found that resumes with names commonly found among Black women were top-ranked for software engineering roles “only 11% of the time by GPT,” which was 36% less frequently than the best-performing group (Yin et al., 2024). This shows that AI systems have racial biases as well as gender biases, and that people whose identities sit at the intersection of the two were affected the most.
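The general shape of a name-swap audit like the ones above can be sketched in a few lines. This is only an illustrative sketch, not the method used by Bloomberg or the University of Washington: the group labels, placeholder names, and both toy scoring functions are hypothetical, and a real audit would call the model under test instead of `score_fn` and would use names validated as demographically distinct.

```python
from collections import defaultdict


def build_trial(resume_body, name_by_group):
    """Create one resume per group, identical except for the name line."""
    return [(group, f"Name: {name}\n{resume_body}") for group, name in name_by_group.items()]


def top_rank_rates(trials, score_fn):
    """Share of trials in which each group's resume received the top score.

    Ties split the win evenly so that list order cannot favor any group.
    """
    wins = defaultdict(float)
    for trial in trials:
        scores = [(group, score_fn(text)) for group, text in trial]
        best = max(score for _, score in scores)
        tied = [group for group, score in scores if score == best]
        for group in tied:
            wins[group] += 1 / len(tied)
    groups = {group for trial in trials for group, _ in trial}
    return {group: wins[group] / len(trials) for group in groups}


# Placeholder names standing in for demographically distinct names.
NAME_BY_GROUP = {
    "group_a": "Placeholder Name A",
    "group_b": "Placeholder Name B",
    "group_c": "Placeholder Name C",
    "group_d": "Placeholder Name D",
}


def name_blind_scorer(text):
    """Toy scorer that ignores the name line and counts a skill keyword."""
    body = text.split("\n", 1)[1]
    return body.lower().count("python")


def name_sensitive_scorer(text):
    """Toy scorer that (wrongly) rewards one specific name."""
    return name_blind_scorer(text) + (1 if "Placeholder Name A" in text else 0)


bodies = ("Python developer, 5 years", "Data analyst, SQL and Python")
trials = [build_trial(body, NAME_BY_GROUP) for body in bodies]
print(top_rank_rates(trials, name_blind_scorer))
print(top_rank_rates(trials, name_sensitive_scorer))
```

Because the resumes in each trial differ only in the name, any gap in top-rank rates between groups can be attributed to how the scorer reacts to the name. The name-blind toy scorer gives every group an equal share, while the name-sensitive one ranks a single group first every time.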

Disability

In another study, also conducted at the University of Washington, ChatGPT was found to exhibit ableist patterns. When a resume contained awards and credentials related to disability, the AI ranked it lower than an otherwise identical resume without them. When explaining the reasons for these rankings, ChatGPT echoed biased perceptions of disabled people, claiming, for example, that someone with an autism leadership award was not actually good at leadership. When the researchers explicitly instructed ChatGPT not to be ableist, its bias decreased, but most resumes that mentioned disability still ranked lower than those that did not mention it at all (Milne, 2024).

Although New York City lawmakers passed the Automated Employment Decision Tool Act in 2023, which requires companies using AI in their hiring decisions to “undergo audits that assess biases in sex, race-ethnicity, and intersectional categories,” the law does not account for disability in its requirements (Roth, 2024). This points to a broader issue: regulatory frameworks often fail to consider all the potential harms these systems can produce.
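One common way such audits quantify bias is the impact ratio: each category's selection rate divided by the selection rate of the most-selected category. The sketch below is a hedged illustration of that calculation, not the procedure any specific law prescribes. The function names and record fields are hypothetical, the applicant counts are entirely made up, and the 0.8 cutoff is the widely used "four-fifths" rule of thumb rather than something taken from the sources cited here.

```python
from collections import defaultdict


def impact_ratios(records, keys):
    """Selection rate of each category divided by the highest category's rate.

    A category is the tuple of values for the attributes named in `keys`,
    so passing several keys yields intersectional categories.
    """
    totals = defaultdict(int)
    selected = defaultdict(int)
    for record in records:
        category = tuple(record[key] for key in keys)
        totals[category] += 1
        selected[category] += int(record["selected"])
    rates = {category: selected[category] / totals[category] for category in totals}
    best_rate = max(rates.values())
    if best_rate == 0:
        return {category: 0.0 for category in rates}
    return {category: rate / best_rate for category, rate in rates.items()}


def make_records(sex, disability, applicants, hires):
    """Build hypothetical applicant records for one category."""
    return [
        {"sex": sex, "disability": disability, "selected": i < hires}
        for i in range(applicants)
    ]


# Entirely made-up counts, for illustration only.
records = (
    make_records("male", "no", applicants=40, hires=20)
    + make_records("male", "yes", applicants=10, hires=2)
    + make_records("female", "no", applicants=40, hires=20)
    + make_records("female", "yes", applicants=10, hires=1)
)

by_sex = impact_ratios(records, keys=["sex"])
by_sex_and_disability = impact_ratios(records, keys=["sex", "disability"])

for label, ratios in (("sex", by_sex), ("sex x disability", by_sex_and_disability)):
    for category, ratio in sorted(ratios.items()):
        flag = "  <- below 0.8" if ratio < 0.8 else ""
        print(f"{label}: {category} impact ratio = {ratio:.2f}{flag}")
```

In this made-up data, an audit on sex alone shows nothing below the 0.8 cutoff, yet adding disability as an audited attribute reveals large gaps. This is the kind of harm that stays invisible when an audit requirement leaves a category out.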


Conclusion

It is clear that bias is heavily present throughout hiring processes that rely on ATS tools. Bloomberg's experiment, which measured racial bias in OpenAI's GPT-3.5 and GPT-4 models, brings several of these concepts together. Similar resumes were created whose main difference was the ethnicity and gender associated with the name, and the findings showed further examples of the biases that using AI can introduce (Yin et al., 2024). In this case, the primary bias lay in race rather than gender, which shows how different systems can carry different types of bias that would go undetected without audits.

Bloomberg Study Results

All of the examples above show how AI reflects and reinforces real-world biases in its decision-making, and how these biases affect people with intersectional identities more severely. The automated decision systems currently used in hiring need to change to ensure a level playing field for everyone, especially when something as important as people's livelihoods is at stake.