(First published: January 29th, 2020. Updated: July 26th, 2023)
Implementing Artificial Intelligence (AI) in the workplace offers interesting opportunities to increase results and impact for various stakeholders. However, it also introduces ethical challenges. I find that HR practitioners still lag in their understanding of this domain, even though their role in it, as I see it, is crucial. Therefore, I have dedicated a significant portion of my talks and training programs in recent years to closing this gap (without Math and Coding, so don’t worry!). In particular, I have discussed concepts and topics that, in my opinion, enable a more informed consideration of AI solutions in the workplace.
I don’t think anyone can describe in a single lecture all the ways that AI will affect the realm of work and organizations. But I do believe that every stage of the employee lifecycle will be positively affected by AI. We will also encounter challenges, both technical and social. To face those difficulties, I call on every HR leader to start by understanding these five themes: 1) What AI is – or isn’t? 2) How accurate is AI? 3) Why is AI prone to bias? 4) How do people react to AI? 5) How do legal frameworks deal with AI? In this part, I discuss the first three themes. In part 2 of this article, I continue the conversation with the remaining themes.
What AI is – or isn’t?
If I ask you to close your eyes and imagine Artificial Intelligence (really, try that for a second), you’ll probably come up with some representation of a robot. Maybe you’ll imagine a smart machine that interacts with its environment and learns from its experience. Maybe your imagination will lead you to a futuristic robot-buddy that can offer help without specific instructions and that can use logical reasoning and visual perception to substitute for a human interlocutor. Indeed, these are all nice associations. However, they are far from what AI is in the context of the HR-tech solutions offered today.
Most HR-tech solutions are applications of Machine Learning (ML). ML refers to a collection of algorithms that learn from specific datasets and then classify or predict outcomes. So, while our imagination drifts toward AI as thinking machines that can reason and learn anything, as a human does, only better, AI in practice focuses on a specific domain and offers a specific prediction. In the context of work, these predictions may concern fit with a role or team, performance at work, and employee attrition. Since AI helps to reduce the complexity of business questions related to people, it accounts for a growing share of the HR-tech market and of venture capital investment.
How accurate is AI?
The easiest way to understand ML is to think about it as a classification challenge. AI automates classification in two major approaches: Supervised Learning, in which AI assigns new observations to existing categories based on patterns it learned from previously labeled data; and Unsupervised Learning, in which AI groups cases based on similarity, without predefined categories. But in both approaches, when you automate classification, things can go wrong and eventually mislead decision-making. This is why accuracy matters.
How can classification errors in AI be measured? Well, I promised no Math, but at this point, I must mention that the predictions or outcomes that an algorithm produces do have numeric measures of accuracy. The important point here is that, in practice, these accuracy measures are always less than perfect. In a previous article about predicting employee attrition, I demonstrate in detail some measures of accuracy. But for our discussion now, let’s focus on two terms: Sensitivity and Specificity. Sensitivity is the “true positive” rate, i.e., the proportion of cases that truly belong to a category and are correctly classified into it. Think about employees who fit in a role: we want to make sure that as many of those cases as possible are identified correctly. Specificity is the “true negative” rate, i.e., the proportion of cases that do not belong to the category and are correctly left out of it. Sensitivity and Specificity are distinct measures, each assessed separately, but combining the information from the two of them yields other accuracy measures.
Obviously, as a client or buyer of AI applications, you are not the one who should take those measures. But it is your job to be aware of accuracy issues and to expect the vendors who produce algorithms to specify their accuracy for you, just as you would expect a medicine supplier to report possible side effects. It is also your job to consider accuracy issues when you base decisions on AI algorithms, especially when people’s lives are at stake. When AI mispredicts your favorite song on Spotify, the consequences are not truly harmful. But that is certainly not the case when you rely on mispredictions at work.
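For readers who do want to see the arithmetic behind these two terms, here is a minimal sketch in Python. The function name and the ten made-up cases are assumptions for illustration only; they do not come from any real HR dataset or vendor tool.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for two lists of 0/1 labels.

    1 means "fits the role", 0 means "does not fit the role".
    Assumes both classes appear at least once in `actual`.
    """
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    false_neg = sum(1 for a, p in pairs if a == 1 and p == 0)
    true_neg = sum(1 for a, p in pairs if a == 0 and p == 0)
    false_pos = sum(1 for a, p in pairs if a == 0 and p == 1)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical data: 4 candidates who truly fit, 6 who do not.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(actual, predicted)
print(f"Sensitivity: {sens:.2f}, Specificity: {spec:.2f}")
```

In this toy example the algorithm finds three of the four candidates who fit (sensitivity of 0.75) but wrongly flags two of the six who do not (specificity of about 0.67), which shows how one measure can look better than the other.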
Why is AI prone to bias?
There are many discussions about AI that reduces human biases and, in parallel, many discussions about AI that reflects human biases. Both claims are true, since bias depends on the data that we use to create algorithms and on the way that we use those algorithms. So, how does bias occur? Let’s explore some examples.
ML starts with datasets, and bias might start in datasets too. If a dataset is limited in terms of variability, i.e., it does not cover some possible classes of cases, the resulting classification may be biased. The algorithm’s accuracy may look good when evaluated on the entire population, but it can still be biased for small subgroups. To reduce bias, it is crucial to make sure that the training dataset used for building the algorithm is diverse enough to cover both common and rare cases. For instance, think about occupational recommendations for minorities. If a training dataset includes few or no records for minorities, predictions and recommendations about those minorities may be biased. This has already happened in reality, in Amazon’s experimental recruitment automation, which was reported to discriminate by gender. Furthermore, bias may occur during the feature engineering stage, when you select which variables are included in a model. Feature engineering significantly influences a model’s prediction accuracy. However, while its impact on accuracy is easy to measure, its impact on the model’s bias is not.
Another source of bias may be the subjective evaluations that are included in datasets. I recall my experience two decades ago, when I participated as a rater in a large psychiatric study. The preparations for the fieldwork included rater training in order to calibrate the team. The head of the research, a professor of Psychiatry and the head of the Psychiatry department in one of the largest hospitals in my country, wanted to make sure that every mental symptom we encountered would be classified the same way by every rater in the team. For that reason, he decided to train the team himself. Interrater reliability is also relevant to AI that is based on labels that humans produce. In those use cases, subjective judgments must be calibrated, so that the labeled data is not affected by human perceptions and cognitive biases.
It is also worth mentioning that algorithms and humans differ in the way they react to new and surprising information. Humans are prone to anchoring bias, and their initial judgments are sometimes so entrenched that they tend to ignore or resist contradicting new information. However, when something dramatic happens, a person may completely change their impression in seconds. Algorithms can’t do that. They will continue to classify or predict consistently with their past data for quite some time, even when the new data has shifted substantially. They will catch up over time, but they can’t change their model at once.
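The gap between overall accuracy and subgroup accuracy can be sketched in a few lines of Python. The group names, the record layout, and the twenty made-up records below are assumptions chosen only to illustrate the point; they are not drawn from any real system.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for records of (group, actual, predicted)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, actual, predicted in records:
        totals[group] += 1
        if actual == predicted:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Hypothetical records: the minority group is small and poorly predicted.
records = (
    [("majority", 1, 1)] * 17      # 17 correct predictions
    + [("majority", 1, 0)]         # 1 wrong prediction
    + [("minority", 1, 1)]         # 1 correct prediction
    + [("minority", 1, 0)]         # 1 wrong prediction
)

overall = sum(1 for _, a, p in records if a == p) / len(records)
by_group = accuracy_by_group(records)

print(f"Overall accuracy: {overall:.2f}")
for group, acc in by_group.items():
    print(f"{group}: {acc:.2f}")
```

Here the overall figure of 0.90 hides the fact that the algorithm is right only half of the time for the small group, which is why asking a vendor for accuracy broken down by subgroup can be more informative than a single headline number.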
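One way to put a number on interrater agreement is Cohen's kappa, a common statistic that this article does not otherwise rely on; it compares how often two raters agree with how often they would agree by chance. The sketch below is a minimal Python version, and the two raters and their ten labels are invented for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same cases."""
    n = len(rater_a)
    # Share of cases on which the two raters gave the same label.
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        # Both raters used a single identical label for every case.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two raters judging the same ten cases.
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "no"]
rater_b = ["yes", "yes", "no", "yes", "yes", "no", "no", "no", "yes", "no"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

The two raters agree on eight of the ten cases, but since half of that agreement could arise by chance, kappa comes out at about 0.6. A low value on a sample of labeled data would suggest that the raters need more calibration before their labels are used to train an algorithm.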
Bias also occurs when we simply use predictions. The idea of a self-fulfilling prophecy is relevant to the discussion about bias in AI because sometimes, an employee or a manager can create the reality that the algorithm is supposed to predict. Think, for example, about an employee who is considered to have a high flight risk according to some algorithm. The prediction of his high probability of leaving the company might influence his manager’s behavior, which in turn will actually push the employee out, even if the prediction was an error in the first place.
These examples are only the tip of the iceberg. I believe that in the near future, People Analytics leaders who are in charge of the Procurement and Ethics of AI applications in an organization will have to go much deeper into understanding sources of bias. In part 2 of this series, I’ll discuss how people react to AI and how legal frameworks deal with AI.