"With great power comes great responsibility" is an adage that Stan Lee’s Spider-Man popularized in popular comics, films, and other media. In Information Management today, there is great power in systems that leverage Artificial Intelligence for quickly making decisions and processing massive volumes of information. As information managers, we have a responsibility to understand that these systems have a potential for bias and how to identify and mitigate risks associated with producing reliable and ethical systems.
Several years ago, I attended IBM’s flagship conference, THINK, and sat in a session led by a panel of women working in Artificial Intelligence (AI) system development. The panel discussed how some of the earliest applications of AI had produced biased outcomes. Examples and case studies showed how training data sets were skewed because they contained primarily men or people with lighter skin tones. There were also examples of models that produced disparate results for specific classes of people, further harming those who were already vulnerable. The women on that panel called for a new mandate for digital ethics to ensure that AI models were fair and just. At the time, AI systems were on the rise, but the adoption of machine learning (ML) and AI into everyday practice was still in its early stages.
Over the last few years, we’ve seen a dramatic rise in AI/ML models being developed, used, and sold within the Information Management industry. Five years ago, conversations in the industry centered on digitization and automation. AI, or more accurately ML, was then introduced into enterprise-class systems to improve efficiency. As AI and ML have become easier to develop and more widely adopted, nearly every system or application will likely claim some degree of AI within its automation routines.
Today, AI is embedded in many of the systems we use every day, and it has been commoditized by the release of tools like ChatGPT and Bard. These tools have opened the door to a new AI-enabled era. However, if these powerful systems are biased, so are their results, and given the massive amount of data they process, that bias can easily be compounded. If an AI model determines creditworthiness for home loans and evaluates credit based on the property’s postal code, it could negatively affect a single mother looking to buy near a school and childcare. Suppose an AI model is used to evaluate treatment options for a patient. In that case, it may make ineffective recommendations for a woman because the data set contained more data on men. In a generative AI model, popular themes or publications used to train the model may be factually incorrect, yet the model will continue to repeat that text, creating an illusion-of-truth effect.
Bias in AI occurs when an algorithm produces prejudiced results because of an error, oversight, or intent in the ML process. Bias can enter through how the data is created or acquired, how the algorithms are designed and programmed, and how the results are interpreted. These systems carry both good and bad biases, intentional and unintentional. Accidental bias may surface as stereotyping or unfair outcomes, while deliberate bias may be introduced to focus the model on better results and eliminate distractions. Bias cannot be eliminated from any system, but establishing processes to test for, identify, and mitigate it is essential to the responsible use of AI.
Bias often begins with the data used to train the models. These systems require large volumes of reliable data, and information managers must know and trust its source. The data should embody big data’s “Vs”: volume, value, variety, velocity, and veracity. In other words, a sufficiently large and varied sample helps ensure the data is trustworthy and the model is trained well. The data should also be ethically sourced, favoring data that has been explicitly shared over data pulled from open sources. Additionally, the data set must be evaluated to ensure that protected classes of people or information are not underrepresented or negatively represented in the collection. Information managers need to know where the data used within the models comes from, how it is sourced, and how any data their systems output is used.
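As a starting point, a simple representation check on the training data can surface obvious gaps before a model is ever trained. The sketch below is a minimal illustration, not a full audit: it assumes a pandas DataFrame and a hypothetical protected-attribute column named "gender," and simply flags any group whose share of the data falls below a chosen threshold.

```python
# A minimal sketch, not a production audit: it assumes a pandas DataFrame
# named training_data with a hypothetical protected-attribute column
# ("gender") and flags any group that falls below a chosen share of the set.
import pandas as pd

def check_representation(df: pd.DataFrame, column: str, min_share: float = 0.2) -> pd.Series:
    """Return the share of each group in `column` and warn on underrepresentation."""
    shares = df[column].value_counts(normalize=True)
    for group, share in shares.items():
        if share < min_share:
            print(f"Warning: '{group}' makes up only {share:.1%} of the data set")
    return shares

# Hypothetical usage with a toy data set:
training_data = pd.DataFrame({"gender": ["F", "M", "M", "M", "M", "F", "M", "M", "M", "M"]})
print(check_representation(training_data, "gender"))
```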
Beyond the input data, the models themselves may be biased. Several principles should be applied to AI models: fairness, transparency, visibility, robustness, testability, and continuous monitoring. Fairness goes beyond the obvious examples, like a credit-scoring algorithm that is biased against a specific gender or race; it also means ensuring that an algorithm used to create marketing plans doesn’t negatively impact other vulnerable populations, such as people with low incomes or children. Ethical use of AI also requires that the systems be robust and testable, that anyone who uses or is affected by them knows an AI is making the decisions, and that there is a process for challenging the results.
Testing is the key to ensuring that the model isn’t biased. Models should be tested for negative feedback loops and run through what-if scenarios, and intentionally stressing them with expected failure cases is essential to end-to-end testing. One of the most powerful ways to confirm that a model isn’t drifting over time is to create a monitoring model using AI; in other words, build a machine to monitor the machine. In addition to machine monitoring, information managers need to develop processes for human auditing of the models. Continuous monitoring and frequent auditing are keys to ensuring the models work as intended.
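One simple way to machine-monitor a model is to compare the distribution of its current scores against a reference sample kept from validation. The sketch below is illustrative only: it assumes numeric model scores and uses a two-sample Kolmogorov–Smirnov test as the drift signal, and the variable names and 0.05 threshold are assumptions rather than a prescribed standard.

```python
# A minimal sketch of drift monitoring, assuming you have retained a reference
# sample of model scores from validation and can collect a recent sample from
# production. The names reference_scores, production_scores, and the 0.05
# threshold are illustrative assumptions.
import numpy as np
from scipy.stats import ks_2samp

def detect_score_drift(reference_scores, production_scores, alpha: float = 0.05) -> bool:
    """Flag drift when the two score distributions differ significantly."""
    statistic, p_value = ks_2samp(reference_scores, production_scores)
    drifted = p_value < alpha
    print(f"KS statistic={statistic:.3f}, p-value={p_value:.3f}, drift={drifted}")
    return drifted

# Hypothetical usage with synthetic scores:
rng = np.random.default_rng(42)
reference_scores = rng.normal(0.6, 0.1, size=1000)   # scores at deployment
production_scores = rng.normal(0.5, 0.1, size=1000)  # scores observed later
detect_score_drift(reference_scores, production_scores)
```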
The outputs of AI models also need to be tested. Key things to look for when examining a model’s output are drift, performance degradation over time, volume, and consistent accuracy. Beyond these checks lie harder challenges in ensuring these systems are genuinely responsible. It is essential to look for data intersections that may yield less fair or favorable results; for example, does the model behave differently if a subject has more minority characteristics? Population comparison is a technique that can help identify bias in the output: looking at the results for one group versus another can reveal whether one or more groups of people are negatively impacted by the model. Here, again, continuous testing and monitoring are critical.
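A population comparison can be as simple as tabulating favorable outcomes by group and comparing the rates. The sketch below assumes decisions sit in a pandas DataFrame with hypothetical "group" and "approved" columns; the four-fifths ratio mentioned in the usage note is a common rule of thumb for flagging disparities, not a definitive legal test.

```python
# A minimal sketch of population comparison, assuming model decisions are
# available in a pandas DataFrame with a hypothetical "group" column and a
# binary "approved" outcome. It compares approval rates across groups and
# computes each group's ratio to the best-served group.
import pandas as pd

def compare_populations(df: pd.DataFrame, group_col: str, outcome_col: str) -> pd.DataFrame:
    """Approval rate per group plus its ratio to the highest-rate group."""
    rates = df.groupby(group_col)[outcome_col].mean().rename("approval_rate")
    report = rates.to_frame()
    report["ratio_to_top_group"] = report["approval_rate"] / report["approval_rate"].max()
    return report

# Hypothetical usage with toy decisions; a ratio well below 0.8 (the
# "four-fifths" rule of thumb) would warrant a closer look.
decisions = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "B", "B", "B", "B"],
    "approved": [1,   1,   1,   0,   1,   0,   0,   0  ],
})
print(compare_populations(decisions, "group", "approved"))
```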
As information managers, we may not be directly responsible for developing the AI models used within our organizations. Still, we are responsible for ensuring that the internal teams developing these applications, or the vendors from which we purchase them, give us the means to audit the models and confirm that we are using AI responsibly. Now is the time to be responsible risk managers, asking what the worst thing that could happen in any scenario is and taking measures to mitigate that risk fully. With AI systems, continuously asking those questions and monitoring the results is mandatory.