Eliminate Bias for Ethical and Responsible Artificial Intelligence

By: Candace McCabe, CIP, CIPM, AIIM Fellow on August 15th, 2023

"With great power comes great responsibility" is an adage that Stan Lee’s Spider-Man popularized across comics, films, and other media. In Information Management today, there is great power in systems that leverage Artificial Intelligence to make decisions quickly and process massive volumes of information. As information managers, we have a responsibility to understand that these systems can be biased, and to know how to identify and mitigate the risks that stand in the way of reliable and ethical systems.

AI Bias Recognized Early On

Several years ago, I attended IBM’s flagship conference, THINK, and sat in a session led by a panel of women working in Artificial Intelligence (AI) system development. The panel discussed how some of the earliest applications of AI had produced biased outcomes. Some examples and case studies showed data sets that were biased because they contained mostly men or lighter skin tones. Others showed models that produced disparate results for specific classes of people, further harming those who were already vulnerable. The women on the panel called for a new mandate for digital ethics to ensure that AI models were fair and just. At the time, AI systems were on the rise, but it was still relatively early in the adoption of machine learning (ML) and AI into common practice.

Over the last few years, we’ve seen a dramatic rise in AI/ML models being developed, used, and sold within the Information Management industry. Five years ago, conversations in the industry centered on digitization and automation. AI, or more accurately ML, began being introduced in enterprise-class systems to improve efficiency. As AI and ML have seen more adoption and are becoming easier to develop, nearly every system or application will likely claim some degree of AI within its automation routines.

Today, AI is embedded in many of the systems we use every day, and it has been commoditized by the release of tools like ChatGPT and Bard. We have opened the door to a new AI-enabled era. However, if these robust systems are biased, so are their results. Given the massive amount of data these systems can process, bias can easily be compounded. If an AI model is used to determine creditworthiness for home loans and it evaluates credit based on the postal code of the property, it could negatively affect a single mom looking to buy in an area near a school and childcare. If an AI model is used to evaluate treatment options for a patient, it may make ineffective recommendations for a woman because the data set contained more data on men. In a generative AI model, popular themes or publications used to develop the model may be factually incorrect, yet the model will continue to repeat the text, leading to an illusion of truth effect.

AI Bias Defined

Bias in AI occurs when an algorithm produces prejudiced results because of an error, oversight, or intent in the ML process. Bias can enter in how the data is created or acquired, how the algorithms are designed or programmed, and how the results are interpreted. Bias in these systems can be good or bad, intentional or unintentional. Accidental bias may show up as stereotyping or unfair outcomes, while deliberate bias may be introduced to focus the model on better results and eliminate distractions. Bias cannot be eliminated from any system, but having processes to test for, identify, and mitigate bias is essential to the responsible use of AI.

Bias often begins with the data that is used to train the models. These systems require large volumes of reliable data. Information managers must know and trust the source of the data. The data should embody the “Vs” of big data: volume, value, variety, velocity, and veracity. In other words, a large enough sample with enough variety helps ensure that the data is trustworthy and that the model is trained well. The data should be ethically sourced, which includes preferring data that has been explicitly shared over data drawn from open sources. Additionally, the data set must be evaluated to ensure that protected classes of people or information are not negatively reflected in the collection. Information managers need to know where the data used within the models comes from, how it is sourced, and how any data their systems output is used.
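
One way to start evaluating a data set is to measure how much of it each group makes up. The Python sketch below is illustrative only: the function name representation_report, the "sex" attribute, the sample records, and the 25% threshold are all assumptions made for this example, not values from any standard. A real review would choose attributes and thresholds with legal and domain guidance.

```python
from collections import Counter


def representation_report(records, attribute, min_share=0.10):
    """Report each group's share of the data set.

    `records` is a list of dicts; `attribute` is the key to group by.
    `min_share` is a hypothetical review threshold, not a standard.
    """
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    report = {}
    for group, count in counts.items():
        share = count / total
        report[group] = {
            "count": count,
            "share": share,
            "under_represented": share < min_share,
        }
    return report


# Made-up training records: 17 men and 3 women.
records = [{"sex": "male"}] * 17 + [{"sex": "female"}] * 3

report = representation_report(records, "sex", min_share=0.25)
for group, stats in report.items():
    flag = "REVIEW" if stats["under_represented"] else "ok"
    print(f"{group}: {stats['share']:.0%} of records ({flag})")
```

A low share does not prove the resulting model is biased, and a balanced share does not prove it is fair. The report only tells reviewers where to look first.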

Beyond the input data, the models themselves may be biased. Several principles should be applied to AI models: fairness, transparency, visibility, robustness, testability, and continuous monitoring. Fairness goes beyond the obvious examples, like a credit scoring algorithm that is biased against a specific gender or race. It also includes ensuring that an algorithm used to create marketing plans doesn’t negatively impact other vulnerable populations, like people with low incomes or children. Ethical usage of AI means that the systems are robust and testable, that anyone who uses or is affected by them is aware that an AI is making the decisions, and that there is a process for challenging the results.

The Importance of Testing AI Models

Testing is the key to ensuring that the model isn’t biased. Models should be tested for negative feedback loops and what-if scenarios. Deliberately running these models against inputs that are expected to fail is an essential part of end-to-end testing. One of the most powerful ways to ensure that the models aren’t drifting over time is to create a monitoring model using AI. In other words, build a machine to monitor the machine. In addition to machine monitoring, information managers need to develop processes for human auditing of the models. Continuous monitoring and frequent auditing are keys to ensuring the models work as intended.
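
A full AI monitoring model is beyond a short example, so the sketch below swaps in a simpler statistical check: the Population Stability Index (PSI), a widely used drift measure that compares the scores a model produces now against the scores it produced when it was validated. The function names, the ten equal-width bins, the assumption that scores fall between 0 and 1, and the sample data are all choices made for illustration.

```python
import math


def bin_shares(scores, n_bins=10, floor=1e-6):
    """Share of scores in each equal-width bin over [0, 1].

    Assumes every score lies between 0 and 1. Empty bins are floored
    at a tiny value so the logarithm in the PSI stays defined.
    """
    counts = [0] * n_bins
    for score in scores:
        index = min(int(score * n_bins), n_bins - 1)
        counts[index] += 1
    total = len(scores)
    return [max(count / total, floor) for count in counts]


def population_stability_index(baseline, current, n_bins=10):
    """PSI = sum over bins of (actual - expected) * ln(actual / expected)."""
    expected = bin_shares(baseline, n_bins)
    actual = bin_shares(current, n_bins)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


# Made-up scores: the baseline is spread evenly, the current batch is not.
baseline_scores = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
current_scores = [0.85] * 5 + [0.95] * 5

psi_same = population_stability_index(baseline_scores, baseline_scores)
psi_shifted = population_stability_index(baseline_scores, current_scores)
print(f"PSI, no change: {psi_same:.3f}")
print(f"PSI, shifted:   {psi_shifted:.3f}")
```

A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift, but any alert threshold should be set and reviewed by the people who audit the model. A check like this supports human auditing; it does not replace it.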

The outputs of AI models also need to be tested. Key things to look for when testing algorithm output are drift, performance degradation over time, volume, and consistent accuracy. Beyond these basics, ensuring that a system is genuinely responsible brings further challenges. It is essential to look for data intersections that may yield less fair or less favorable results. For example, does the model behave differently if a subject has more minority characteristics? Population comparison is a technique that can help identify bias in the output. Comparing the results for one group against another can help determine whether one or more groups of people are negatively impacted by the model. Here, again, continuous testing and monitoring are critical.
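
Population comparison can be sketched in a few lines of Python. The example below computes the rate of favorable outcomes per group and then each group's rate relative to the best-served group. The function names, the "approved" and "group" fields, and the sample decisions are hypothetical and exist only to show the idea.

```python
from collections import defaultdict


def favorable_rates(outcomes, group_of):
    """Rate of favorable outcomes per group.

    `group_of` maps a row to its group label, so the caller decides
    how the population is split.
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for row in outcomes:
        group = group_of(row)
        totals[group] += 1
        if row["approved"]:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def impact_ratios(rates):
    """Each group's rate divided by the highest group rate."""
    best = max(rates.values())
    if best == 0:
        # No group received a favorable outcome, so none is favored.
        return {group: 1.0 for group in rates}
    return {group: rate / best for group, rate in rates.items()}


# Made-up decisions: group A is approved 8 of 10 times, group B 4 of 10.
outcomes = (
    [{"group": "A", "approved": True}] * 8
    + [{"group": "A", "approved": False}] * 2
    + [{"group": "B", "approved": True}] * 4
    + [{"group": "B", "approved": False}] * 6
)

rates = favorable_rates(outcomes, group_of=lambda row: row["group"])
ratios = impact_ratios(rates)
for group in sorted(ratios):
    print(f"{group}: rate {rates[group]:.0%}, ratio {ratios[group]:.2f}")
```

To examine intersections, group_of can return a tuple of several attributes instead of a single label. Some reviewers use the "four-fifths" guideline from US employment selection rules, which flags ratios below 0.8, as a starting point. Whether that threshold suits a given system is a judgment call for the organization and its advisors.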

The Responsibility of Information Managers

As information managers, we may not be directly responsible for developing the AI models used within our organizations. Still, we are responsible for ensuring that the internal teams developing these applications, or the vendors from which we purchase them, give us the means to audit the models and confirm that we are using AI responsibly. Now is the time to be responsible risk managers, asking what the worst thing that can happen in any scenario is and taking measures to fully mitigate that risk. In AI systems, continuously asking those questions and monitoring the results is mandatory.

About Candace McCabe, CIP, CIPM, AIIM Fellow

Candace began her love affair with all things data and information in one of her earliest roles in her career. As a project manager responsible to her customer in a software development company, she quickly learned to understand her customer’s data needs and communicate those clearly to her software development partners. After serving as the Sr. Information Architect at JB Hunt Transport and leading their enterprise modeling effort, she moved into Enterprise Architecture with a minor role in Information Lifecycle Management. She championed the creation of an Information Governance Program that included Information Lifecycle Management, Records Management, eDiscovery, Legal Hold Management, and the supporting technologies for those efforts, and she became the creator and owner of the program. As an extension of her IG work, Candace expanded her efforts to promote, create, and manage JB Hunt’s first Information Privacy program. In that effort, she also included her knowledge and network in IM/IG. Candace joined Walmart in late 2019 and began working in the Governance, Risk, and Compliance area, supporting Disaster Recovery as a Sr. Solution Architect. She continued applying her IM/IG knowledge, which has expanded into an information risk management practice, and applying IM/IG to her current role in Enterprise Architecture. She is currently working to apply IG principles and serving as the primary information architect in an effort to develop accurate and actionable views of the dependencies between technical assets. Candace has always been a proponent for improving the lives and careers of women in information and technology careers. She served as a member of the WIIM leadership council, mentored multiple women in AIIM, started the Women In IT resource group at JBH, and founded a book club and a wellness mentoring circle through Walmart’s Women in InfoSec resource group.
She continues to mentor and lead in areas related to women in information and technology careers. As an AIIM Fellow, Candace is working to champion diversity, specifically for women, within AIIM. Candace was also one of the authors of the best-selling risk management book, Flip This Risk For Enterprise Security, which was featured at AIIM 2022 during the keynote session featuring Dr. Karen Hardy, the coordinating author.