It is time to hold artificial intelligence accountable. An introduction to AI auditing.
As Cathy O’Neil describes in her book Weapons of Math Destruction, algorithms nowadays influence whether we get hired for a job, which college we attend and even our political views. With AI making more and more decisions that affect our lives, we need to look critically at how those decisions are actually formed. We need to know that our algorithms are not biased and that they do what they are expected to do. It is time we start auditing our AIs.
Auditing is nothing new: we have long audited similar black-box practices, for instance in finance. The earliest mention of a financial auditor can even be traced back to 1314. In financial auditing, financial statements are checked against specified criteria. We can do a similar thing for AI.
AI Auditing framework
Because artificial intelligence cannot be left unchecked at its current level of use in our daily lives, the Institute of Internal Auditors (IIA) created an AI auditing framework with guidelines for auditors. The framework comprises six components, all set within the context of a company’s AI strategy: AI Governance, Data Architecture and Infrastructure, Data Quality, Measuring Performance of AI, the Human Factor and the Black Box Factor.
Building on this framework, I want to give you a more practical overview of what an AI audit covers and which questions your company should be able to answer if you are using AI. I will walk you through the stages of the data science lifecycle and what you should consider at each step.
Data quality and relevance
Data has a huge influence on the outcome of any AI model, so it is important to pay attention to the quality and relevance of your data. Where did the data come from, and was the source a reliable entity? What are potential inconsistencies in the data sources, such as changes in the methodology for capturing data over the years, or quality issues in legacy systems? Were other, similar data sources not selected to train the model, and why? It is also necessary to keep track of changes in the data distribution once the model is running in production, to explain deviations in the model’s results.
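Tracking changes in the production data distribution does not have to be complicated. Below is a minimal, illustrative sketch of one common drift metric, the Population Stability Index (PSI), which compares binned frequencies of a production sample against the training data. The function name, the bin count and the rough 0.2 alert threshold are assumptions for illustration, not part of any standard library.

```python
import math

def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """Compare the distribution of a production sample (`actual`) against a
    reference sample (`expected`, e.g. the training data). Values above
    roughly 0.2 are commonly treated as a signal to investigate drift."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def bin_fractions(sample):
        counts = [0] * bins
        for x in sample:
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
        # eps keeps the logarithm finite for empty bins
        return [c / len(sample) + eps for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Running this regularly on each input feature gives you a documented, explainable answer when an auditor asks how you detect deviations in production.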
Of course, the original data itself is not the only potential source of errors; the pre-processing step also plays an important role. How, for instance, were missing values imputed, and what criteria were applied for removing incomplete records? Moreover, how was the split between training and test dataset made, and what are the potential consequences of that decision? Gaining insight into your data and how you altered it before feeding it into your model helps when assessing the model’s behaviour later.
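One way to make those pre-processing decisions auditable is to return them alongside the data. The sketch below is a hypothetical illustration for a single numeric column: the function name and the audit-trail fields are my own, and a real pipeline would cover many columns and strategies.

```python
import random
import statistics

def impute_and_split(rows, test_fraction=0.2, seed=42):
    """Impute missing values with the column median and make a reproducible
    train/test split, returning the decisions taken alongside the data so an
    auditor can trace them later."""
    observed = [r for r in rows if r is not None]
    median = statistics.median(observed)
    cleaned = [median if r is None else r for r in rows]

    rng = random.Random(seed)          # fixed seed -> reproducible split
    shuffled = cleaned[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))

    audit_trail = {
        "imputation": "median",
        "imputed_value": median,
        "n_imputed": rows.count(None),
        "test_fraction": test_fraction,
        "seed": seed,
    }
    return shuffled[:cut], shuffled[cut:], audit_trail
```

The fixed seed means the exact same split can be reproduced months later, which is precisely what an audit of pre-processing decisions requires.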
Evaluating the model
In an AI auditing process, the model is the core element to be evaluated, both from an algorithm and a code perspective. If you chose a specific AI technique, be sure you know why. Were alternatives considered, and what were the reasons they were not selected? Every AI model relies on certain criteria and assumptions; keep track of them and evaluate them. From the code perspective, it is important to know which parts of the algorithm were written from scratch and which were taken from other sources. Are those sources reliable? And if pre-existing programming packages are used, are they trustworthy? Open-source code in particular can be tampered with by malicious parties.
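A lightweight way to keep track of these choices is a structured record that lives next to the model artefact. The class and its fields below are purely illustrative, not a standard; the point is that rationale, rejected alternatives, assumptions and third-party dependencies are written down in one reviewable place.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class ModelDecisionRecord:
    """Audit record for a model choice: why this technique was picked,
    what was rejected, and which assumptions the model relies on."""
    technique: str
    rationale: str
    alternatives_rejected: dict = field(default_factory=dict)  # name -> reason
    assumptions: list = field(default_factory=list)
    third_party_code: list = field(default_factory=list)       # package==version

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)

# Hypothetical example entry
record = ModelDecisionRecord(
    technique="gradient boosting",
    rationale="strong baseline on tabular data, interpretable feature importances",
    alternatives_rejected={"neural network": "too little training data"},
    assumptions=["all features are available at prediction time"],
    third_party_code=["scikit-learn==1.4.2"],
)
```

Pinning exact package versions in `third_party_code` also gives the auditor a concrete list of external code to verify for trustworthiness.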
When the model is tested after training, test criteria need to be decided upon. These are highly dependent on the use case, but it should always be explainable where your benchmarks come from.
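One way to make a benchmark explainable is to derive it from a naive baseline that anyone can recompute, and to gate releases on beating that baseline by a stated margin. The function names and the 0.05 uplift below are illustrative assumptions:

```python
from collections import Counter

def majority_class_baseline(labels):
    """Accuracy of always predicting the most common label: a simple,
    explainable benchmark whose origin an auditor can verify."""
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels)

def passes_release_gate(model_accuracy, labels, min_uplift=0.05):
    """Only release a model that beats the naive baseline by a documented margin."""
    return model_accuracy >= majority_class_baseline(labels) + min_uplift
```

With an imbalanced label set of 70% one class, the baseline is 0.7, so a model at 0.72 accuracy would fail this gate while one at 0.80 would pass.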
Deployment
Model deployment is an important part of the data science lifecycle and can have a large influence on the behaviour of a model. It is therefore imperative to check how the model was moved into production and to review the accuracy and workings of the model after implementation. This is also the moment to apply different test scenarios. How does the model behave in these scenarios? Does it accomplish the goal it set out to achieve at the start? This ties in with the explainability of AI as well, which you can read more about in our previous article.
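Applying test scenarios after deployment can be automated with a small harness that replays named cases against the live prediction function. A sketch under that assumption — the `predict` callable here is a toy stand-in for, say, a call to your production endpoint:

```python
def run_scenarios(predict, scenarios):
    """Replay named test scenarios against a model's predict function and
    collect every case where the output deviates from the expectation."""
    failures = {}
    for name, (inputs, expected) in scenarios.items():
        got = predict(inputs)
        if got != expected:
            failures[name] = {"expected": expected, "got": got}
    return failures

# Toy stand-in for the deployed model
def predict(request):
    return "approve" if request["income"] > 30_000 else "review"

scenarios = {
    "high income is approved": ({"income": 50_000}, "approve"),
    "boundary case goes to review": ({"income": 30_000}, "review"),
}
```

Keeping the scenario dictionary in version control means the audit question "how does the model behave in these scenarios?" always has a reproducible answer.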
Monitoring results and processes
Now that the AI model is deployed, it will start generating results. There should be a proper structure, process or procedure in place to manage and monitor the AI activities. Does the AI continue to behave as planned, or is it drifting? Are users interacting with the AI in the way you expected? In an AI auditing process, an event log will often be requested to check what the AI has done recently and to verify that this behaviour is normal. The use of AI should also comply with relevant laws and regulations and maintain an appropriate level of ethical and social responsibility. Consider which regulations apply to your AI activities and how you can comply with them. Since this can be tricky, it might be a good idea to involve external parties that also focus on AI auditing, such as PwC or KPMG.
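The event log an auditor would request can start out very simply: an append-only record of every prediction, plus a helper to spot a shifting decision rate. The class and field names below are illustrative; a production version would persist to durable storage rather than memory.

```python
import time

class PredictionLog:
    """Append-only log of model predictions, so that what the AI did recently
    can be replayed and checked against expected behaviour."""

    def __init__(self):
        self.events = []

    def record(self, inputs, output, model_version):
        self.events.append({
            "timestamp": time.time(),
            "inputs": inputs,
            "output": output,
            "model_version": model_version,
        })

    def decision_rate(self, label):
        """Fraction of logged predictions equal to `label`; a sudden change
        versus the historical rate is a simple drift signal."""
        if not self.events:
            return 0.0
        return sum(e["output"] == label for e in self.events) / len(self.events)
```

Logging the model version with every event also lets you attribute a behavioural change to a specific deployment when reviewing the log.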
AI auditing is still very much in its early stages, but with greater scrutiny on AI and its increasing use, an auditing procedure is inevitable. The EU, for instance, is already looking into the need for AI auditing. The decisions of AI models will shape part of our future. We wouldn’t find it acceptable for a human to make such decisions without any oversight, so why should we accept it from computers?
Learn more with UbiOps
Are you exploring AI auditing for your company, or are you simply interested in the topic? Reach out to us for more information and to share your thoughts. Would you like to learn more? Sign up for our newsletter and stay up to date with the latest news.