Data scientists face a struggle: how can you keep the inner workings of AI transparent while keeping the information digestible for the average user? And what does transparency even mean? What additional information do you need to give users: how the model was trained, what data was selected, or something else? Finding that balance, and understanding when a trade-off is necessary, is key to cementing trust in AI.
There is increasing scepticism about algorithms making judgments, and there have been calls for greater transparency into these systems. Highly publicized incidents of AI making discriminatory judgments based on pre-conceived biases have led people to want to understand the data that teaches AI systems and where that data comes from.
What is AI transparency?
AI transparency allows people to understand what organizations are doing with their data and how AI models use that data to inform decision making.
But a model is only as good as the data provided. If a facial recognition model misidentifies faces, where does the fault lie?
Could the model have been configured incorrectly? What if the data that trained the model was biased or limited? People will be depending on this model, but how can it be trusted when there are so many avenues for failure? This is why transparency is so important, yet it’s also difficult to achieve on a level that can be readily understood by end users.
As a consumer of an AI system, you don't have any control over how it's designed or over the data used to train it. And now that the market has shifted more towards consumers, visibility into data and design will be required for users to trust these systems.
Transparency has a two-fold meaning: data transparency and design transparency. Data transparency is visibility into what data is being used and how it's being processed. For example, does it include sensitive attributes or personal information? Design transparency covers how the model was built, how complex the system is, and how it is being used to make decisions. Both areas need to be considered.
Machine learning models depend heavily on the data given to them. Understanding that data, and making it as transparent as possible, is therefore a cornerstone of trustworthy AI. Data transparency refers to visibility into how training data was selected, and also into how well that data was labelled once selected.
Being able to verify where the data originated, how it was cleaned, and what features and dimensions the model was trained upon helps to increase visibility into the data selection process. By knowing where the data came from, the consumer has a considerably better understanding of the model.
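One way to make those details concrete is to publish a small provenance record alongside the model. The Python sketch below is only an illustration under stated assumptions: the DatasetRecord class, the SENSITIVE_HINTS list, and the example dataset are all hypothetical, and a real record would follow whatever documentation format your organization adopts.

```python
from dataclasses import dataclass, field, asdict
import json

# Hypothetical list of column names treated as sensitive; adjust per use case.
SENSITIVE_HINTS = {"gender", "age", "ethnicity", "postcode"}


@dataclass
class DatasetRecord:
    """A minimal provenance record for one training dataset (illustrative only)."""
    name: str
    source: str                                          # where the data originated
    cleaning_steps: list = field(default_factory=list)   # how it was cleaned
    features: list = field(default_factory=list)         # columns the model was trained on

    def sensitive_features(self):
        """Return the features whose names match the sensitive hints."""
        return sorted(f for f in self.features if f.lower() in SENSITIVE_HINTS)

    def to_json(self):
        """Serialize the record, including flagged features, for publication."""
        payload = asdict(self)
        payload["sensitive_features"] = self.sensitive_features()
        return json.dumps(payload, indent=2)


record = DatasetRecord(
    name="loan_applications_v1",       # hypothetical dataset
    source="internal CRM export",      # hypothetical source
    cleaning_steps=[
        "dropped rows with missing income",
        "deduplicated by applicant id",
    ],
    features=["income", "age", "postcode", "loan_amount"],
)
print(record.to_json())
```

Matching on column names is a deliberately crude check. It catches obvious cases such as an "age" column, but it will miss proxies for sensitive attributes, so it should be treated as a prompt for human review rather than a guarantee.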
Achieving full transparency
Even if organizations could provide full access to their training data - which is often not possible for commercial or privacy reasons - that alone would not achieve full transparency. Engineers may choose to use only a subset of that data, or they may have enhanced the data with additional information from outside the training set. So it's important to show how the training data was used as part of the model's development.
Understanding any biases embedded in the training sets is key to a transparent system too. Humans making decisions based on their own judgments and beliefs is a form of informational bias that affects the data being input into a model. Societal bias is prevalent in training sets, and it needs to be identified and reduced before the output can truly be trusted.
Of course, understanding bias is only one element of achieving full transparency. Others include how risks are mitigated and how uncertainty is handled within your design. AI inevitably brings uncertainty with it: human behavior is complex, and AI is built on assumptions.
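A first step towards understanding bias in a training set is simply to measure it. The Python sketch below compares how often each group receives a positive label and reports the largest gap between groups. The rows, the group names, and the "approved" label are made up for illustration, and this single measure is only one of many possible fairness checks.

```python
from collections import defaultdict


def positive_rate_by_group(rows, group_key, label_key):
    """Return {group: share of rows with a positive label} for a list of dicts."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += 1
        positives[row[group_key]] += 1 if row[label_key] else 0
    return {g: positives[g] / totals[g] for g in totals}


def parity_gap(rates):
    """Largest difference in positive rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Made-up training labels: 1 = approved, 0 = rejected.
training_rows = [
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 0},
    {"group": "B", "approved": 1},
    {"group": "B", "approved": 0},
    {"group": "B", "approved": 0},
    {"group": "B", "approved": 0},
]

rates = positive_rate_by_group(training_rows, "group", "approved")
print(rates)              # positive rate per group
print(parity_gap(rates))  # gap between the best- and worst-treated group
```

A large gap does not prove the data is unfair, since the groups may differ for legitimate reasons, but it tells you where to look and gives you a number you can report to users.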
Design transparency – How it was built
Another aspect of transparency that needs to be considered is design. To trust AI, users want to be able to see how it came to its conclusions. But as models become more sophisticated and more complex, it becomes increasingly difficult for even experts to know how they work, creating a black box scenario.
The solution to making design transparent
Making models less complicated is a great start towards design transparency. By combining both advanced and simpler, interpretable models, developers can provide answers with the advanced model while explaining the logic and the reasoning behind that decision through the simpler model.
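The pairing of an advanced model with a simpler one can be sketched in a few lines of Python. Everything here is an assumption made for illustration: black_box is a hypothetical stand-in for a complex model, the features "income" and "debts" are invented, and the simple model is a one-rule threshold (a decision stump) fitted to the black box's own predictions rather than to real outcomes.

```python
import math


def black_box(x):
    """Stand-in for a complex model: returns 1 (approve) or 0 (reject).
    Entirely hypothetical; a real system would call the trained model here."""
    score = 1 / (1 + math.exp(-(0.08 * x["income"] - 0.5 * x["debts"] - 2.0)))
    return 1 if score >= 0.5 else 0


def fit_stump(samples, predictions, feature):
    """Find the threshold on one feature that best reproduces the predictions
    with the rule 'predict 1 when feature >= threshold'."""
    best_threshold, best_fidelity = None, -1.0
    for candidate in sorted({s[feature] for s in samples}):
        agree = sum(
            (1 if s[feature] >= candidate else 0) == p
            for s, p in zip(samples, predictions)
        )
        fidelity = agree / len(samples)
        if fidelity > best_fidelity:
            best_threshold, best_fidelity = candidate, fidelity
    return best_threshold, best_fidelity


# Probe the black box on a grid of inputs and record what it decides.
samples = [
    {"income": income, "debts": debts}
    for income in range(10, 101, 10)
    for debts in range(0, 4)
]
predictions = [black_box(s) for s in samples]

# Fit the simple rule to the black box's decisions, not to ground truth.
threshold, fidelity = fit_stump(samples, predictions, "income")
print(f"Surrogate rule: approve when income >= {threshold}")
print(f"Agreement with the black box: {fidelity:.0%}")
```

The agreement figure matters as much as the rule itself. A simple explanation that matches the advanced model only some of the time should be presented to users with that caveat, otherwise the explanation is itself a transparency problem.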
Releasing the methodologies and algorithm designs surrounding the model gives consumers necessary context. So does designing your model to show its workings when it produces its results.
Explainable AI (XAI) is also key to developing a system that's fair, safe, and open. XAI is an actively developing research area, and a core one: it seeks to make black box machine learning something to be understood, not feared, by providing insight into what drove individual decisions made by the AI system. Combined with a considered approach to the data input into the system, it makes it possible to create a trustworthy and transparent AI.
The possible impact of XAI on the forging of trust between the public and these systems is immense. Explainable systems that conceptualize the decision-making of neural networks bring huge benefits. Researchers recently conducted tests using a self-driving car to try to understand how an adversarial attack on an AI might work, in which the model is confused by a subtle change to its input data. By using strategically placed stickers, researchers were able to trick the car into changing lanes even though nothing else had changed. Understanding how robust AI models are is central to having trust in their decisions.
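To see why a subtle change can flip a decision, consider a toy example in Python. This is not the sticker attack described above, and it does not involve a neural network: it assumes a tiny linear classifier with hypothetical weights, and nudges each input feature by a small amount in the direction that lowers the score, which is the same idea as sign-based adversarial perturbations applied to a linear model.

```python
def predict(weights, bias, x):
    """Linear classifier: returns 1 when the weighted sum is positive, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def sign(v):
    """Return 1, -1, or 0 according to the sign of v."""
    return (v > 0) - (v < 0)


def perturb_against(weights, x, epsilon):
    """Shift every feature by at most epsilon in the direction that lowers
    the score. For a linear model that direction is -sign(weight)."""
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical model parameters and input, chosen only for illustration.
weights = [0.9, -0.6, 0.4]
bias = -0.1
original = [0.5, 0.4, 0.2]

adversarial = perturb_against(weights, original, epsilon=0.15)

print(predict(weights, bias, original))     # decision on the original input
print(predict(weights, bias, adversarial))  # decision after the small change
```

No single feature moves by more than 0.15, yet the small shifts add up across features and push the score over the decision boundary. Testing how large a change a model can tolerate before its decision flips is one practical way to report on its robustness.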
The future of transparency in AI development
Humans need to be able to trust AI, especially when it's being used in high-risk decision-making situations. Allowing AI to come to its own conclusions with zero visibility isn't acceptable in many cases, and making the design and decision processes transparent is crucial.
Certain industries are moving ahead with industry-wide guidance and regulations, such as ISO standards for AI in the automotive industry. An industry-wide solution is necessary to ensure best-practice standards are implemented at every stage of the AI development process.
Until such a solution is created, ensuring data and design transparency in AI is crucial to gaining trust in the model.