How Can We Trust AI to Make Life-and-Death Decisions When It Keeps Doing Stupid Things?

    Written by Matt Jones for Forbes



    AI promises to run critical aspects of our lives, from diagnosing disease to deciding when a plane needs repairing. Yet we constantly hear stories about AI making mistakes that would be absurd for humans to make, such as the Amazon AI program that designs phone covers with pictures of heroin needles and men in nappies.

    How can we reconcile these two visions of AI - one a trusted advisor on the world’s most important decisions, the other a form of AI that seems totally confused by the world around it?

    The point here is that AI is a set of tools that can be used well or badly. The difference between successful AI programs and ridiculous ones is that successful ones are built for the task at hand and trained on well-defined, well-curated data sets. If our AI is designed to spot signs of cancer in medical scans, we build it to analyze certain types of images, feed it many scans and tell it which ones have signs of cancer so it learns which are which. These data sets are curated by experts to ensure they fit certain parameters and are properly tagged to show what part of the image indicates signs of disease.
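    As a minimal sketch of the supervised-learning idea described above (the data here is synthetic and the classifier deliberately simple, a nearest-centroid rule, not anything resembling a real diagnostic model):

```python
# Sketch of training on a curated, labeled data set and classifying new input.
# The "scans" are synthetic 2-feature vectors; real systems use expert-tagged images.

def centroid(points):
    """Mean of a list of feature vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def train(labeled_data):
    """labeled_data: list of (features, label). Returns one centroid per label."""
    by_label = {}
    for features, label in labeled_data:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(pts) for label, pts in by_label.items()}

def predict(model, features):
    """Assign the label of the nearest centroid (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(features, c))
    return min(model, key=lambda label: dist(model[label]))

# Expert-curated training set: (texture score, density score) -> diagnosis.
training = [
    ((0.9, 0.8), "signs of cancer"),
    ((0.8, 0.9), "signs of cancer"),
    ((0.1, 0.2), "healthy"),
    ((0.2, 0.1), "healthy"),
]
model = train(training)
print(predict(model, (0.85, 0.85)))  # a new scan resembling the cancer examples
```

    The curation step is the whole point: every training example carries a label an expert has verified, so the model learns the distinction the experts intended rather than whatever patterns happen to dominate an uncurated feed.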

    If, on the other hand, we cobble together an AI to design phone covers based on keywords or search terms and set it loose on some of the internet’s vast uncurated data sets without adequate training in what it should be looking for, it will quickly pick up some unexpected results. Something similar happened to Microsoft when it built a chatbot designed to learn from conversations; unfortunately, it quickly picked up some nasty prejudices.

    The Right Tools And Data

    AI comprises a set of tools, not a single solution. It encompasses machine learning, neural networks and image and language processing. To design an AI that works well, you need to understand the problem and select the right combination of tools.

    Then you need to select your training data. The more you understand and curate the data, the better you can train your AI using it. We don’t know the full story of the phone cover designer, but it's likely it was set loose on someone else’s data, with some broad guidance of what type of images to look for, and the first things it found reinforced its sense of what was right.

    Most AIs that work well use well-defined data sets. The data from a jet engine or from medical scans is vast and complex but fairly well understood by experts in the respective fields, and those experts can set clear parameters because they know what they are looking for. That’s not to say we can’t do useful things when we aren’t controlling the data, but this needs very careful training and ongoing oversight to address the potential for bias to be introduced, and the insights are likely to offer broad guidance rather than proven facts. That said, setting an AI loose on data and seeing what happens can be very informative and can help you refine it to do what you want, much like chemical R&D: researchers start with a broad range of possibilities and gradually home in on the one that works. This should be done at the test phase by people who know how to interpret the results, not at the point the product is launched to the public.

    The most complex AI application in terms of breadth and messiness of data is self-driving cars. Here we see the difference from the scenario with the AI-designed phone cases. Nobody launches a self-driving car and sees what happens. The algorithms are designed by experts, tested and modified. Where AI makes mistakes, the developers change the parameters to exclude that group of mistakes. Tools such as anomaly detection are used to clean up the data before it is used for training, and vast teams of curators are recruited to tell the AI where it is correctly or incorrectly interpreting the world around it.
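    The data-cleaning step mentioned above can be illustrated with a deliberately simple rule (the sensor readings are made up, and real anomaly-detection pipelines use far richer methods than a single deviation threshold):

```python
# Sketch of cleaning training data before use: drop readings that sit far
# from the mean, so a faulty sensor spike cannot distort what the AI learns.
from statistics import mean, stdev

def remove_anomalies(readings, threshold=2.0):
    """Keep only readings within `threshold` standard deviations of the mean."""
    mu, sigma = mean(readings), stdev(readings)
    return [r for r in readings if abs(r - mu) <= threshold * sigma]

# Simulated sensor feed with one faulty spike that would mislead training.
sensor = [101.2, 99.8, 100.5, 100.1, 999.0, 100.3, 99.9]
clean = remove_anomalies(sensor)
print(clean)  # the 999.0 spike is filtered out before training
```

    Even a rule this crude shows why the step matters: train on the raw feed and the model treats the faulty spike as a real engine behavior; clean it first and the model learns from data that actually reflects the world.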

    A Bad Workman Blames His Tools

    Many people who look at AI embarrassments conclude that AI isn’t ready for serious problems. But when taken seriously, and built by experts who understand the toolsets, AI does solve real problems. People are less interested in AIs that predict whether pipes are corroding or when to service jet engines than in amusing phone covers, but that doesn’t mean AI is not delivering huge commercial benefit.

    A lot of the reason for the divide is the attitude toward risk. Phone cover manufacturers can afford to take a punt on an AI to help them sell more products. Companies maintaining vast engineering infrastructure or diagnosing disease cannot and do not -- they work with experts to design the right AI and rigorously train it and test it before using it on new applications.

    There are a lot of people who have learned how to use some AI tools and found some interesting applications but have no experience in the industry whose problems they are trying to solve. Many new AI products work well in the lab but make mistakes when released on real-world data their developers did not properly understand. This creates risk amid the rush to develop the next AI game-changer.

    Some of these approaches will end in failure, and as long as that failure doesn’t cause serious harm, that is a healthy part of the learning process. But we should also acknowledge where AI is doing well and what factors make it successful: the right tools, for the right problem, trained on the right data.

    AI will change the world. Those using it for serious applications will take it seriously. Pointing to the phone cover AIs and saying, "How can we trust AI to save lives?" is like pointing to a collapsed house and saying, "How can we trust people to build skyscrapers?" Some people do a good job, some don’t, and generally, the higher the risk/reward, the more quality controls go in. 

    Written for Forbes by Matt Jones, Lead Analytics Strategist at Tessella.
