Moving Towards a Steady State AI Support Industry

    Martin Waller and Gerard Kerr

    A surgeon trained in a medical specialism is expected to make the right decisions in their field almost all the time. But that doesn’t mean they stop learning. They must renew and update their knowledge, learn to apply it to new situations, and adapt as the world around them changes.

    AI can be thought of similarly. AI models are designed, trained, and tested until their trainers are happy they can absorb data and reach reliable conclusions. AI models should therefore work well at the moment of deployment. Following some teething problems, many companies are becoming competent at getting AI to this stage.

    But post-deployment, concept drift occurs, and the AI’s results become less meaningful.

    This may be a gradual drift over time, or a sudden, dramatic change when the AI is confronted with data from new sources outside its training data. At the extreme end, this could have serious consequences in areas where we increasingly rely on intelligent systems: misdiagnosing a fatal illness, ignoring cracks in oil pipes, sending a spacecraft off course. A more mundane consequence is users losing confidence in models that required significant investment, with all the opportunity cost that brings.

    IT Support is not currently suited to maintaining AI, but relying on the people who built it is risky.

    Martin Waller, Head of application continuity and improvement services at Tessella

    How does AI change over time?

    AI is not a piece of software that performs the same every time, but a model which learns and interprets. If left unchecked, AIs can change in many ways. These all need monitoring, and AI-driven companies need to be able to call on the right skills to maintain them.

    AI faces several challenges that organisations need to be ready for:

    • Concept drift
    • Data availability
    • Traceability
    • Performance monitoring
    • Model hosting and serving

    Concept drift

    AIs are built to predict a certain reality, but over time that reality may change, and the model may not be able to interpret the new data correctly. A dramatic example is a pandemic, which changes the reality of everything from buying habits to supply chains. But these things also change gradually as habits change and technologies make new things possible.

    Managing concept drift requires checks on the quality of outputs. To monitor concept drift, reporting systems should be implemented that allow users to flag when the model isn’t delivering. These should be coupled with proactive audits of model performance against evolving objectives, followed by retraining, updates, and enhancements as required.
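    One automated check that can sit alongside user flags is to compare the distribution of recent model outputs with a baseline captured at deployment. The sketch below uses the Population Stability Index (PSI) for this. It is an illustration rather than a prescribed method: it assumes the model emits scores between 0 and 1, the bin count and the 0.2 alert threshold are rule-of-thumb assumptions, and a shift in outputs is only a proxy for concept drift, which ultimately has to be confirmed against real outcomes.

```python
import math
from typing import List, Sequence


def population_stability_index(
    baseline: Sequence[float],
    recent: Sequence[float],
    bins: int = 10,
    epsilon: float = 1e-6,
) -> float:
    """Compare two samples of model scores (assumed to lie in [0, 1]) using PSI."""

    def bin_fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            # Clamp the index so that a score of exactly 1.0 lands in the last bin.
            index = max(0, min(int(value * bins), bins - 1))
            counts[index] += 1
        # Replace empty bins with a small epsilon to avoid log(0) and division by zero.
        return [max(count / len(values), epsilon) for count in counts]

    base = bin_fractions(baseline)
    new = bin_fractions(recent)
    return sum((n - b) * math.log(n / b) for b, n in zip(base, new))


def drift_suspected(psi: float, threshold: float = 0.2) -> bool:
    """Flag for human review; the threshold is an assumption to tune per model."""
    return psi > threshold


# Hypothetical samples: scores at deployment versus scores from a recent period.
baseline_scores = [i / 100 for i in range(100)]
recent_scores = [0.5 + i / 200 for i in range(100)]
psi_value = population_stability_index(baseline_scores, recent_scores)
```

    A flag from a check like this would not trigger retraining on its own; it would prompt the proactive audit described above.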

    Data availability

    Organisations are riding a wave of digitalisation. As AI spreads deeper into organisations, and as the world around them changes, new or updated sources of data start to open up. These need to be fed into the model to keep the AI’s outputs relevant. Equally, new datasets may – with careful integration – improve the model’s accuracy.

    This requires setting up new data pipelines, along with data management to ensure the data is properly captured and recorded so that the model can recognise and interpret it. If this isn’t done correctly, it will affect the model’s output in unpredictable ways.
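    A simple guard at the end of a new pipeline is to validate each record against the shape the model expects before it is fed in. The following is a minimal sketch of that idea. The field names, types, and ranges are hypothetical, and a real pipeline would likely use a dedicated validation library rather than hand-rolled checks.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class FieldSpec:
    """Expected shape of one input field (all names here are hypothetical)."""
    name: str
    dtype: type
    minimum: Optional[float] = None
    maximum: Optional[float] = None


def validate_record(record: Dict[str, Any], schema: List[FieldSpec]) -> List[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems: List[str] = []
    for spec in schema:
        value = record.get(spec.name)
        if value is None:
            problems.append(f"{spec.name}: missing")
            continue
        # bool is a subclass of int in Python, so reject it for non-bool fields.
        wrong_type = not isinstance(value, spec.dtype) or (
            isinstance(value, bool) and spec.dtype is not bool
        )
        if wrong_type:
            problems.append(
                f"{spec.name}: expected {spec.dtype.__name__}, got {type(value).__name__}"
            )
            continue
        if spec.minimum is not None and value < spec.minimum:
            problems.append(f"{spec.name}: {value} is below minimum {spec.minimum}")
        if spec.maximum is not None and value > spec.maximum:
            problems.append(f"{spec.name}: {value} is above maximum {spec.maximum}")
    return problems


# Hypothetical schema for a new sensor feed.
SCHEMA = [
    FieldSpec("sensor_id", str),
    FieldSpec("temperature_c", float, minimum=-50.0, maximum=150.0),
]
```

    Records that fail validation can be quarantined and reported rather than silently passed to the model, which keeps the failure visible instead of unpredictable.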



    Traceability

    As models become more complex and take in new data sources, checks will be needed to ensure they’re reaching causal conclusions, not just spotting irrelevant patterns. A model can learn something from a new data set based on confounding variables, which seems to work but then fails spectacularly in certain contexts. For example, an AI might learn to identify which lung scans contain cancerous tissue from a label on the image, rather than from the detail of the scan itself.

    Companies need to be ready to deploy traceability and explainability tools that check which data is driving the output as new sources are fed in, so that results can be trusted.
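    One widely used explainability check is permutation importance: shuffle one input at a time and measure how much the model’s accuracy drops. The sketch below applies it to a deliberately flawed toy model that mirrors the lung scan example. The model, the two features, and the data are all invented for illustration, and production tools offer more robust versions of the same idea.

```python
import random
from typing import Callable, List, Sequence

Row = List[float]


def accuracy(model: Callable[[Row], int], rows: Sequence[Row], labels: Sequence[int]) -> float:
    correct = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return correct / len(labels)


def permutation_importance(
    model: Callable[[Row], int],
    rows: Sequence[Row],
    labels: Sequence[int],
    n_features: int,
    repeats: int = 20,
    seed: int = 0,
) -> List[float]:
    """Average drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j shuffled.
            shuffled = [row[:j] + [value] + row[j + 1:] for row, value in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / repeats)
    return importances


# Hypothetical features: [scan_detail_score, image_carries_label_marker].
def leaky_model(row: Row) -> int:
    # A flawed model that ignores the scan and keys on the label marker.
    return int(row[1] > 0.5)


rows = [[0.9, 1.0], [0.2, 0.0], [0.8, 1.0], [0.1, 0.0], [0.7, 1.0], [0.3, 0.0]]
labels = [1, 0, 1, 0, 1, 0]
scores = permutation_importance(leaky_model, rows, labels, n_features=2)
```

    Here the scan detail feature has zero importance while the label marker carries all of it, which is exactly the kind of result that should prompt a closer look before the model’s output is trusted.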

    Performance monitoring

    Even in a well-ordered environment, a model’s performance naturally degrades over time as the model interprets ever more complex data. Continuous monitoring, testing, and verification – much of which can be done with automated systems – can spot changes that need fixing. Support can then tweak underlying parameters and technologies to get the best performance or identify when a model needs retraining.
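    As a rough illustration of automated monitoring, the sketch below tracks accuracy over a sliding window of recent predictions and raises a flag when it falls below a floor. It assumes ground-truth outcomes eventually become available for comparison. The class name, window size, and thresholds are assumptions for the example, not recommended values.

```python
from collections import deque
from typing import Deque, Optional


class RollingAccuracyMonitor:
    """Tracks accuracy over the most recent predictions (hypothetical helper)."""

    def __init__(self, window: int = 100, min_accuracy: float = 0.9, min_samples: int = 20) -> None:
        # deque(maxlen=...) silently discards the oldest outcome once full.
        self._outcomes: Deque[bool] = deque(maxlen=window)
        self.min_accuracy = min_accuracy
        self.min_samples = min_samples

    def record(self, prediction: object, actual: object) -> None:
        self._outcomes.append(prediction == actual)

    def accuracy(self) -> Optional[float]:
        # Report nothing until there is enough evidence to be meaningful.
        if len(self._outcomes) < self.min_samples:
            return None
        return sum(self._outcomes) / len(self._outcomes)

    def needs_attention(self) -> bool:
        current = self.accuracy()
        return current is not None and current < self.min_accuracy
```

    A support team might call `record` as outcomes arrive and check `needs_attention` on a schedule, treating a positive result as a cue to investigate rather than an automatic trigger for retraining.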

    Model hosting and serving

    Models exist in the complex world of enterprise IT. If the underlying infrastructure changes in a way that affects data being fed in, it may have subtle, or dramatic, effects on the model. AI support needs to check that any changes which may impact models are accounted for.

    Shifting from IT support to AI support

    AI is still new and there aren’t too many long-running enterprise AI systems right now. But where AI is being successfully deployed, it tends to either be looked after by the team that developed it or thrown over to the IT team like a piece of software.

    The former makes the models expensive to maintain and leaves them reliant on individuals whose skill sets are better suited to agile data science projects than to supporting models within complex IT infrastructure. The latter dumps a tool on a team that may not understand its nuances.

    IT already has steady state support services to check things run as intended. AI needs an organisationally similar setup to ensure it delivers long term. But this must be designed around the needs of AI, which requires fundamentally different skills.

    Traditional IT support is not set up for intelligent systems. Performance monitoring tools will not spot that an AI is producing misleading results since they cannot look inside data and check it is sensible, logical, and accurate. IT support does not need (and so rarely has) the subject matter expertise to understand AI in practice, and to converse with business units to make sure everything is still delivering. The contextual nature of AI makes it particularly unsuited to distant offshore IT.

    Establishing steady state AI support

    So, how do we set up AI support?

    It starts before the AI is deployed. As part of the productionisation process, KPIs for the models should be established, and continuous monitoring tools and reporting processes set up to flag when the model is falling short of these.
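    Writing the KPIs down in a machine-readable form at productionisation time makes them straightforward to check automatically later. The sketch below shows one possible shape for this. The KPI names and bounds are purely illustrative, and each organisation would choose its own measures.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Kpi:
    """An agreed target for one metric; either bound may be left open."""
    name: str
    lower_bound: Optional[float] = None
    upper_bound: Optional[float] = None


def kpi_breaches(kpis: List[Kpi], metrics: Dict[str, float]) -> List[str]:
    """List every KPI that is missing from the report or outside its bounds."""
    breaches: List[str] = []
    for kpi in kpis:
        if kpi.name not in metrics:
            # A metric that stops being reported is itself worth flagging.
            breaches.append(f"{kpi.name}: not reported")
            continue
        value = metrics[kpi.name]
        if kpi.lower_bound is not None and value < kpi.lower_bound:
            breaches.append(f"{kpi.name}: {value} < {kpi.lower_bound}")
        if kpi.upper_bound is not None and value > kpi.upper_bound:
            breaches.append(f"{kpi.name}: {value} > {kpi.upper_bound}")
    return breaches


# Hypothetical KPIs agreed with the business before deployment.
MODEL_KPIS = [
    Kpi("precision", lower_bound=0.85),
    Kpi("p95_latency_ms", upper_bound=250.0),
    Kpi("user_flag_rate", upper_bound=0.02),
]
```

    Treating an unreported metric as a breach is a deliberate choice in this sketch: a monitoring gap is often the first sign that something upstream has changed.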

    Human backstops who understand the models are then needed to check nothing has gone wrong. Checks and audits should be scheduled at appropriate times (determined by the value and complexity of the model), which look under the hood to check performance, and retrain, adapt, or improve models to changing needs and environments.

    This requires people with the necessary skills to understand and work with AI models. AI in production requires subject matter expertise as well as data science, data engineering, software development, and infrastructure management skills. It also needs support-specific knowledge of governance and service management frameworks. This team must be able to communicate between operations, software, data scientists, and business or domain experts to understand how the AI is performing and to identify what may be causing any problems.

    Underpinning all this should be good governance frameworks, which take the AI from proof of concept into the organisation in a way that ensures its long-term success and serviceability. AIs must be designed so that they can be understood and modified by support teams who understand AI fundamentals, but aren’t necessarily the people who built them. Therefore, standard approaches for model design, reporting, explainability, and deployment must be established.

    It’s time to build an AI support industry

    Steady state support for AI hasn’t had much attention. AI’s status as the cool new kid on the block is a double-edged sword – it’s attracting lots of talent to the industry, but these same people want to be building innovative models, not maintaining them.

    Nonetheless, the AI industry must professionalise. That means doing the maintenance as well as the innovation. Once AI is deployed, we continue to rely on it, and there are genuine risks if it goes wrong. We cannot just wash our hands of it and move on to the next project. Companies that leave gaps in AI support will find those gaps come back to bite them in a few years.

    AI support will likely grow as both an internal and external function, much as IT support did. At some point, it’ll have professional standards and may even become a commodity. In the meantime, the industry is critical but new and untested. Companies deploying AI will need to make rigorous assessments of skills and knowledge to ensure their AIs have the support they need to deliver for years to come.
