Dell UK CTO: Improving AI model accuracy by mitigating drift
An in-depth interview with Elliott Young on best practices to follow
AI models can offer superior analytical, predictive and prescriptive capabilities at scale compared to legacy systems, but they can also be more finicky to use because they need to be constantly monitored for drift.
AI Business recently spoke with Elliott Young, Dell U.K.’s chief technology officer, for our podcast to talk about the importance of mitigating AI model drift, what’s an acceptable drift threshold and best practices to follow. Young also revealed what keeps him up at night.
What follows is a shortened, edited version of that conversation.
AI Business: When implementing AI, there’s something called AI model drift. What is it, why is it important and why do organizations need to plan for it?
Elliott Young: AI drift is one of the things that gives people a bit of a shock when they discover that such a thing is possible. From a technical point of view, there are two types of AI drift. The first one is when the thing that you're trying to predict, the target, starts to drift or change in the real world.
Let's say you had a number of chicken restaurants across the country, and you wanted to predict how much fresh chicken you would ship to each one. You've got a model that may have been working perfectly well for the last couple of months. But then somebody opens a brand new chicken restaurant in (the area). If the model is not aware that has happened, then its predictions are not going to take that into consideration, and you've got an example of target drift.
The second type of drift happens with the inputs that the AI is looking at, when the conditions behind them change.
(For example, before the pandemic) there were AIs that predicted how likely it is that a bag in an airport is going to be lost. It's trundling around on a conveyor belt. And as it goes through various sensors, we're working out where it is, and trying to make a decision on where's the best place to put this bag so it doesn't get lost. Do you put it at the top of the container or the bottom of the container? Do you put it on a plane early or late, etc.?
Now, in the pandemic, all the signals that were coming in − where the aircraft are, where the containers are, how many bags are coming through − dropped to zero overnight. … The data that you trained your machine learning-type model on … those signals have suddenly dropped away.
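In code, this second kind of drift is usually caught by comparing the distribution of live inputs against the data the model was trained on. Below is a minimal, hypothetical sketch using a two-sample Kolmogorov-Smirnov test; the baggage-throughput numbers and the threshold are invented for illustration and are not Dell's actual tooling.

```python
import numpy as np
from scipy.stats import ks_2samp

def feature_has_drifted(train_values, live_values, p_threshold=0.01):
    """Flag a feature as drifted when its live distribution differs
    significantly from the distribution the model was trained on."""
    statistic, p_value = ks_2samp(train_values, live_values)
    return p_value < p_threshold, statistic

# Illustration: an airport-throughput signal collapses overnight.
rng = np.random.default_rng(0)
train_bags_per_hour = rng.poisson(lam=120, size=5000)  # pre-pandemic history
live_bags_per_hour = rng.poisson(lam=5, size=500)      # traffic near zero

drifted, stat = feature_has_drifted(train_bags_per_hour, live_bags_per_hour)
print(f"drifted={drifted}, KS statistic={stat:.2f}")
```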
The way I often describe this (AI model phenomenon) to business leaders is (to compare it to legacy systems). Let's say you're in sales, and you ask your IT department to build you a sales order entry system. When you build a traditional order entry system with a relational database, and an application server and a web server sitting in front of it, that system on Day One will be good at saving a record. … Even five years later, it may have slowed down but it's still pretty good at doing that thing that you built it for.
Figure 1: Dell U.K. CTO Elliott Young
That is absolutely not the case with a machine learning-based AI. Say you build a system that works in a contact center, listens to incoming calls from people applying for a loan and then predicts how likely it is that each person will pay the loan back. That kind of system might have fantastic performance on Day One, but by the time it's got to Day 30, some kind of drift may have happened. Twelve months later, is it really likely that the AI is just going to be as good as it was on Day One at making predictions?
For many businesses, that's a bit of a shock because historically what we would do is pay for an IT system to be built. The budget would all be used up, the thing would go live and that would be it. There'd be a small support budget afterwards, but there wasn't this concept that you constantly have to monitor the system in a particular way.
(With an AI model,) you might have to feed and water it, you might even have to retrain the model, you might have to take the resources that are already committed to building the next thing and bring them back to work on a system that you thought was fine; it was in production six months ago. So why should they have to come back to it? For many business leaders, this new way of creating IT is a bit of a shock to them and they need to get their head around the requirements of what they're being asked to do.
AI Business: Why does drift happen with AI but not with legacy systems? What is it about AI that introduces the concept of drift?
Young: Let's take the example of supervised machine learning. What I mean by that is, we take a history of events, and we give those to the AI and say, ‘teach yourself about what happened here, what went right, what went wrong, and come up with either some kind of numerical score that predicts something in the future, or maybe a category that says, what's this customer going to buy in the future.’ There are various types of models. The key point is that when you start off building this kind of AI, you are taking a snapshot of data at a point in time.
Let's say I want to predict all of the customers in the restaurants who are going to buy a particular type of pizza. I would take all of the transactions for the last two years for all the people that went into pizza restaurants and I train the AI based on the different categories of pizza that they bought. If something changes in that environment − we introduce a new pizza or there's another lockdown − that training data has to change. You have to find some mechanism to feed that back into the AI.
In a large organization, there's this huge process that goes on to collect the data so that AI can be trained. But if that data collection process goes wrong − the Extract, Transform, Load (ETL) process − and you don't know about it, then actually what you're doing is you're teaching the AI the wrong thing.
"For many business leaders, this new way of creating IT is a bit of a shock to them."
- Elliott Young, Dell U.K. CTO
Now, it may look like the ETL process is working every day; it's getting 10,000 records into the database, and the next day, it gets another 10,000 records, and so on. But maybe in the last week, two of the columns actually sent through blanks. It would look like your ETL process is working totally fine, but now all of a sudden, the AI has completely drifted in the predictions it's making because you've actually got a hole in your data.
In the past, it would be really obvious if there was a problem with the ETL because you wouldn't get any sales orders coming through, or any purchase orders or invoices being sent out, and the customer starts screaming, ‘Hey, where's my invoice? Where's the expected deliverable?’
But with an AI, how do you know that two columns dropped out and the AI is suddenly making different predictions? Say it's making predictions in a contact center for people applying for loans, and all of a sudden it starts reducing the number of loans it gives out by 25%. If that's the first time you realize there's a problem, that's already had a huge impact on your business.
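One lightweight way to catch the silent failure described above is a data-quality gate in the ETL pipeline that compares each batch's null rates against the rates seen in the training data. Below is a minimal sketch; the column names and the tolerance are invented for illustration.

```python
import pandas as pd

def check_null_rates(batch: pd.DataFrame, baseline_null_rates: dict,
                     tolerance: float = 0.05) -> list:
    """Return columns whose share of blanks jumped well above the
    rate observed in the training data."""
    suspect = []
    for col, baseline in baseline_null_rates.items():
        live_rate = batch[col].isna().mean()
        if live_rate > baseline + tolerance:
            suspect.append((col, baseline, live_rate))
    return suspect

# Example: two columns quietly started arriving empty.
baseline = {"loan_amount": 0.01, "employment_years": 0.02}
batch = pd.DataFrame({"loan_amount": [None] * 90 + [25_000] * 10,
                      "employment_years": [None] * 95 + [4] * 5})

for col, base, live in check_null_rates(batch, baseline):
    print(f"ALERT: {col} null rate {live:.0%} vs baseline {base:.0%}")
```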
So these are some of the considerations to bear in mind. You have fantastic possibilities of what it can do, but also be aware of conditions that can happen that could impact your business.
AI Business: There are best practices for data management and data cleaning. Is that all it takes?
Young: No, I don't think so. We have to be careful, when we're asking whoever it is to build this model for us, to also put in the requirement that there has to be a way to monitor it. From a business point of view, it would be very undesirable to have a kind of black-box solution where the AI made up its own rules about what it's going to do and the humans don't understand or see why it's taken that decision. Yesterday, it was giving out loans to 80% of the applicants, but today it's 20%.
… You have to find some way of building into the solution a way of inspecting itself. A typical way that can be done is by introducing this thing called a challenger model. You would train up a model to make predictions, but you also take the second-best model you can find and it does the scoring in the background.
… If you monitor those two over time, you start to see if they are coming up with different answers. And you can put in place a threshold that says, ‘send me an alert if the threshold in this particular area is breached.’ This gets into the concept of understanding what the predictions are now. Afterwards, (take the actual outcomes) and if you can find a way to feed those back into the AI, you've got a measure of accuracy.
So it's a combination of these two things. One, is it drifting? Do I care about the drift? And the reason I might care is if accuracy is going down. Then go to the next level of detail to understand why the drift is happening. Is it happening on a feature that is actually important to me and is it affecting my accuracy?
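A bare-bones version of the champion/challenger monitoring described above might look like the following sketch. The scores and the 10% alert threshold are assumptions for illustration, not Dell's configuration.

```python
import numpy as np

def challenger_divergence(champion_scores: np.ndarray,
                          challenger_scores: np.ndarray) -> float:
    """Mean absolute gap between the production model's predictions and
    the shadow challenger's predictions on the same requests."""
    return float(np.mean(np.abs(champion_scores - challenger_scores)))

ALERT_THRESHOLD = 0.10  # assumed; tune to the business's risk appetite

champion = np.array([0.82, 0.40, 0.65, 0.91])    # loan-approval scores
challenger = np.array([0.55, 0.38, 0.30, 0.60])  # shadow model's scores

gap = challenger_divergence(champion, challenger)
if gap > ALERT_THRESHOLD:
    print(f"ALERT: models diverge by {gap:.2f}; investigate possible drift")
```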
AI Business: What’s an acceptable threshold for AI model drift?
Young: (It depends.) Let's say we're in a time-critical situation where you've trained up your model, and now you've put your model into production, and you're sending it data, and you want a quick response back.
The use case for this might be undesirable email − somebody is sending undesirable email to your employees. One of the things you can do is … send the text of the email to an AI and have the AI predict how likely it is that this email contains spam, a virus or CryptoLocker (ransomware). Now, in that particular case, it will have a very low acceptable rate of drift. … This is a time-critical process because you have to decide whether you're going to release that email and you don't want to slow down people's email.
If we are looking at a propensity model of, say, who's going to buy a particular kind of laptop next month, then typically that's a one-off batch process where you just get a bunch of answers out for this particular target list of accounts. Then you can take a decision on that and over time, it's a slowly changing situation.
But somewhere in the middle, there's going to be this kind of threshold. And the way you work that out is you say, in my data set, I've got these features, and they are broadly similar to the columns in a spreadsheet. Let's say there are 10 columns in the spreadsheet, and there are 1,000 rows. And this is going to be your training data.
"You have fantastic possibilities of what it can do, but also be aware of conditions that can happen that could impact your business."
- Elliott Young, Dell U.K. CTO
When you do your data prep, those 10 columns probably stretch out to about 100 columns. And then the AI has got something it can work with. Maybe only seven or eight of those columns will be really critical features. And if you've got a good enough MLOps tool, you will be able to see that feature number 26 drifted but I don't really care because it didn't have a big impact. But feature 12 actually has a massive correlation with outcomes and therefore I absolutely care if that particular feature starts to drift. And that's where you need the insight on what your AI is actually doing.
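In practice, an MLOps tool combines per-feature drift scores with each feature's importance, so that only drift on influential features raises an alarm. Here is a hypothetical sketch of that prioritization; the feature names, importances and drift scores are invented for illustration.

```python
def drift_alerts(drift_scores: dict, importances: dict,
                 drift_cutoff: float = 0.2,
                 importance_cutoff: float = 0.1) -> list:
    """Flag only features that drifted AND matter to the model."""
    return [f for f in drift_scores
            if drift_scores[f] > drift_cutoff
            and importances.get(f, 0.0) > importance_cutoff]

drift_scores = {"feature_12": 0.45, "feature_26": 0.60, "feature_03": 0.05}
importances = {"feature_12": 0.35, "feature_26": 0.01, "feature_03": 0.20}

# feature_26 drifted badly but carries almost no weight; feature_12 is both
# drifted and important, so it is the one that triggers an alert.
print(drift_alerts(drift_scores, importances))  # ['feature_12']
```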
AI Business: How long does it take before you can see the AI model drifting? Do you check, say, every 30 days?
Young: We check every few minutes at Dell. But that's only because we use fully automated systems. I don't have a massive team and so we have to automate absolutely as much as possible. … If I had, like, 50 data scientists, that would be awesome. I wish I had that.
From the business users’ point of view, we have a dashboard where you can see all of the deployments. It’s updating every couple of minutes; it shows how many requests came into that particular API, what responses it gave out and whether it was drifting. And in the background, we also have a process which is doing retraining.
There are two types of retraining. … I know that my training data is changing every day. So what I'll do is give (my AI model) access to the new data that's coming in every day. You can have an automation using MLOps in the background that just retrains the model that you have and tries to keep it tuned in so it can react to the drift.
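This first kind of retraining is often just a scheduled job that refits the existing model on a rolling window of recent data. Below is a minimal sketch using scikit-learn; the logistic-regression model, the `event_date` column and the 90-day window are assumptions for illustration, not Dell's pipeline.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def retrain_on_recent(history: pd.DataFrame, target: str,
                      window_days: int = 90) -> LogisticRegression:
    """Refit the model on a rolling window of recent data so it keeps
    tracking gradual drift in the inputs."""
    cutoff = history["event_date"].max() - pd.Timedelta(days=window_days)
    recent = history[history["event_date"] >= cutoff]
    X = recent.drop(columns=[target, "event_date"])
    y = recent[target]
    return LogisticRegression(max_iter=1000).fit(X, y)

# A scheduler (cron, Airflow and so on) would call this daily and only
# promote the refreshed model if its validation metrics still hold up.
```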
The second thing you can do, … (is to have a leaderboard) for AI models. So for each ML model that we build, we probably evaluate about 100 different combinations of all the different models, and we let them compete. We release different parts of the training data to them and see who can process that training data the best and whichever is the best at that level then goes through to the next level until you get to the stage where you find the best AI model.
We actually keep that process running in the background automatically on a daily basis. And if a new model bubbles to the top … we get an alert on the dashboard (that nudges us to) consider adopting this new AI because it is suddenly doing a better job than the one you've got.
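The leaderboard idea is essentially automated model selection: candidate algorithms are cross-validated on the same data, ranked, and an alert fires when a challenger overtakes the incumbent. Here is a hypothetical sketch with scikit-learn; the candidate list, synthetic data and incumbent name are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the training snapshot.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Score every candidate on the same folds and rank them.
leaderboard = sorted(
    ((cross_val_score(model, X, y, cv=5).mean(), name)
     for name, model in candidates.items()),
    reverse=True,
)
for score, name in leaderboard:
    print(f"{name}: {score:.3f}")

incumbent = "random_forest"  # assumed current production model
if leaderboard[0][1] != incumbent:
    print(f"ALERT: {leaderboard[0][1]} now outperforms {incumbent}")
```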
AI Business: Do you have to retrain the AI model completely, or just a portion of it?
Young: No, there are some AIs that we never retrain; we use them once and that's it. Let's say we were building a propensity model to enable marketing to target a new campaign on social media. Typically, we would take a snapshot of the data as it is, and then we'd work out what's the thing we're trying to predict. Are we trying to predict a one-off event? Or that in the future this customer will buy this kind of storage? Or that this account is going to spend this much in the next six months?
The secret to building these things is actually getting the business question right. Once you're confident you've worked out the right business question to ask, then in many cases, we will train up an AI just to answer that particular question. We'll take the results, and we'll probably stick that in the library and probably not come back to it again for another six months.
So there's a difference between the one-off benefit that you can get for certain activities versus an ongoing, constant stream of predictions.
AI Business: What keeps you up at night as the CTO of Dell U.K.?
Young: What keeps me up at night is all of the possibilities of things that we could be doing, which we'd like to get to.
One of the most frustrating things about working in IT companies is that you see amazing technical capabilities that we haven't yet spoken to the market about. And you just think, if that customer over there actually used this, they could be doing fantastic things. And so the challenge is about making customers aware that (capability) exists and getting it to them as quickly as possible.
That's more difficult than you might expect because people have a way of working that they've been doing for maybe 10 or 20 years in their career. And you come along and say, you need to try out this new approach, or consider these new things you've never thought of before.
… What keeps me awake at night is that there's so much possibility out there that's not being used.