Wells Fargo EVP on the Transformative Power of AI in Banking

An interview with Chintan Mehta, executive vice president and group CIO of digital innovation and strategy at Wells Fargo.

Deborah Yao, Editor

May 9, 2023

9 Min Read

Financial services is known for being a particularly quant-driven industry, which makes it fertile ground for AI deployment.

Chintan Mehta, executive vice president and group CIO of digital innovation and strategy at Wells Fargo, joins the AI Business podcast to discuss how the banking giant thinks about AI and what it does to safeguard against bias and enhance explainability in its data and models.

Listen to the podcast below or read the edited transcript of the conversation.

AI Business: Finance has always been a quant playground, as you know. And so I imagine that AI has found a ready home at Wells Fargo.

Chintan Mehta: I think that is a fair statement. Finance, generally speaking, is more math-oriented, and AI tends to have a natural skew towards doing those kinds of things.

AI Business: Can you share Wells Fargo's thinking on AI at a high level? And how does this technology align with your overall business goals long-term?

Mehta: The way we see AI is not necessarily as a bolt-on, which in some cases it had been for years. You do your banking, you do digital experiences, and then AI comes on top of it. We are working with a clear hypothesis that AI is going to be a substrate, or a part of pretty much everything that we do, whether in new experiences being offered to customers, in operations, hyper-automation, or in terms of unlocking new insights. … AI is going to become a core component of pretty much all of these dimensions.

AI Business: How long ago did Wells Fargo embark on its AI journey? And where are you now?

Mehta: Wells Fargo has been on this AI journey for probably 10 to 15 years. … But where we are today is we are very intentional about not separating out AI as a different, distinct thing on its own. We tend to think of AI as embedded into the bigger picture … so AI becomes a core component of it.

That also lends itself to how we structure ourselves – not that AI happens on the side over there by some central team and it has to be wired into something that you are doing. Rather, it is embedded into the day-to-day thinking and approaches.

AI Business: When you say you want a more holistic approach to AI at Wells Fargo, does that mean it extends from wealth management to retail services to the trading floor, etc.?

Mehta: It is all of the above. When I say experiences, I mean in the broadest possible sense that you could think of − whether it is giving advice to somebody, or whether it is self-service, or it could even be an independent advisor working with a client. All of that requires a high degree of data analysis and synthesis.

Trading algorithms have been active (in AI) for a long time. In fact, if you look at trading frequency, a big chunk of it is automated and bot-based across most stock exchanges, and has been for a long time. And then there is the angle of personalization and things of that sort for retail banking, where you have a lot of services that you offer as a financial institution (that are customized to the needs of individual customers). ...

Then there are things banks are supposed to do best, which is authentication, fraud management, KYC, anti-money laundering − all require a significant amount of AI/ML capabilities because it is all built on top of insights.

AI Business: How do you guard against an AI model being biased?

Mehta: AI models by themselves are just pure math so they are not necessarily biased. Rather, (bias could be in) the data that was used to train it, validate it and score it. So a lot of effort has to go into making sure that the data corpus that is used for testing has been checked in different ways, mathematically, structurally, to make sure that it is representative of the actual data that you would expect in the real world.

There is a degree of down-sampling, or imputation, that you do to make sure that you are not preferring one subset of data over another subset of data. There are a lot of techniques - mathematical as well as heuristic - where humans have to review random sampling to make sure that it is actually what we think it is and does not have a downstream bias. That is one step.
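A minimal sketch of the kind of representativeness check Mehta describes, comparing subgroup shares in a training sample against the shares you would expect in the real world. The function name, group labels, and expected shares are all hypothetical, chosen only to illustrate the idea:

```python
from collections import Counter

def representativeness_gap(sample_groups, population_shares):
    """Compare each subgroup's share in a training sample against its
    expected real-world share. Large gaps flag data that may need
    down-sampling or imputation before it is used to train a model."""
    counts = Counter(sample_groups)
    total = sum(counts.values())
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in population_shares.items()
    }

# Hypothetical example: region labels in a 100-row training sample
sample = ["west"] * 70 + ["east"] * 30
expected = {"west": 0.5, "east": 0.5}
gaps = representativeness_gap(sample, expected)
# "west" is over-represented by 0.20, "east" under-represented by 0.20
```

In practice a statistical test (e.g. chi-square) would replace the raw gap, but the structure is the same: measure sample shares, compare to a reference distribution, and rebalance when the gap is material.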

The second step is that when you do model development, you also do it as a very independent champion-challenger-type study, where you can compare it to how things are operating in the real world, whether that is done through AI or something else. But those models have to be validated in a real-life context without actually exposing it and (causing customer harm) if it were to go wrong. So think of it as a challenger, but running as an evaluation, making sure that it is doing what it is supposed to do.

While these steps increase the gestation period for something to go to production, they are needed to make sure that you are not creating runaway black boxes that you cannot control or do not understand.
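The champion-challenger evaluation Mehta describes can be sketched as a shadow deployment: the challenger scores live-style traffic silently while only the champion's output is acted on. The models and data here are hypothetical stand-ins:

```python
def shadow_evaluate(champion, challenger, requests, ground_truth):
    """Run a challenger model alongside the champion on the same inputs.
    Only the champion's output would be served to customers; the
    challenger is scored silently, so a bad challenger causes no harm."""
    champ_hits = chall_hits = 0
    for x, y in zip(requests, ground_truth):
        served = champion(x)    # what the customer would actually see
        shadow = challenger(x)  # evaluated, never exposed
        champ_hits += served == y
        chall_hits += shadow == y
    n = len(requests)
    return champ_hits / n, chall_hits / n

# Hypothetical toy classifiers on the same inputs
champion = lambda x: 1 if x > 0 else 0
challenger = lambda x: 1 if x > 1 else 0
scores = shadow_evaluate(champion, challenger, [0, 1, 2, 3], [0, 1, 1, 1])
# champion accuracy 1.0, challenger accuracy 0.75
```

Only when the challenger's silent accuracy holds up over time would it be promoted to champion.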

AI Business: Do you use your own datasets? Or do you also use external ones?

Mehta: It is predominantly our own datasets. And also, for very specific purposes, you cannot use everything for everything. There are legal limitations to what you can do and there are some regulatory limitations as well. And sometimes not all data is relevant to all scenarios. So we tend to be a lot more skewed to internal data, because we have a long tail there in terms of making sure that we are using it the right way. There are some cases where data is augmented from external sources – when we do not have the ability to understand some ambient context on its own from a signal. But I would say those are few and far between. The majority of datasets used would be internal.

AI Business: How do you check the model for AI drift? How do you make sure that it stays compliant?

Mehta: I will break it down into two parts, because I think this is really important.

First, when the model is being developed, we have an independent group called the Model Risk Governance Group – a group of data scientists and researchers who do independent validation of data sources, the models themselves and the way those models are being used for inferencing.

The way they do it is, in many cases, they create locally explainable models, they apply transfer learning, they build object models, they do variations of different techniques to identify whether the model meets the basic criteria around bias and deterministic outcomes, as well as (check whether the model can be simplified or there is a tried and tested solution that could be applied instead.)

On the MLOps side, which is what happens when you deploy something into production, the telemetry that it generates in terms of expected outcome, real outcome, and then data drift − all three are constantly reviewed for every model on a periodic basis, ... and the model developers have to independently certify, ‘this was the drift I expected and we have to retrain the model’ or ‘this is not a drift I expected’ … and this model is not doing what it is supposed to and it is a production issue.

AI Business: How do you handle explainability in your AI models?

Mehta: Explainability by itself has been a big research topic across the industry, not just for financial services. If you take a very large transformer, like GPT or LaMDA, we are talking about hundreds of billions of parameters and fairly complicated architectures − so it is very hard to pinpoint why one is giving out one outcome versus the other.

There are techniques being developed where you can actually break this down into explainable chunks. There is a paper that Stanford published called HELM, or Holistic Evaluation of Language Models, which is going to be the next generation of research that we have to do to bring these large language models into production. But by and large, explainability is negatively correlated with the complexity of the model itself. So we tend to have a strong incentive to not use very complicated models.

Another thing to do is deployment of post hoc processes where you can actually take localized sampling of data where you can say, ‘these attributes change and the model behaves like this.’ So you can apply some very basic mathematical differential techniques to say, ‘the local explanation of the weight is valid based on the value of an attribute or the weight of an attribute.’ But in general, if you are talking about large language models, like GPT, or BERT or LaMDA, this is an evolving field so we will be extremely careful about where we are deploying them, how we are deploying them, and what are the guardrails around them.
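The post hoc technique Mehta outlines − nudge one attribute and watch how the model's output moves − amounts to a finite-difference local sensitivity. A minimal sketch, with a hypothetical linear model standing in for a real one:

```python
def local_sensitivity(model, x, attribute, delta=1.0):
    """Perturb a single attribute of one input and measure how the
    model output changes: a crude local explanation of that
    attribute's effective weight around the point x."""
    x_perturbed = dict(x)
    x_perturbed[attribute] = x[attribute] + delta
    return (model(x_perturbed) - model(x)) / delta

# Hypothetical model and input; for a linear model the sensitivity
# recovers the attribute's true coefficient
model = lambda x: 2.0 * x["income"] + 0.5 * x["tenure"]
point = {"income": 10.0, "tenure": 4.0}
sensitivity = local_sensitivity(model, point, "income")  # 2.0
```

Tools like LIME and SHAP build on the same idea with local sampling rather than a single perturbation, but the principle − explain the model only in the neighborhood of one input − is the same.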

AI Business: You mentioned ChatGPT and LaMDA. Is Wells Fargo integrating them into some of your models?

Mehta: Yes. It is in beta right now with our employee pilot, and it is going to roll out. It uses a large transformer built by Google for language processing, which is comparable to GPT-3. We are not directly integrating with GPT yet, but at some point, these large language models are going to be interesting enough that we will end up using them in some context.

AI Business: Are you going to be using these large language models to improve your chatbot experience for users?

Mehta: The way the model interaction works is that you train these bots to extract something called entities and intents, whether it is a text-based or speech-based interaction. Let's say I, Chintan, tell the chatbot to move $100 from my checking account to my savings account. From that phrase, you have to actually extract that Chintan is a user, that the checking account and savings account are the two entities, and that the intent here is to move the money.

So first, you have to unpack parts of speech. Then you have to extract the nouns and other words out of it. What we are doing with the language model that we are using, and it's part of the Dialogflow offering from Google, is to extract those entities and intents. Once entities and intents are extracted, we have APIs on the back of it where we can basically go do those interactions where applicable. Where we cannot, we respond back saying, ‘Okay, I cannot do it on the chat. But here's a link into where you can do it yourself.’ We understand what you are trying to do, except that we cannot do it in the chat container. But over a period of time, my hypothesis is that interactions will be more and more contained within that nonlinear section.
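To make the entity/intent idea concrete, here is a deliberately toy extractor for the money-movement phrase in Mehta's example. Production systems such as Google's Dialogflow use trained language models rather than regexes; this regex, the intent name, and the field names are illustrative assumptions only:

```python
import re

def extract_intent(utterance):
    """Toy entity/intent extraction for a phrase like
    'move $100 from my checking account to my savings account'.
    Returns the intent plus the entities the back-end APIs would need,
    or None when the utterance is not understood."""
    m = re.search(
        r"move \$(?P<amount>\d+(?:\.\d{2})?) from (?:my )?(?P<src>\w+) "
        r"account to (?:my )?(?P<dst>\w+) account",
        utterance.lower(),
    )
    if not m:
        return None
    return {
        "intent": "transfer_funds",
        "amount": float(m.group("amount")),
        "source": m.group("src"),
        "destination": m.group("dst"),
    }

result = extract_intent("Move $100 from my checking account to my savings account")
# {'intent': 'transfer_funds', 'amount': 100.0,
#  'source': 'checking', 'destination': 'savings'}
```

The structured result is exactly what the bank's downstream APIs consume: once intent and entities are known, the transfer either executes or the bot hands the user a link, as Mehta describes.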

About the Author(s)

Deborah Yao


Deborah Yao runs the day-to-day operations of AI Business. She is a Stanford grad who has worked at Amazon, Wharton School and Associated Press.
