Arthur Hu cuts through the hype of generative AI deployment in the enterprise and observes that running inference might be costlier than expected.

Deborah Yao, Editor

September 13, 2023

23 Min Read

Arthur Hu, the global chief information officer at Lenovo and chief technology and delivery officer of its Services and Solutions Group, joined the AI Business podcast to kick off the series’ Season 3 premiere.

In the podcast, Hu cuts through the hype of generative AI deployment in the enterprise and shares that running inference on trained models might be more expensive than expected, among other insights.

Listen to the podcast below or read the edited transcript that follows.

AI Business: Can you tell our listeners a little bit about what you do? I understand you hold dual roles at Lenovo.

Arthur Hu: I've been the chief information officer for the company for the last seven years. Then in the last year, I have also become the chief technology and delivery officer for our Services and Solutions group or SSG. In the IT role, it is about driving the company's transformation and digitalization. In my newer role as the technology and delivery officer for SSG, that's really about building the delivery engine for services for the company and then using technology to enhance our offerings.

AI Business: At a high level, can you tell us about Lenovo’s key investments in and initiatives around AI?

Hu: AI is something Lenovo has been committed to for the better part of a decade, and it's something we certainly saw coming early. We've really developed and applied AI across our business. When I review the breadth and depth of how we have integrated AI to help power our business, it's really become embedded everywhere, and I'm actually hard-pressed to find any function, geography, or region that's not using it.

That spans our marketing, our sales, our customer services, our supply chain, and manufacturing. That's the result of a very concerted effort over time to make sure we're getting the benefit of AI in all the appropriate areas of the company. Now, specifically, more recently, we've committed to investing over $1 billion in the next three years so that we can accelerate both our internal AI deployment as well as enriching our own offerings that are going to be AI-enabled, and AI ready.

There's a lot of AI implementation, there's a lot of excitement there, but there's also a lot of noise that we need to help our customers simplify, especially as we're looking at new sources of data, new computing architectures that include not just the traditional client and cloud, but increasingly, the edge woven together through sophisticated network solutions and with AI embedded, so that $1 billion is both for internal implementation, as well as creating solutions and an AI-ready portfolio for our customers.

Now, specifically on generative AI, which has really burst into the public consciousness, … we've committed a further $100 million to develop the AI Innovators program at Lenovo. This is to tap into the very rich ecosystem of software partners that are working with Lenovo to deliver ready-to-deploy AI solutions across computer vision and prediction, security, and virtual assistants. So I think from our long-term commitment to AI, as well as to generative AI in particular, we're getting ourselves ready both internally and externally in making these solutions available to our customers.

AI Business: Can you give us a sense of the level of adoption or interest from your clients or customers in generative AI solutions?

Hu: Right now, it's certainly one of the hot topics. When we look at our CIO research, AI, machine learning, and generative AI − even though it's so new − are really driving increased focus in this area. (Gen AI) is actually absorbing a disproportionate level of attention, as well as the focus of the proofs of concept that many companies are looking to do. That's very natural, because there's so much hype around it that everyone is exploring to see what's actually underneath the hype. … This will be one of the topics that takes up a lot of attention in executive discussions and even board discussions.

When I meet CIOs in the industry, every single one is working with their business to identify their portfolio of generative AI use cases and how it can actually be impactful. Now, at the same time, as much as there is interest, CIOs are the ones who are charged with deploying it and creating value. And so I also hear from customers who are a little bit nervous that people have exceedingly high expectations about what generative AI can do.

There's also an aspect of deploying and harnessing part of that attention for education, because what we don't want to do as an industry is have people have overly high expectations and then ultimately become disappointed. We want to mitigate some of the highs and channel that into fully understanding this technology so that we can more quickly get on with the business of deploying it properly. It’s an exciting technology but at the end of the day, it's still a tool that has its uses. And it also has the areas where it may not work so well.

AI Business: Can you walk me through a little bit of the conversation you or people under you have with customers in terms of painting a practical picture of what generative AI can and cannot do?

Hu: This one actually plays to the strengths of CIOs and technologists around the industry. At its base, generative AI is really a prediction engine. One of the analogies that seems to help business leaders grasp this quickly is that in its current form, generative AI is a super autocomplete. That applies to a press release, a memo, an email response, and it even applies to code if we're talking about the large language models, or LLMs, behind ChatGPT, GPT-3 and 3.5 that have generated so much excitement.

At the heart, it's really a probability engine that takes the input that you give it and then generates output it imagines you want. In that context, you begin to see that it may not be the best option for (use cases in which) you need an exact answer every time. If you are doing a very precise business analysis around financial numbers or modelling and you need a deterministic answer (that does not change), then generative AI is inherently less suited for that because of its probabilistic nature versus being deterministic.
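Hu's point about probabilistic versus deterministic output can be made concrete with a toy sketch. The prompt, the vocabulary, and the probabilities below are invented for illustration and come from no real model; the point is only that sampling from a probability distribution means the same input does not always produce the same output.

```python
import random

# Toy next-word distribution for the prompt "the quarterly revenue" --
# illustrative numbers only, not taken from any real model.
next_word_probs = {
    "grew": 0.5,
    "fell": 0.3,
    "stagnated": 0.2,
}

def sample_next_word(probs, rng):
    """Pick a word according to its probability -- the core of why
    generative output is probabilistic rather than deterministic."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random()
# Over many samples, every continuation eventually appears -- the same
# prompt yields different answers on different runs.
completions = {sample_next_word(next_word_probs, rng) for _ in range(1000)}
print(sorted(completions))
```

This is why, as Hu notes, a workload that needs the same exact answer every time is a poor fit: by construction the model draws from a distribution rather than computing a single deterministic result.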

Another example would be giving it too much freedom for independent judgment. Because it has such an enormous breadth of input, it is very convincing in conversations, creating de novo, or from-scratch, content for you. It, of course, has hallucinations. … (The output) is untrue, (but) it doesn't mean the algorithm is not working. The light switch goes on for business leaders who may have read a memo that sounds very convincing but is entirely false: it's important for them to understand … that's how the algorithm works. Once you have those insights, you can begin to construct a better mental model for how to use this.

One of the analogies that seems to help business leaders grasp this quickly is that in its current form, generative AI is a super autocomplete.

The final thing to round this out is the discussions we're having with our customers around security. This happens on a couple of levels. First, there have been multiple public incidents where employees who were experimenting with it inadvertently leaked internal company information into the public domain. When you use a publicly hosted large language model, it's quite possible they're taking your input to further enhance, train, and refine their algorithm. Unfortunately, when you do that, whatever input you gave it may go into the public domain.

The other part of the discussion that's quite interesting, and also quite meaningful, is around data sovereignty. Many more of the regulatory regimes across the world are becoming much more specific about how they want to see consumer data protected and how they want to see certain sensitive data treated.

AI Business: Given your comment that generative AI is probabilistic instead of deterministic, what kind of use cases do you think it is best suited for?

Hu: We focus on AI’s augmentative aspect. … It is something that is with you as a way to augment what it is you're doing rather than replace you. So the augmentation aspect of using AI has been very important for us. And when we frame it that way, then it's really about helping accelerate tasks significantly, especially ones that would require content creation.

In the software engineering realm, we see the ability to improve productivity in writing code, and we see improvements in the ability to check whether that code is accurate, as well as its test results. In areas that require text generation, we see generative AI drafting pieces of business communications, whether internal memos, email responses, or even press releases. It can produce a very good first draft that our teams are then able to iterate on more quickly. So those are very much in the assistive and augmented mode.

Generative AI has also been very good around more nuanced and multi-layer intent understanding. So in this realm of delivering services and being responsive to customers, it's a potentially very good way of improving customer experience. Many of us as consumers have had the frustrating experience of a very simple chatbot where if you even have a single word out of place, or misspell it, it won’t know what it is you're talking about. The large language model-powered chatbots are much more capable of having human-like conversation and understanding.

AI Business: Given that you said generative AI is not deterministic, does that mean that overall, generative AI will have less impact in transforming a company's operations than what people expect?

Hu: I think that's to be determined. I'm not leaning either way. We are just at the very beginning of understanding how generative AI would rewrite the operating manuals, so to speak, for companies. … In operations, there are some aspects where, for example, a part is a part, is a part. You don't want that one part to be a different part, you don't want it to be five parts. For us, we don't want three computers to turn into one computer in the systems. So there are absolutely aspects where there are very good tools already that do that.

Let me give you another example to illustrate. All companies run significant estates of computing and storage, whether in the form of on-premises traditional data centers, private clouds, public clouds, or some hybrid multi-cloud mix. Servers, storage, and networking devices are famous for generating a massive amount of data, or telemetry: alerts, updates, threshold crossings that signal an event of some interest might have happened, changes in the machine's operating state. Machine learning on that telemetry is something that's already been quite powerful in helping companies manage those significant estates of computing effectively.

Now, one of the things that we're always wrestling with, because there's so much data, is how to make sense of that. So I think it is very promising in the operation space for administering and managing the cloud, both public and private, to apply generative AI to understand and ingest the massive amounts of data that are coming off, and then really find the trends that are happening. How do you do the correlations? What events seem to be associated with stability? What kind of trends and underlying occurrences seem to be associated with incidents or customer service disruption? And so here, it's probably wise to take a segmented view for things that are already working very well.

There are some areas of operations where you don't necessarily need to use generative AI because we have perfectly fit-for-purpose solutions. But there are areas that are totally untapped around understanding and correlating across large knowledge bodies and bases of data − and we can take those data streams and use generative AI to extract and get much more insights that are going to be beneficial for improving and even redesigning operations.
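The kind of correlation Hu describes − asking which telemetry events tend to precede incidents − can be sketched in a few lines. The event names and the tiny dataset here are invented stand-ins, not real infrastructure telemetry; at real scale this is where large models and more sophisticated statistics earn their keep.

```python
from collections import Counter

# Toy telemetry: (event_type, led_to_incident) pairs -- illustrative
# data only, not drawn from any real fleet.
events = [
    ("disk_latency_spike", True), ("disk_latency_spike", True),
    ("disk_latency_spike", False), ("fan_speed_change", False),
    ("fan_speed_change", False), ("memory_ecc_error", True),
]

# Count occurrences overall and occurrences that preceded an incident.
incident_counts = Counter(e for e, bad in events if bad)
total_counts = Counter(e for e, _ in events)

# Rank event types by how often they preceded an incident -- a crude
# version of the trend-finding Hu describes for operations data.
for event_type in total_counts:
    rate = incident_counts[event_type] / total_counts[event_type]
    print(f"{event_type}: {rate:.0%} of occurrences preceded an incident")
```

In this toy data, fan-speed changes never preceded an incident while memory ECC errors always did − the sort of signal an operations team would want surfaced automatically from millions of events.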

AI Business: What would be your advice to other CIOs when they're looking at this wild west of generative AI applications, solutions, models and platforms? How do you make sense or organize all of that?

Hu: The first lens would be around the business. Lenovo, for the better part of the past decade, has been driving adoption appropriately in each part of our business. It's important to make sure you bring in the business side, because they're the ones who are ultimately going to help you identify the high-value cases, and also provide the proving grounds to see what is or isn't working. The flip side of that is making sure you avoid the technology teams just doing a lot of PoCs that are purely technical but less grounded in what the business cares about. So point number one is: bring the business in early.

The second part is to invest the time to help the business side understand it so they can have more meaningful conversations. For example, I was just in front of one of our senior leadership forums, showing demos − not just explaining theory but showing what software engineering augmented by generative AI looks like, showing how exactly we're applying this technology to improve our AI operations in the infrastructure management space, and how exactly multilevel and nuanced intent identification leads to a better experience. I really can't overstate the need to keep the business side educated, because then you make them much more powerful partners, able to accelerate and say, ‘yes, this works or doesn't,’ and join me in identifying the things of interest.

Thirdly, (from a technology viewpoint), it’s important to view this and understand that it is early days. Even for my team, some committed very early and said, ‘Oh, this is the model we should use.’ But because it's so early, there are things that are changing every day, every week. Last November is when (generative AI) really burst onto the public scene. Of course, (ChatGPT-maker) OpenAI has been working for many years on this but in the public consciousness, this is really new.

With the introduction of the Llama model, for example, which Meta put into the public domain, there are just so many experiments going on, and the frontier of what's possible is advancing in a gratifying way. That's really fun to see. There are things happening around the ability to compress large language models into medium language models, or even small language models that can fit either at the edge or even on a client device like a phone or a computer. There are massive advances happening there, as well as in the strategies for making these large language models much more tailored to a specific context.

And so on the technology side, I think it's absolutely important that IT teams and technology teams set up their architecture so that they're able to tap into the ecosystem, that they don't commit believing there's only one path forward − because there's so much happening and I think everyone would benefit by keeping abreast of the various developments and having an architecture that's open and able to work with different models for different scenarios.

AI Business: When you're bringing in the business side to look at how you're going to deploy and implement AI, what kind of KPIs do you use? Do you use business KPIs or do you use technical KPIs?

Hu: Both. But I do think you still need to start with a business outcome because otherwise you don't have a True North. Without a very crisp statement of the work to be done and the problem to be solved and the value that we are pursuing, then it's difficult. For example, we would like to increase our customer Net Promoter Score (NPS), after working with the service desk, by five points. That's very concrete; it's very easy to understand. At the same time, you do want the technical aspects. There are absolutely a set of (metrics) around performance. Is it stable? Is it consuming the target set of resources, not more, not less?

There's actually something else that’s really interesting. Because generative AI is so different in terms of capabilities to assist humans with knowledge-based work, I do think there is this notion of understanding what is the budget – how much does it cost in the equivalent of an augmented machine helper?

I say that because one of the things that's not as fully appreciated is the potential cost of this. There's been a lot of press around how expensive it is to build a new foundation model like Anthropic’s (Claude and Claude 2), (Meta’s) Llama, or (OpenAI’s) ChatGPT. It takes thousands of GPUs running for months, and millions of dollars of investment, to train one.

For scenarios where there are many users in parallel, inference can actually become quite expensive.

After it's trained, though, the actual work of using the model − as you know, it's called inference − is way cheaper than it costs to train the model. But for scenarios where there are many users in parallel, inference can actually become quite expensive. So even if it is orders of magnitude less expensive to run a single inference operation, … if you're going to have thousands or maybe tens of thousands of people in your service network − consumers who are trying to get an answer to a question − … it can actually be computationally quite expensive. So in terms of metrics, make sure you fully understand the implications of the technology you're rolling out, because it could add up to considerable costs.

We may say, ‘How much does it cost to run the robot equivalent of this? And how does that compare with having a person do it?’ It sounds silly, because historically the implicit assumption has been that, of course, if the computer can do it, it will be cheaper. So depending on how the cost curve changes − based on algorithmic effectiveness, future computing power, and how many trillions of operations a second it can perform − it is something for companies to monitor, because at scale the cost of some of these operations can be material.
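The scaling concern Hu raises comes down to simple multiplication. The back-of-the-envelope sketch below uses entirely hypothetical figures − no vendor's actual pricing − to show how a fraction of a cent per inference call still becomes a material line item at service-desk scale.

```python
# Back-of-the-envelope inference cost model. Every number here is a
# hypothetical assumption chosen for illustration, not a quoted price.
cost_per_query = 0.002         # assumed dollars per inference call
queries_per_user_per_day = 20  # assumed service-desk usage pattern
users = 10_000                 # assumed user population

# Cost scales linearly with both usage and population.
daily_cost = cost_per_query * queries_per_user_per_day * users
annual_cost = daily_cost * 365

print(f"Daily inference cost:  ${daily_cost:,.2f}")
print(f"Annual inference cost: ${annual_cost:,.2f}")
```

Under these assumed numbers the annual bill lands in the low six figures − cheap per call, yet large enough that, as Hu suggests, the machine-versus-person cost comparison is worth tracking rather than assuming.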

AI Business: Do you think that costs are going to come down over time?

Hu: Absolutely. We're shifting from historically what's been CPU-centered compute in our cloud data centers into something that's going to be much more mixed or heterogeneous. You can see even in the roadmaps of the large silicon providers that they are pursuing CPUs, GPUs, neural processing units, data processing units or DPUs, so there's going to be increasing specialization to tailor the kind of underlying hardware and computing that is going to run this.

AI Business: One question before we dive into what Lenovo is doing with AI − can you give us a sense of how companies from different countries and regions view AI? Are they as excited as people in North America and Europe?

Hu: Yes, and I'll preface that by saying that I spend a lot of time in North America, Europe, and Asia. … The business community globally, from the discussions I've had in the Americas, Europe, and Asia, is pretty uniformly interested. They understand that generative AI is going to be a major focus and quite impactful. There's a general sense also that this is a fundamental technology that will continue to reshape how companies operate, and maybe even reshape how jobs are structured.

AI Business: So turning to Lenovo, can you share with us a use case using either generative AI or just AI in general from both your company and your clients?

Hu: There are so many, but I'll pick a quality (inspection) example. … In factories, quality inspection has historically been very labor intensive. You have someone who's actually checking the output coming off the line. Because of its repetitive nature, this lends itself very well to both machine learning and computer vision. You can show the system samples that are positive (for defects) and negative samples that are free of any defects, and you train the machine learning model to (approve the defect-free) products. Defective products we want to flag for rework, to make sure they don't make their way into the hands of customers.

We've been very successful in actually using computer vision and machine learning models to automate our quality monitoring. That's actually also helped us reduce the costs to serve our customers because we can reduce the number of customers getting defective products and that's going to be a significant positive impact.

The interesting thing is it also restructures the nature of work in a very good way. Imagine you're on a quality (inspection) shift and you’re looking at an assembly line for four hours. It's really hard work to stay focused on a single task for so long. By augmenting it with AI, we've been able to upgrade those skills and restructure the job toward thinking about how to use the quality data and do the job with the machine instead − so I think that's had a very positive impact.


The other aspect is actually an example of a closed loop. You need to be able to design in quality from the start, rather than just check for quality at the end. … Because of the significant amounts of data that are coming off our assembly lines − we're shipping many devices every second, that's Lenovo scale − we're also able to understand what design points and choices are more likely to lead to reliable, easy-to-assemble, no-defect products, as well as (spot) products that are more complicated or more prone to assembly and manufacturing defects. Using machine learning on that stream of data then becomes a valuable source of input for our design teams who are building the next generation of products.

With our customers, I'll use an example from the services space. The service desk is one of the offerings the Services and Solutions Group provides to our customers. We are currently enhancing our service desk to infuse it with generative AI. We've already seen in proofs of concept we're trying out with our customers − and also within Lenovo − that it is leading to increases in understanding of the intent (of the end user). The generative AI-enhanced chatbots are able to grasp (the intent of people who ask them questions and give them the right answer). The follow-on effect is that our internal users, as well as the customers we're testing it with, tend to report higher satisfaction. You feel you had the proper interaction to achieve your goal.

AI Business: What are some challenges you encountered deploying these proofs of concept? And what advice would you pass along to other people in your shoes?

Hu: On proofs of concept, scoping is important. You don't want to bite off more than you can chew because it's important that proofs of concept deliver something quickly, so that you can take advantage of the time you're harnessing the business side’s attention. If you have a proof of concept that drags on for a half year or nine months without a clear output, it's easy for the business side to lose interest. … Pick a timeline within a quarter, whether that's two months or six weeks, but really force the pace to have something that shows this could be useful − with a preliminary result one way or the other. This has been helpful to keep people interested and focused in the proper way.

The second part is understanding the limits of the technology. For example, there are some issues that generative AI can't fix. So in order to have a good proof of concept, you want to spend time making sure you also have the prerequisites. In some parts of our PoCs around improving our service desk, we found that part of the reason we weren't getting the ideal results was we needed to enhance some of our underlying knowledge base to bring the right context or history into the discussion so the model could be properly tuned. You can train a chatbot all you want, but if it fundamentally doesn't have the domain-specific knowledge needed to meaningfully engage with users, you're not going to get the right outcome.

It's important to check not just the scope, but also whether you have the other necessary ingredients. Technology is a tool and it definitely suffers from a garbage in-garbage out phenomenon. You have to make sure you're giving it context and the right domain-specific knowledge for the scenario you're looking to accomplish.
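Hu's prerequisite − that a chatbot is only as good as the domain knowledge behind it − can be sketched as a minimal retrieval step. The knowledge-base entries and keyword matching below are hypothetical stand-ins, not Lenovo's actual service-desk system; real deployments would use embeddings and a proper retriever.

```python
# Minimal sketch of grounding a chatbot answer in a domain knowledge
# base. Entries and matching logic are illustrative stand-ins only.
knowledge_base = {
    "warranty": "Standard warranty covers parts and labor for 12 months.",
    "battery": "Batteries are covered for 12 months under standard warranty.",
}

def answer(question: str, kb: dict) -> str:
    """Ground the response in retrieved domain knowledge; fall back
    honestly when the knowledge base has no relevant entry."""
    for topic, fact in kb.items():
        if topic in question.lower():
            return fact
    return "I don't have information on that yet."

# With the domain entry present, the bot can answer meaningfully...
print(answer("How long is the warranty?", knowledge_base))
# ...without it -- garbage in, garbage out -- it cannot.
print(answer("How long is the warranty?", {}))
```

The design point mirrors Hu's observation: tuning the model was not the bottleneck in their PoCs; enriching the underlying knowledge base was.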

AI Business: What are some common mistakes that you've seen?

Hu: (One would be) to start investigating a technology but without a focal point towards a business scenario. The other point is communication. I’ve seen that people perhaps get overexcited about the technology, and they don’t build the bridge so that the business side can walk across it. You're a techie and you're really interested in the new tech stack: What are the latest advances? How exactly are we compressing algorithms? What's the coolest vector database that you can use for embeddings?  

I've seen technologists get so excited because there's a lot going on, and it is genuinely exciting. But they lose the plot with the business side, (helping them) understand what the technology means to the company. Third, it is getting too hung up on any particular solution. Teams need to be prepared to experiment and to figure out how to evolve their solutions if they want to continue to take advantage of new innovations as these are happening so quickly.

AI Business: I want to talk about Lenovo research as well as any partnerships you may have with other research institutions.

Hu: One of our focus areas in the research space is to create domain-specific models. … There's a lot of work to be done on how to take these general foundation models and make them much more specific and also smaller. You can think of it in two dimensions. One is adding industry specific and domain specific knowledge, and the second one is making them smaller so that we can make them much more applicable.

We offer the industry's most comprehensive portfolio of client, edge, server, and storage devices. But as you know, some of the largest models are too big to run on anything but the cloud or some really large computing devices. As we make advances to make them smaller, it really democratizes the technology, because you can start thinking, ‘Well, maybe I don't need an 8-way server or a huge rack to run this. Maybe I can run it on a smaller, light edge server.’ And as computing gets even better, maybe we can put it on a PC, and even some version of it on a phone. So one theme is domain specificity, the other is democratization.

AI Business: I want to ask you a question that's been bugging the AI community, actually, globally. Where do you stand in terms of AI being an existential threat to humanity?

Hu: I would say that while foundation models and large language models are absolutely a huge leap forward in terms of capability, the science-fiction AGI − artificial general intelligence that's (akin to) my brain in a box − is still, I think, quite far away. Humanity has enough challenges on its plate to focus on. If we continue to use the current technology for augmentation, there's a lot of runway before we should start worrying about some of the more existential questions.


About the Author(s)

Deborah Yao

Editor

Deborah Yao runs the day-to-day operations of AI Business. She is a Stanford grad who has worked at Amazon, Wharton School and Associated Press.
