April 4, 2017
We had the opportunity to talk to VMware’s chief research officer, David Tennenhouse, about his views on artificial intelligence, how it is being used within VMware and how the new technology is helping customers engaged in digital transformation.
VMware is currently one of the global leaders in cloud infrastructure and digital workspace technology. We sat down with its chief research officer, David Tennenhouse, to hear his views on AI and how VMware’s enterprise customers can derive the most benefit from it.
“At VMware, we’ve explored a number of applications of machine learning within our products,” said Tennenhouse. “For example, LogInsight is inherently a big data product, with machine learning at its core”. VMware has also worked to embed machine learning within its larger systems that manage servers and whole data centres, such as vSphere and vRealize, in order to improve functions such as resource allocation, dynamic scaling and hybrid cloud operations.
Looking forward, VMware researchers and developers are seeking ways to analyse configuration data so that they can remotely help customers diagnose and address configuration problems. Tennenhouse was keen to stress that this activity is still in the research phase and that they will approach data collection with a degree of caution. “We’ve always been very serious about the privacy of our customers’ information and what we collect and how we use it,” he said.
Tennenhouse also told us that his research team is looking above the infrastructure layer to better understand the problems customers are trying to address and find ways to help them achieve their business goals. One of his key observations is that there is a world of difference between the AI needs of most enterprise customers and the needs of the hyper-scale, consumer-facing companies whose AI-based technologies have become all the rage.
“One of the things that my team and I have been looking at – and this is early days – is something I refer to as ‘big data for the 99%’, and by that we mean 99% of enterprises. The consumer-facing folks, such as Google and Facebook, have been using ML – in conjunction with enormous numbers of servers and volumes of data – to address a fairly narrow range of problems, such as search indexing, photo recognition and social network graphs.” He went on to talk about how he and VMware want to do the opposite.
“The vast bulk of enterprises have a broader range of problems that can be addressed by AI, yet those problems require far fewer resources. These customers are looking to work with data sets that are orders of magnitude larger than they have worked with before, but they are not Google-scale data sets, nor are they looking to use hundreds of thousands of servers – it’s more like hundreds or a few thousand. That causes one to wonder whether the impressive AI algorithms and big data frameworks that have been pioneered for hyper-scale are ‘right sized’ for the 99%. Can we find alternatives that are better, faster, cheaper?”
He continued, “Certainly if you start looking, say, at the FTSE companies, very few of those have problems that need to scale out across hundreds of thousands of servers. Very few of those are going to benefit from hyper-scale AI.”
Tennenhouse then turned to what he believes is one of the key areas of machine learning that is not getting sufficient attention: front-end tools, especially those for domain experts. “One of the key areas our researchers think has been under-served is the front end of this whole process.”
“Technologists who are passionate about scaling up their favourite ML model sometimes just cut to the part of the problem they like to work on, by assuming their model is the right one for the problem at hand”. In the real world, though, there is an important first step. “You need to find someone with sufficient domain and analysis expertise – and equip them with great tools to look at samples of the data. For example, they need to ask ‘Is there any signal in this data at all, or is this just all noise?’ Once the analyst is convinced there is a signal they can then determine what should be done in order to cleanse and normalise the data, which ML model is most appropriate, etc.”
He continued, “I’m a great believer that you can almost always extract some value out of data but it could well be that, in your collection mechanism, you lost track of the exact format of the data. There is signal, but we need to know a little more about the data collection in order to find it. I hate to admit this, but we have had cases in looking at our own data, where this first step revealed deficiencies in the data collection,” he said.
“We think the front end that helps people through that first step is a neglected area. It’s the sort of thing you don’t really need if your goal is to use AI to address a handful of problems at hyper-scale, yet it’s absolutely critical to the 99% of enterprises who will apply AI to a wide range of business challenges. Jumping to the assumption that a specific ML approach, such as deep neural nets, or a scale-out framework that has been optimised for hyper-scale applications should be used just doesn’t make sense,” concluded Tennenhouse.
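As a rough illustration of that first step, here is one simple way an analyst might ask whether a sample contains any signal at all: a permutation test. This is a minimal sketch under our own assumptions, not a VMware tool and not something Tennenhouse described. The function names `pearson` and `permutation_p_value` are hypothetical, and correlation is only one of many possible measures of signal.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Estimate how often shuffled (signal-free) data looks as correlated as the real data.

    A small value suggests the relationship between xs and ys is unlikely to be
    pure noise; a large value suggests there may be nothing to model yet.
    """
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # The +1 terms keep the estimate away from an impossible p-value of zero.
    return (hits + 1) / (trials + 1)


if __name__ == "__main__":
    # Hypothetical sample: a feature and an outcome that tracks it with noise.
    rng = random.Random(42)
    feature = [float(i) for i in range(50)]
    outcome = [2.0 * x + rng.gauss(0, 5) for x in feature]
    print(permutation_p_value(feature, outcome))
```

A check like this runs in moments on a laptop-sized sample, which is the point: it comes before any decision about models or scale-out frameworks.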
Tennenhouse’s education and career have been focused on building highly scalable systems. But today, his focus has changed. “At MIT and DARPA I was always challenging folks to build systems at ever larger scales, so it is humbling to be in a position where I’m saying, ‘You know, I think the research community has overshot the mark a little on scaling’, and that we might need to take a bit of a breather and focus our energy on how to make big data accessible to a wider range of problem cases,” he said.
Another enterprise concern that Tennenhouse believes is going to be critical to many applications is something he called “Explainable AI”. “One of the issues people have always had with neural nets,” he continued, “is that they usually get the right answer but we don’t know how and why they get there, i.e., we don’t have convincing explanations for their answers”.
Here again, Tennenhouse has a personal connection to the issue. “Early in my career I learned that insights and intuition will only take you so far; in order to have broader impact I needed to work backwards from my intuitive answers and come up with the ‘chain of reasoning’ as to why they were correct. Absent that explanation I couldn’t convince my colleagues in the research community or the executives of large companies that my intuition was correct. AI faces a similar problem,” he said.
Knowing how and why their AI models come to conclusions will be vital for many enterprises, according to Tennenhouse. “When enterprises get into more serious stuff than consumer photo recognition, say when they are making decisions about whether or not to extend credit to an individual, they or their regulators will need to have an explanation as to ‘Why did I conclude this person should be offered credit and this other person shouldn’t?’ Similar concerns will apply if you are basing ‘make or break’ decisions on the results of an AI model,” he explained.
“DARPA recently started a research program on Explainable AI that could break new ground of enormous importance to both governments and enterprises. I think the need for Explainable AI and having chains of reasoning is going to become more and more pressing,” he said.
He added that there may be many cases where enterprises will opt for better understood ML techniques rather than those that are the most scalable. “At the end of the day enterprises may forsake deep neural nets and end up applying older techniques that are better understood, if they turn out to be more explanation-friendly and scale sufficiently for the problem at hand,” he said.
Tennenhouse also thinks that today’s enormous focus on AI and ML suggests that we may be missing other types of big data opportunities. For example, one of VMware’s researchers has set records in an area known as SAT-solving, which has to do with the satisfiability of systems of constraints. Like ML, this is an area of computer science in which dramatic progress with respect to both algorithms and systems has been game-changing. Researchers can now build SAT solvers that find solutions to problems with millions of constraints.
Today, these are largely used to address formal methods problems related to software and hardware verification, but Tennenhouse is convinced there must be a wide range of business problems that they can be applied to. “Anytime you see dramatic progress with a tool that addresses as fundamental a problem as SAT, which is NP-complete, you just know in your gut that there must be a whole bunch of really important business problems that it can also be applied to. It also makes me wonder what other types of business problems ought to be re-visited in light of algorithmic advances”.
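For readers unfamiliar with SAT, the sketch below shows the idea on a toy scale: a handful of yes/no decisions, a list of constraints over them, and a solver that either finds a consistent set of choices or proves none exists. It is a minimal illustration of the classic DPLL procedure, written for this article under our own assumptions. It is not VMware’s record-setting solver, and the solvers that handle millions of constraints rely on far more sophisticated techniques. The function name `dpll` and the example constraints are hypothetical.

```python
def dpll(clauses, assignment=None):
    """Tiny DPLL SAT solver.

    clauses: list of clauses, each a list of non-zero ints, where 3 means
    "variable 3 is true" and -3 means "variable 3 is false".
    Returns a dict {variable: bool} satisfying every clause, or None if the
    constraints cannot all hold. Variables left out of the dict are free.
    """
    if assignment is None:
        assignment = {}

    # Simplify every clause under the current partial assignment.
    simplified = []
    for clause in clauses:
        satisfied = False
        remaining = []
        for lit in clause:
            var, want = abs(lit), lit > 0
            if var in assignment:
                if assignment[var] == want:
                    satisfied = True
                    break
            else:
                remaining.append(lit)
        if satisfied:
            continue
        if not remaining:
            return None  # every literal is false: conflict
        simplified.append(remaining)

    if not simplified:
        return assignment  # all clauses satisfied

    # Unit propagation: a one-literal clause forces that literal's value.
    for clause in simplified:
        if len(clause) == 1:
            lit = clause[0]
            return dpll(simplified, {**assignment, abs(lit): lit > 0})

    # Otherwise branch on a variable and try both values.
    var = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None


if __name__ == "__main__":
    # Hypothetical business-style constraints:
    # 1 = run the job at site A, 2 = run it at site B, 3 = use the backup link.
    constraints = [
        [1, 2],    # the job must run at A or B
        [-1, -2],  # but not at both
        [-1, 3],   # running at A requires the backup link
        [-3],      # the backup link is unavailable
    ]
    print(dpll(constraints))
```

Scheduling, placement and configuration questions often reduce to constraints of this shape, which is one reason progress in SAT solving may carry over to business problems.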
Security is another area that Tennenhouse views as ripe for research reinvention. “AI will also have a role to play in improving security but we also need to look for systemic approaches to cyber-defence. VMware and its partners have been leveraging its virtualisation expertise, both with respect to the hypervisor and the network, to create new capabilities such as VM introspection and network micro-segmentation.”
Tennenhouse went on to describe the newly formed NSF/VMware Partnership on Software Defined Infrastructure as a Foundation for Clean-Slate Computing Security. The program is inspiring the research community to invent new ways of leveraging virtualisation to secure cloud computing. “We have put our money where our mouth is by partnering with the U.S. National Science Foundation on the design of the program and in jointly soliciting proposals from academic researchers,” he said. “It’s hard to imagine that AI will not be integral to some of the new defensive capabilities that will emerge,” he continued.
But what about other areas, such as IoT? Tennenhouse spoke of the history and current state of IoT: “Although many enterprises have conducted pilots demonstrating the value that can be extracted from IoT-based data, there is little experience in operating IoT systems at scale and in a secure manner”.
“Although we are just as excited about the AI implications of IoT as everyone else, our initial focus is on a potential show-stopper that folks have been neglecting, i.e., securing and managing their IoT infrastructure. Our AirWatch team has been doing this for BYOD mobile devices for a number of years and our vROPS product has been instrumental in monitoring and optimising the operation of the large numbers of ‘things’ within data centres. As our customers take their IoT applications from trials to large-scale deployments, we are leveraging that expertise and will be there to help them secure and manage their IoT infrastructure,” he said.
We finished off our conversation with VMware’s chief research officer by getting into the subject of start-ups, and Tennenhouse detailed how he felt they would fit into the AI ecosystem and help customers to solve their problems. “I used to be a VC so I absolutely love start-ups and see them as complementary to our research. Since start-ups need to be very focussed, many of them are looking at only one or two parts of a larger problem, such as one specific security challenge. One of our roles as a research team is to identify the larger system architecture so that we can synthesise our own research with the work of start-ups and our ecosystem partners into comprehensive solutions that address the needs of our customers,” he finished.