Is Deep Learning too superficial?

Ciarán Daly

September 27, 2018


by Jans Aasman

OAKLAND, CA - Deep learning has been broadly acclaimed for its advanced pattern recognition abilities, which are primed for working with data at scale. It can detect non-linear patterns across quantities of variables that would be difficult, if not impossible, for humans to analyze. Deep Learning’s strengths are recognizing patterns and translating them into predictions with high accuracy rates.

The Deep Learning Dilemma

Critiques of Deep Learning take issue with how its achievements are produced. Deep Learning needs an inordinate amount of training data to generate reliable results. Moreover, a significant portion of that training data must be annotated with the labeled outputs the model is meant to predict. Although there are methods for reducing Deep Learning’s data requirements (such as transfer learning), there are a number of tasks for which such massive amounts of labeled training data just aren’t available.

An even greater limitation is that Deep Learning models don’t necessarily understand what they’re analyzing. They can recognize patterns ideal for Natural Language Processing or image recognition systems, but are somewhat restricted in their ability to understand a pattern’s significance and, to a lesser extent, to draw inferences from it.

In a widely read article published early this year on arXiv.org, a site for scientific papers, Gary Marcus, a professor at New York University, posed the question: “Is Deep Learning approaching a wall?” He wrote, “As is so often the case, the patterns extracted by deep learning are more superficial than they initially appear.”

One of the fathers of cognitive science, Noam Chomsky, while speaking at an MIT symposium on Brains, Minds and Machines, critiqued the field of AI for its heavy use of statistical techniques to pick out regularities in masses of data. Chomsky’s view is that using statistical learning techniques to better mine and predict data is unlikely to yield general principles about the nature of intelligent beings or about cognition.

Augmenting Deep Learning

Despite the media’s current preoccupation with Deep Learning, machine intelligence has always involved a multiplicity of technologies and techniques. Amalgamating approaches both old and new, celebrated and less so, produces the greatest business impact. A mix of machine learning (statistical and pattern matching), Behaviorism (stimulus response), and rules-based analytics (symbolic approaches) offers the most robust method for analyzing complex data.

For example, many organizations are aware of Deep Learning’s propensity for sophisticated pattern recognition, which some consider the summit of Artificial Intelligence. Far fewer are familiar with the learning capabilities of Prolog, a coding language from the latter part of the 20th century with its own capacity for machine intelligence.

By aligning these techniques in a semantic standards-based environment, organizations can counteract their weaknesses, augment their strengths, and solve complex data analytics challenges that would otherwise elude them, or require more resources than they have for sustainable value.

The Prolog Parallel

The intelligent inferences of Prolog directly address this issue, as systems leveraging this language specialize in the more cognitive tasks associated with reasoning and understanding. Deep Learning is a statistical methodology for clearly ordered tasks, whereas Prolog is designed to understand and create inferences from the data it analyzes. Thus, Prolog systems are well suited for less defined jobs that may require modest judgment calls or outputs that aren’t necessarily categorical. Prolog also requires only a modicum of the training data Deep Learning’s neural networks need to address a specific use case.
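
To make that contrast concrete, the sketch below shows the kind of inference a Prolog system performs: a handful of facts plus a couple of rules are enough for the engine to reach a conclusion that was never stated explicitly, with no training data at all. The company names and predicates (subsidiary_of/2, reports_risk/2, and so on) are invented purely for illustration.

```prolog
% Hypothetical facts a system might extract from corporate filings.
subsidiary_of(acme_gmbh, acme_corp).
subsidiary_of(acme_corp, global_holdings).
reports_risk(acme_gmbh, currency_risk).

% Control is transitive: a parent also controls its subsidiaries' subsidiaries.
controls(Parent, Sub) :- subsidiary_of(Sub, Parent).
controls(Parent, Sub) :- subsidiary_of(Mid, Parent), controls(Mid, Sub).

% A company is exposed to any risk reported by a company it controls.
exposed_to(Company, Risk) :- controls(Company, Sub), reports_risk(Sub, Risk).

% Example query in SWI-Prolog:
% ?- exposed_to(global_holdings, R).
% R = currency_risk.
```

Nothing here was learned from labeled examples; the domain knowledge is encoded directly in the rules, which is precisely why so little data is needed.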

The difference in these two approaches is fairly profound. Both Deep Learning and Prolog-based systems can be trained to read financial reports, for example. Deep Learning will require much more training data and will only pick up the semantics of individual words, while Prolog will need substantially less data to understand the document’s conceptual ideas. There’s also a rapidity to Prolog that perhaps begins with its smaller training data sets and extends to its analytic deployments.

For instance, Prolog systems can quickly analyze data at scale for stock market or commodities trading, then highlight concepts for analysts to focus on, greatly decreasing the time analysts would’ve spent reading these materials themselves. By working quickly and indefatigably, these systems cover much more content than humans could.

A Paired Approach

Deep Learning and Prolog take distinctive approaches to solving business problems. In linked data settings empowered by semantic technologies, their synthesis pairs sophisticated pattern recognition capabilities with an innate understanding of business concepts. The potential for exploiting this duality is remarkable, particularly when leveraged for database-level inferences in a triple store. Deep Learning algorithms can draw parallels between different types, structures, and sources of data for a contextualized understanding of relationships prior to performing analytics for a specific use case, such as clinical trials in the pharmaceutical industry.

Deep Learning could identify which data is relevant for specific trials, as well as what features of that data make it so. Prolog mechanisms could then make intelligent inferences about how that data applies to a specific pharmaceutical, or to others sharing similar attributes. By actually understanding the significance of the data in relation to desired business objectives, Prolog rules could indicate which elements to focus on for a specific trial. The combination of these approaches in a scalable data store delivers granular knowledge of how data elements relate to one another and to domain goals.
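
A minimal sketch of how that pairing might work, under the assumption that a Deep Learning model has already scored candidate data sets and its output has been asserted as plain facts alongside curated domain knowledge; every predicate, score, and identifier below (relevance_score/3, similar_compound/2, trial_17, and so on) is hypothetical rather than part of any particular product:

```prolog
% Hypothetical facts asserted from a Deep Learning model's relevance scores.
relevance_score(dataset_a, compound_x, 0.92).
relevance_score(dataset_b, compound_x, 0.41).
relevance_score(dataset_c, compound_y, 0.88).

% Hypothetical domain knowledge curated in the knowledge base.
similar_compound(compound_x, compound_y).
trial_targets(trial_17, compound_x).

% A data set is a candidate for a trial if it scores highly for the trial's
% compound, or for a compound known to be similar to it.
candidate_data(Trial, Dataset) :-
    trial_targets(Trial, Compound),
    relevance_score(Dataset, Compound, Score),
    Score > 0.8.
candidate_data(Trial, Dataset) :-
    trial_targets(Trial, Compound),
    similar_compound(Compound, Similar),
    relevance_score(Dataset, Similar, Score),
    Score > 0.8.

% ?- candidate_data(trial_17, D).
% D = dataset_a ;
% D = dataset_c.
```

The statistical model supplies the scores, while the rule layer supplies the domain judgment about what those scores mean for a given trial.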

Composite Learning

Deep Learning, Prolog, and other AI tools should always be viewed in relation to how they impact one another. There will never be a single method that delivers the cognitive capabilities which maximize the yield of data practices. Instead, the composite of these approaches redresses their limits and extends their benefits for a more flexible, comprehensive understanding of data’s relation to tasks. By combining Prolog with Deep Learning, those tasks can be either loosely or specifically defined.


Dr. Jans Aasman is a Ph.D. psychologist and expert in cognitive science as well as CEO of Franz Inc., an early innovator in artificial intelligence and provider of semantic graph databases and analytics.
