By Jelani Harper, 3 April 2020
Machine learning is bringing sophisticated pattern recognition capabilities into the mainstream. Advanced machine learning deployments can detect non-linear patterns, patterns with a staggering number of variables, and patterns that unfold over periods of time too long for humans to discern.
When augmented with innovations in time series analysis, pattern recognition in fundamental enterprise use cases like fraud detection, security analytics, and churn reduction can be enhanced to deliver more detailed results.
Organizations can optimize their ability to perform time series analysis with a number of data management staples including data modeling, data visualizations, and knowledge graph technologies. This combination enables businesses to analyze their data so they can “go forwards and backwards in time,” according to Jans Aasman, CEO at Franz. “This is super powerful, as you can imagine. You can look at relationships in your data as they evolve over time.”
This temporal dimension is crucial for enhancing machine learning analytics that recognize fraud and cybersecurity attacks, among many other use cases.
The capacity of time series analysis to traverse datasets over time and visualize how people, events, and actions might have impacted business objectives is due to flexible data modeling. Standardized data models that naturally evolve to incorporate new data sources are vital to this task, especially when they utilize an event-based schema. This approach enables users to model virtually any occurrence relative to business objectives with a start time and end time.
When keeping track of various entities for fraud detection in financial services, for example, events—and the relationships between them—might include anything related to companies, people, and their actions. “You owned a company from a start time to an end time, or you worked for a company from a particular start time to an end time, or you were at a particular address, or had a telephone number, or had an email address with a particular start time and end time,” Aasman said. “We have relationships and every relationship has a beginning, and most of the time, an end.”
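The event-based schema described above can be illustrated with a minimal Python sketch. It is not the data model of any particular product; the `Relationship` class, its field names, and the sample records are all hypothetical, chosen only to show a relationship that carries a start time and an optional end time.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Relationship:
    """One event: subject --role--> target, valid from start to end (hypothetical schema)."""
    subject: str
    role: str
    target: str
    start: date
    end: Optional[date] = None  # None means the relationship has not ended yet

    def active_on(self, day: date) -> bool:
        # True if the relationship was in effect on the given day (inclusive bounds).
        return self.start <= day and (self.end is None or day <= self.end)

    def overlaps(self, other: "Relationship") -> bool:
        # Two intervals overlap if each one starts before the other one ends.
        self_end = self.end or date.max
        other_end = other.end or date.max
        return self.start <= other_end and other.start <= self_end


# Illustrative records only; the names and dates are made up.
employment = Relationship("Person A", "worked_for", "Company X",
                          date(2015, 1, 1), date(2018, 6, 30))
ownership = Relationship("Person A", "owned", "Company Y", date(2017, 3, 1))

print(employment.active_on(date(2016, 5, 1)))  # inside the employment interval
print(employment.overlaps(ownership))          # employment and ownership coincide in time
```

Leaving `end` empty for open-ended relationships mirrors the observation that every relationship has a beginning and, most of the time, an end.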
A popular approach to modeling data points with start times and end times is through a graph environment, primed for determining relationships between nodes. Graphs with standardized data models can align data of almost any variation in a way that lends itself to visualizations. This capability is integral in financial services because “if you are a tax organization, for example, you want to look at fraud,” Aasman said. “Every time you look at fraud, you’re looking at social networks of companies and people. You want to know what was the influencer network of a company, or the influencer network of a person.”
Moreover, graph technologies support analytics options and query types that are either impossible or too difficult to implement in other environments. For instance, graph approaches support the clustering techniques used in unsupervised learning. Competitive options in this space enable organizations to query data via graphs through an intuitive, visual interface that automates the requisite code. This enables organizations to ask questions specific to their business objectives.
In the aforementioned fraud detection use case, for example, users can simply manipulate the graph visually to determine “is there a person with this name that performs a role with respect to company C1, where company C1 performs a role of full stock owner of company C2 and company C2 plays a role as full stock owner of company C3,” Aasman explained.
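To make the shape of that question concrete, here is a rough sketch of the same pattern match in plain Python. It stands in for what a graph query language or visual query builder would generate and does not reflect any vendor's API; the edge tuples, the `full_stock_owner` role label, and the `find_ownership_chains` function are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical edges: (subject, role, target). All names are invented.
EDGES = [
    ("Person A", "director", "C1"),
    ("C1", "full_stock_owner", "C2"),
    ("C2", "full_stock_owner", "C3"),
    ("Person B", "employee", "C4"),
    ("C4", "full_stock_owner", "C5"),
]


def find_ownership_chains(edges, person):
    """Return (C1, C2, C3) triples where `person` has any role at C1,
    C1 fully owns C2, and C2 fully owns C3."""
    # Companies the person has some role with.
    linked = {target for subject, _role, target in edges if subject == person}

    # Index of full-ownership edges: owner -> set of owned companies.
    owns = defaultdict(set)
    for subject, role, target in edges:
        if role == "full_stock_owner":
            owns[subject].add(target)

    chains = []
    for c1 in sorted(linked):
        for c2 in sorted(owns.get(c1, ())):
            for c3 in sorted(owns.get(c2, ())):
                chains.append((c1, c2, c3))
    return chains


print(find_ownership_chains(EDGES, "Person A"))
print(find_ownership_chains(EDGES, "Person B"))
```

In this sample data, Person A reaches a full three-company chain, while Person B's chain stops after one ownership hop and so returns no match.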
The advantage of visualizations offered by progressive graph solutions is the capability to “go back and see how the graph got built up over time,” Aasman said. Without such visualizations, users would have to look at the individual dates attributed to nodes, a laborious and complicated process. The synthesis of event-based data modeling, knowledge graph technologies, and data visualizations now enables such undertakings to be completed with a click of the mouse. Organizations can simply select any node (a person or event) and see how it is related to all the others on the graph, and do so in sequential order to ascertain how these actions impacted others.
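The idea of stepping through a graph's history can be approximated in a few lines of Python, which is roughly what a time slider in a visual tool does behind the scenes. This is only a sketch under stated assumptions: the edge tuples, the `snapshot` and `build_up` helpers, and every name and date in the sample data are hypothetical.

```python
from datetime import date

# Hypothetical timed edges: (subject, role, target, start, end); end=None means ongoing.
TIMED_EDGES = [
    ("Person A", "owned", "C1", date(2012, 1, 1), date(2016, 12, 31)),
    ("C1", "full_stock_owner", "C2", date(2014, 6, 1), None),
    ("Person A", "lived_at", "Address 1", date(2015, 3, 1), date(2017, 3, 1)),
]


def snapshot(edges, day):
    """Return the (subject, role, target) edges in effect on `day`."""
    return [
        (subject, role, target)
        for subject, role, target, start, end in edges
        if start <= day and (end is None or day <= end)
    ]


def build_up(edges):
    """Yield (day, snapshot) pairs at every start date, in chronological order."""
    for day in sorted({edge[3] for edge in edges}):
        yield day, snapshot(edges, day)


# Replay how the graph grew, one start date at a time.
for day, active in build_up(TIMED_EDGES):
    print(day.isoformat(), len(active), "active relationships")

# Looking at a later date shows which relationships have since ended.
print(snapshot(TIMED_EDGES, date(2018, 1, 1)))
```

Going "forwards and backwards in time" then amounts to calling `snapshot` with a different date.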
This granular visibility can improve machine learning models used to solve common problems like fraud detection, security analytics, customer churn, and more. It provides those models with additional, detailed data informed by the added dimension of time, which is critical for determining cause and effect in most machine learning deployments.
Jelani Harper is an editorial consultant servicing the information technology market, specializing in data-driven applications focused on semantic technologies, data governance and analytics.