The Problem With Data Science
November 2, 2017
In a year that has seen more data created than in the rest of human history, enterprises face a mind-boggling number of challenges they must contend with in order to remain competitive. Although machine learning and AI technologies promise a solution, it's unclear where they can fit within the conventional approach to data science. All this could be about to change, however, thanks to new methodologies which automate the difficult, complex decisions that have traditionally slowed down enterprise decisionmaking.
TheDataTeam are a data strategy company with bases in both Singapore and India. Partnered with big names like Microsoft and Google, they believe they have the answer: a new, innovative form of data science that makes insights accessible, actionable, and most importantly, automated. RoboticDataScience aims to use AI to enable businesses and enterprises to not only place their data science strategy at the heart of their business plan, but use it as a basis on which the company can remain lean, flexible, and agile.
Rangarajan Vasudevan is Founder & Principal of TheDataTeam, which provides expert consulting to companies on using big data and data science to grow their businesses through its pioneering methodology, RoboticDataScience. He is a big data veteran with extensive experience in the design and implementation of data-driven strategies across industries in various geographies. We were lucky enough to catch up with him to find out more about the flaws in the conventional data science approach and the obstacles facing it; the challenges facing enterprises in the next two to five years; and the new paradigm promised by RoboticDataScience.
Flaws And Challenges In Conventional Data Science Approaches
The conventional enterprise approach to data is outdated, argues Ranga: "Traditionally, data has been modeled to form a coherent view of the business. Insights are then extracted from this model by analysts (or data scientists) using specialized technologies, that are then separately interpreted for the business."
However, a fundamental problem with this approach arises, he explains, from the reliance of enterprises on scarce talent, which generates the need for bloated, inefficient task delegation. "Analysts with domain knowledge and deep technical expertise are a rare breed, so engineers are required to process and prepare the data for the analysts to work with. Decisionmakers then typically require an additional layer of translation for the analysts' outputs to make sense, and here, domain experts have to get involved," Ranga says. "The decisions they make trigger changes to business processes, which produce new data - and so, the cycle begins again."
Ranga believes that this approach has a number of significant drawbacks. As it requires anywhere from 4 to 7 teams to produce actionable insights in this way, it's an especially inefficient methodology - one which encourages silos of people and technology.
The efficiencies that a smart data strategy is supposed to generate fall into the same old bureaucratic traps. This, he argues, makes governance "a nightmare". "Each team has its own rhythm and process of getting things done, which results in a lack of agility," he says. "Each team works towards more specialization in its area of responsibility, at times procuring for itself the best technology it can lay its hands on. Within no time, every team has its own repository, analysis tool, and IT to manage the whole shebang. Most importantly, people do not work closely with each other - neither sharing knowledge nor cross-pollinating ideas."
Furthermore, this methodology reacts to change poorly, leaving the enterprise vulnerable to external changes - the very forces which ultimately shape businesses. "While this is eventually reflected in the data that is collected, the cycle is too slow to allow the optimal decision to be taken quickly. As analysts and data scientists quit to form new companies or join the next big VC-backed startup, there is also knowledge attrition." This, he argues, makes hiring and training new talent a challenge - reproducing the original problem.
Faced with these limitations, an enterprise tends to act tactically - a "dangerous thing to do", according to Ranga. "Business and IT point fingers at each other. IT tends to rationalize the silos, and mistakenly views the root cause as a 'data integration' problem to be fixed with the help of an integrated data platform," Ranga explains. "Such a view does not fix the people and process side of the problem, which the enterprise tends to realize after much expenditure." As a result, value creation for the enterprise becomes sidelined.
The missing link, Ranga argues, is the realization that business strategy must inform data strategy. "A data strategy helps the enterprise to build on its competitive advantage and to achieve its overall objectives," he explains. "So, collecting and storing data is not enough on its own. Procuring a data lake is not the end goal. Moving to the cloud does not necessarily bring more agility. Adopting AI might not even be the optimal thing to do within a line of business. The entire mechanism of how decisions are made for the business needs to be looked at holistically and re-thought end-to-end, instead of in a piecemeal fashion."
RoboticDataScience: A New Methodology for Data-Driven Decisionmaking
RoboticDataScience, Ranga claims, is a methodology for achieving just that. Billed as a means of automating data-driven decisionmaking, RDS "focuses on creating and contributing to business value in a timely fashion by adopting a domain-centric approach and using automation where possible," he explains.
"An enterprise should look inward to its own data and domain, adopt the innovations that are open and available, customize them for its own purposes, and change its business processes to benefit from automation and AI - and so avoid being burnt by the bleeding edge."
But what does this look like in practice? An enterprise using this methodology would begin by modelling the problem domain, rather than just the data in question. This includes factors such as business intuition, institutional knowledge, feedback from stakeholders, decisions to be taken, and actions to be performed.
"The domain model becomes the 'prized asset' from which every employee draws his / her work from, and contributes back to. Having thus modelled the domain, the challenging steps of preparing data and interpreting insights, which are usually very specific to an enterprise, is then automated."
It is these automation capabilities, Ranga argues, which break down barriers between personnel and reduce the latency associated with decisionmaking. This enables the enterprise to react faster to change and become more resilient to skills and knowledge attrition.
Enterprise Priorities For Leveraging Data
With data volumes exploding and a dizzying array of new technologies emerging every day to take advantage of them, there are big challenges ahead for enterprises. "Technologies are changing even more rapidly than before, with new innovations being released in the open that encapsulate unimaginable amounts of complexity in easy-to-consume APIs." Ranga believes that the only way an enterprise is going to benefit from these developments is to "automate fast and stay nimble, so as to adapt to change quickly."
"AI and cloud paradigms have played and will continue to play a crucial role, no doubt," he says. "At the same time, automation and AI need not trigger philosophical debates. There is never going to be enough analysts or data scientists to manually examine all this data to make sense of it. Modern technology platforms boasting of scalability and highly decoupled architectures help with only one part of the decisionmaking process. AI need not be merely the forte of the Internet-era behemoths, but can be grounded pragmatically in traditional enterprises too where data has always been collected."
You can read more about TheDataTeam's work here.