Big Data, Little Chef: Why it pays to understand your DataOps team
by Shivnath Babu, Unravel Data
31 March 2019
Business leaders rely on business intelligence supplied by their staff, typically data analysts and scientists, and on the technical teams who keep the analytical workflow flowing.
Given how central data is to the whole process, it is arguably no longer enough to leave this to the technical team without business leadership having some understanding of what DataOps entails. This only becomes more important as organizations increasingly rely on AI to make recommendations that must be sound and safe for society.
Chefs, Refrigerators and Ovens
Before breaking down how an organization's leadership can create a culture that turns data insight into business intelligence, it is instructive to look at how this ecosystem is structured overall.
One way to simplify it is to divide it into three components: the chef, the refrigerator, and the oven. In this classification, the chef is responsible for design and metadata management. They are the mind governing the kitchen, deciding what food is bought and how it should be delivered. Crucially, the chef is usually not responsible for the actual cooking. Bringing the metaphor back to the world of data, the chef is the one making design-time decisions and managing what happens in the kitchen.
The next component of the big data kitchen is the refrigerator, where food is stored. Fridges are designed so that each compartment is optimized for the food that will be stored there: fruit, meat, liquids, and so on. As we have already outlined, the chef is the brains of the kitchen, but they cannot personally store the food. Instead, they choose how the food will be stored and ensure that the right infrastructure is in place to store it. The chef should not be redesigning the refrigerator every day, though; it should be designed carefully once and be ready for a broad range of scenarios. The job of the fridge is comparable to the data storage layer in our data ecosystem: keep the data safe and optimized for access when needed.
Finally, we have the oven. In our kitchen metaphor, the oven deals with access and processing. Once food has been successfully stored in the fridge, it is time for the oven. The oven is the tool in which food from the fridge is processed into quality meals, producing value. In other words, the oven represents the processing layer of the data ecosystem: SQL engines; Extract, Transform, and Load (ETL) tools; and schedulers.
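The fridge/oven split can be sketched in a few lines of code. This is a purely illustrative toy in Python, assuming a list of dictionaries standing in for the storage layer and a single function standing in for an ETL-style processing step; none of the names here belong to any real platform.

```python
# A minimal sketch of the fridge/oven split, using plain Python
# structures in place of a real storage and processing layer.
# All names are illustrative, not a real API.

# The "refrigerator": raw records kept organized for later access.
fridge = [
    {"item": "sale", "region": "EU", "amount": 120.0},
    {"item": "sale", "region": "US", "amount": 80.0},
    {"item": "refund", "region": "EU", "amount": -20.0},
]

def oven_aggregate(records, region):
    """The "oven": an ETL-style step that extracts records for one
    region, transforms them into a total, and returns a loadable result."""
    extracted = [r for r in records if r["region"] == region]   # Extract
    total = sum(r["amount"] for r in extracted)                 # Transform
    return {"region": region, "total": total}                   # Load-ready output

print(oven_aggregate(fridge, "EU"))  # {'region': 'EU', 'total': 100.0}
```

The point of the split is the same as in the kitchen: the fridge is designed once and stays stable, while many different ovens (queries, ETL jobs, scheduled pipelines) cook from it.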
Building the Kitchen
So with this in mind, how can senior decision-makers support the big data kitchen? By making sense of the chef's role, the demands of the job, and the biggest bottlenecks, we can arm chefs with the best possible tools to support their work. For instance, design time and metadata management are all the rage in the data ecosystem world for a simple reason: they reduce time to value. In the context of our kitchen analogy, if we provide the chef with the storage space they need to manage where everything is in the kitchen, they can cook faster and keep the space cleaner. What we are seeing recently, however, is that AI can also offer support. By intelligently recognizing where bottlenecks are forming in the data process, it can recommend ways to remediate them faster. In some cases, automated actions can resolve issues before the business has even identified them. To see what this looks like in practice, let us look at a few of the chef's areas of responsibility.
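The simplest form of bottleneck recognition is statistical: flag any pipeline stage whose runtime is far out of line with the others. The sketch below is a hypothetical illustration of that idea, assuming a dictionary of stage runtimes in seconds; real AI-driven tooling would of course use far richer signals.

```python
# Hypothetical sketch of bottleneck detection: flag pipeline stages
# whose runtime is far above the pipeline's typical stage time.
from statistics import median

def find_bottlenecks(stage_runtimes, factor=3.0):
    """Return stages running more than `factor` times the median stage
    time -- the kind of signal a remediation recommendation could
    be attached to."""
    typical = median(stage_runtimes.values())
    return [name for name, secs in stage_runtimes.items()
            if secs > factor * typical]

runs = {"ingest": 40, "clean": 55, "join": 400, "report": 60}
print(find_bottlenecks(runs))  # ['join']
```

Once a stage like the join above is flagged, a recommendation (repartition the data, add an index, resize the cluster) can be surfaced to the team, or in some cases applied automatically.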
Creating and managing datasets/tables - The chef is responsible for defining fields, partitioning rules, indexes, and the like. Normally this takes the form of a declarative way to define, tag, label, and describe datasets.
Discovering datasets/tables - For datasets that enter your data ecosystem without being declaratively defined, someone needs to determine what they are and how they fit in with the rest of the ecosystem. This is normally called scraping or crawling the data ecosystem to find signs of new datasets.
Auditing - The chef is also responsible for finding out how data entered the ecosystem, how it was accessed, and which datasets served as sources for newer ones. In short, auditing is the story of how data came to be and how it is used.
Security - Defining security normally sits at the chef's level of control, but it is typically implemented in either the refrigerator or the oven. The chef must not only set and control the security rules, but also have full visibility into the permissions that have been granted.
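The first three responsibilities above can be sketched as a toy "chef's catalog". This is an illustrative assumption in Python, not modeled on any real metadata platform: a dictionary plays the catalog, a list plays the audit trail.

```python
# A toy "chef's catalog" illustrating the responsibilities listed above:
# declaring datasets, discovering undeclared ones, and keeping an audit
# trail. Purely illustrative; every name here is hypothetical.

catalog = {}      # declared datasets, keyed by name
audit_log = []    # the story of how each dataset came to be known

def declare_dataset(name, fields, tags=()):
    """Create/manage: declaratively define a dataset with its fields and tags."""
    catalog[name] = {"fields": list(fields), "tags": list(tags)}
    audit_log.append(("declared", name))

def discover_datasets(seen_names):
    """Discover: report datasets seen in the ecosystem but never declared."""
    new = [n for n in seen_names if n not in catalog]
    for n in new:
        audit_log.append(("discovered", n))
    return new

declare_dataset("orders", ["id", "region", "amount"], tags=["sales"])
print(discover_datasets(["orders", "clickstream"]))  # ['clickstream']
```

Even in this toy form, the division of labor is visible: declaration and discovery both feed the same audit trail, which is what lets the chef later reconstruct how any dataset entered the kitchen.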
With this in mind, what kind of kitchen should business leaders be building for their chefs? A good place to start is ensuring that the chef knows what the end product looks like. However, this can be difficult on a shared platform, where different organizations often have competing objectives; the chef may end up creating a product that satisfies no one. Equally, auditing workloads can sometimes generate rules that themselves act as bottlenecks.
To navigate these obstacles, and give chefs the tools they need to do their best work, top-tier decision-makers need to set clear end goals and listen to their teams' feedback. What is working and what is not? What tools do they need? AI can go a long way toward answering these questions, but it is still the role of the human leader to put its recommendations into practice. In summary, the best way to support your team is to understand their objectives and how they go about reaching them. In the long run, the attention you pay them will itself pay dividends.
Shivnath Babu is co-founder and CTO of Unravel Data, the company developing a data operations platform that leverages AI, machine learning and advanced analytics.