Squaring Sustainability With Data Availability in the Age of AI
Previously offline data can be used to fuel and train AI models but needs the right data storage technologies
As we come to grips with what AI technology means, one possibly shocking truth is emerging: The power of AI isn’t servers or even GPUs — it’s data. Data shapes algorithms that lead to breakthrough services and insights. Yet to gain value from data and take advantage of new opportunities offered by AI, the way you store data and make it available to your users must change. Data must be stored sustainably and economically but it also must always be easily available. A new multi-storage technology approach based on erasure-encoded object storage can reconcile these so-far contradictory goals.
The new possibilities of AI offer staggering opportunities and are fueling a new technology race in every industry. The previously dormant, offline data that many organizations have been storing, often for decades, can now be used to fuel and train AI models and spur new rounds of value extraction. However, for this to succeed, organizations need data storage technologies that are high-performing, sustainable and cost-effective, and that make data easily available for the unique needs of these new data workflows.
Tape Offers Advantages in Terms of Sustainability and Cost
In the past, it was assumed that most unstructured data would never be needed after its first use, or if it was, only very rarely. Accordingly, storing such seldom-used data in a long-term or “cold” archive made sense. The storage medium of choice for this was typically tape due to its low cost, durability and sustainability. What’s more, inexpensive tape sets could then be removed and stored off-site. However, this strategy is not feasible where archived data needs to be accessed and reused frequently. That is especially true when sifting through and organizing data for AI model training.
In the Age of AI, Cold Data Needs to Be Readily Available
AI models are trained on large amounts of unstructured data. The more data that can be fed into training and the better you can tag it and prepare it for training, the more incisive and useful the results will be. And this is no one-time proposition: successive training runs and the introduction of new, changed data can grow AI capability in leaps and bounds. Furthermore, organizations that want a differentiated advantage from their AI must rely on a data infrastructure that allows them to retain and use their own unique data to grow precise and powerful capabilities tailored to their competitive environment.
In this new vision of data workflows, it’s clear that legacy cold storage approaches and offline data will not work. To extract the most value from unstructured data, cold data must be as easily available as active data.
Object Storage on Tape Provides an Active Archive for Quick Access to Data
To square these seemingly contradictory goals of sustainability, low cost and exceptional data availability that can scale to enormous levels, a suitable storage strategy in the age of AI must offer several characteristics. It requires a solution that offers multiple storage technologies, first placing data into a high-performance tier, such as erasure-encoded flash or hard drives, where it can be accessed and worked on. The solution must then transparently move the data, within the same namespace, into a cold, highly economical and resilient tier.
An S3-compatible object storage system that combines both of these storage technologies (flash/HDD and tape) under erasure encoding can be the breakthrough solution: it offers high performance, scales to hyperscale levels as needed and keeps data easily available while allowing organizations to hit sustainability and storage cost targets. This type of solution gives organizations every benefit of building their own private cloud, which can be easily and seamlessly expanded up to exabytes and beyond.
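To make the single-namespace idea concrete, here is a minimal Python sketch of a two-tier object store. It is a toy in-memory model under stated assumptions, not any vendor's implementation: the `TieredStore` class, the `cold_after_seconds` threshold and the recall-on-read behavior are all hypothetical names and policies chosen for illustration. The point it shows is that an object keeps the same key whether it sits on the hot tier (flash/HDD) or the cold tier (tape), so applications never need to know where the data physically lives.

```python
from dataclasses import dataclass, field

HOT, COLD = "hot", "cold"  # hypothetical tier labels: flash/HDD vs. tape


@dataclass
class ObjectRecord:
    data: bytes
    tier: str = HOT
    last_access: float = 0.0


@dataclass
class TieredStore:
    """Toy model: one key namespace, two storage tiers."""
    cold_after_seconds: float  # assumed policy: idle time before demotion
    objects: dict = field(default_factory=dict)

    def put(self, key: str, data: bytes, now: float) -> None:
        # New data always lands on the high-performance tier first.
        self.objects[key] = ObjectRecord(data=data, tier=HOT, last_access=now)

    def get(self, key: str, now: float) -> bytes:
        # Same key regardless of tier; a read recalls the object to hot.
        rec = self.objects[key]
        rec.tier = HOT
        rec.last_access = now
        return rec.data

    def run_tiering(self, now: float) -> list:
        # Demote hot objects that have been idle past the threshold.
        moved = []
        for key, rec in self.objects.items():
            if rec.tier == HOT and now - rec.last_access >= self.cold_after_seconds:
                rec.tier = COLD
                moved.append(key)
        return moved

    def tier_of(self, key: str) -> str:
        return self.objects[key].tier


store = TieredStore(cold_after_seconds=100.0)
store.put("scans/run-001", b"raw data", now=0.0)
store.put("scans/run-002", b"raw data", now=90.0)
demoted = store.run_tiering(now=120.0)  # only run-001 has been idle long enough
```

In a real system the demotion policy, recall latency and erasure-coding layout would be determined by the storage product; the sketch only illustrates that tier placement can change underneath a stable object key.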
As every industry moves to build its AI data pipelines and infrastructure, it is clear that carefully organized and tagged data is a critical resource that can unlock tremendous new insights and capabilities not possible before — but these workflows must be optimized for sustainability and cost-effectiveness to be truly successful. For companies developing solutions in verticals such as life and earth sciences, media production, manufacturing or government, object storage that spans both ‘hot’ and ‘cold’ storage in a single solution delivers a sustainable, cost-effective, secure and robust way to achieve evolving data storage goals in the era of AI.