By Jelani Harper
Contrary to popular belief, cloud computing is not a panacea for contemporary data management. Cloud success hinges on how deployments are implemented. In this respect, the cloud contains as many potential dangers as it does advantages—if not more.
Many users are aware of the benefits of public cloud computing providers such as Microsoft Azure or Amazon Web Services, which routinely offer low pricing, global accessibility, and well-funded, layered security.
Far fewer, however, are cognizant of the perils of proprietary cloud deployments and their penchant for vendor lock-in. Consequently, proprietary cloud users—as well as those deploying certain third-party cloud management solutions in public clouds—may frequently encounter setbacks with regulatory compliance, switching providers, and legal issues.
According to Archive360 Marketing VP Bill Tolson, “Many companies don’t take these questions into account and end up getting themselves into trouble that they weren’t aware of. But with U.S. courts especially, or even regulatory agencies, the assumption is whatever you did wrong, you did on purpose.”
Depending on which cloud services they choose, organizations may be liable for substantial costs associated with data reconversion, egress throttling, and metadata management, along with the related legal or regulatory consequences. Successfully navigating these concerns, however, helps maximize the value of cloud deployments.
The foremost danger associated with proprietary cloud vendors is the conversion of an organization’s data into a proprietary format that often has meaning only within that provider’s setting. Unless users carefully read their Service Level Agreements, they may not even know their data will be converted into a different format. “Their excuse is, oh, it helps us store it,” Tolson said.
An alternative to converting data into another format is for vendors to containerize it. In both instances the data is no longer native to the organization that migrated it there. “They’ll create these containers with a bunch of files in it,” Tolson remarked. “But, the container itself is a file. So, they can give you that converted data but you won’t be able to use it unless it’s reconverted.” Thus, if organizations want to move that data to a different provider, they’re stuck paying pricey re-conversion fees that function as deterrents.
Another method for limiting the portability of data once it’s in a proprietary cloud is what Tolson termed “throttling.” Throttling may deter organizations from switching cloud providers, or serve as a means for providers to prolong service charges. “Throttling means they will dial down the amount, speed-wise, of the data that you can move out of the archive,” Tolson explained. “So normally, you could be moving 5, 10 terabytes per day out. What the vendor will do is maybe limit the output to 100 gigabytes per day so that it dramatically extends how long you have to remain in their cloud.”
Since providers charge users according to how long their data is in the cloud, the lengthy withdrawal period benefits the former as opposed to the latter. “We’ve had customers tell us it’s going to take us a year and a half to get the data out because they’re throttling it down to 100 gigabytes per day and [they] have three petabytes,” Tolson recounted.
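The effect Tolson describes is simple arithmetic. As a rough sketch (the function and the 100-terabyte archive size are illustrative assumptions; the daily rates come from the article's example):

```python
# Back-of-the-envelope estimate of how egress throttling extends a cloud exit.
# Rates (5 TB/day unthrottled vs. 100 GB/day throttled) are from the article's
# example; the 100 TB archive size is a hypothetical figure for illustration.

def egress_days(total_gb: float, rate_gb_per_day: float) -> float:
    """Days needed to move total_gb out of an archive at a fixed daily rate."""
    return total_gb / rate_gb_per_day

archive_gb = 100_000  # hypothetical 100 TB archive

unthrottled = egress_days(archive_gb, 5_000)  # 5 TB/day -> 20 days
throttled = egress_days(archive_gb, 100)      # 100 GB/day -> 1,000 days

print(f"Unthrottled: {unthrottled:.0f} days; throttled: {throttled:.0f} days")
```

Since storage is billed for the entire withdrawal period, every extra day of egress is another day of charges, which is precisely why the throttled rate benefits the provider.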
Although the tandem of data conversion and throttling may hinder organizations attempting to switch cloud providers, some of the metadata ramifications may well spur them to do so. According to Tolson, some vendors may claim that an organization’s metadata is their own, simply because it’s in their proprietary format: “They can make copies and do what they want with that metadata, especially if you don’t know it.”
In other instances, the conversion of data into a proprietary format can cause organizations to lose valuable metadata. Perhaps the most pressing metadata concern relates to replicating it without the customer’s consent. Tolson mentioned he’s heard stories where “one of the cloud third party vendors was migrating customer data into their cloud and actually taking the associated metadata for their own use, marketing use and those sorts of things, and actually putting it into another cloud.”
Regulatory and Legal Adherence
Obviously, the indiscriminate dissemination of metadata has very real, undesirable regulatory and legal consequences. It multiplies organizations’ exposure under the General Data Protection Regulation (GDPR) and other regulations. For GDPR purposes, it makes the third-party cloud vendor not just a data processor, but also a data controller. Moreover, the conversion of data into proprietary formats, or the omission of metadata when doing so, may also complicate any legal issues organizations have regarding their data.
“If companies are moving data into a third party cloud and metadata is being affected, in certain circumstances that could be destruction of evidence,” Tolson maintained. “In U.S. courts especially, when the opposing counsel asks for data through e-discovery, the requirement is whatever format that data is stored in when that request is made, it has to be turned in in that format.” Thus, missing metadata or new file formats, let alone an inability to readily retrieve such data from the cloud because of throttling, may be deemed destruction of evidence, leaving organizations legally liable. “The judge slaps an adverse inference on you and all of a sudden you’ve just lost a case because of data storage,” Tolson said.
The pricing benefits of public cloud providers aren’t always available in proprietary or third-party clouds. According to Tolson, the half a cent to three cents per gigabyte, per month users are charged for Azure may become “dollars per month per gigabyte” with proprietary cloud vendors. Moreover, these vendors may charge organizations according to the number of users per month, in addition to the storage of data, which increases costs. Conversely, the cheap storage of public clouds is often more cost-effective overall, particularly given the geographic replication organizations may need.
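The gap those rates imply is worth making concrete. As a hedged sketch (the archive size and the one-dollar proprietary rate are illustrative assumptions; the public-cloud range follows the article's figures, and real pricing varies by tier and region):

```python
# Rough monthly storage-cost comparison using the article's per-gigabyte
# figures. The 50 TB archive size and the $1/GB proprietary rate are
# hypothetical; actual provider pricing varies by tier and region.

def monthly_cost(gb: float, rate_per_gb: float, replicas: int = 1) -> float:
    """Monthly storage cost for gb of data kept in `replicas` regions."""
    return gb * rate_per_gb * replicas

archive_gb = 50_000  # hypothetical 50 TB archive

public_low = monthly_cost(archive_gb, 0.005)   # half a cent per GB
public_high = monthly_cost(archive_gb, 0.03)   # three cents per GB
proprietary = monthly_cost(archive_gb, 1.00)   # "dollars per gigabyte", low end

print(f"Public cloud: ${public_low:,.0f}-${public_high:,.0f}/month; "
      f"proprietary: ${proprietary:,.0f}/month")
```

Even with a second geographic replica doubling the public-cloud figure, the proprietary rate remains an order of magnitude higher in this scenario.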
The implementation of cloud computing deployments makes all the difference in the underlying value they deliver to the enterprise. Organizations selecting this option must consider the sundry factors associated with pricing, metadata management, file conversions, throttling, vendor lock-in, and regulatory or legal responsibilities. Most of all, it’s necessary to consider these factors in relation to the type of cloud chosen.
Jelani Harper is an editorial consultant servicing the information technology market, specializing in data-driven applications focused on semantic technologies, data governance and analytics.