Financial services remain a hotbed for AI interest and investment, not least because of the sheer scale and complexity of the datasets the sector handles. As our recent piece with PwC demonstrated, financial organisations are fundamentally data businesses – from compliance processes to tackling skills and expertise shortages, working effectively with data is unavoidable.
Of course, one of the greatest challenges is infrastructure – these organisations need a means of managing and storing all that data, making hardware the bedrock of any successful data strategy. “Today’s networked world is creating incredible amounts of data at an ever-increasing rate. The challenge for the financial services industry is in how they can unlock the potential value in the data, making use of insight rather than intuition in decision-making to reduce costs and drive revenue generation,” says Alex McMullan, EMEA CTO of Pure Storage, a company providing businesses with a comprehensive, cloud-connected, end-to-end data platform that aims to drive business and IT transformation.
We caught up with Alex shortly after Pure Storage received the AIconics award for Best Innovation in AI Hardware to discuss the hardware and infrastructure challenges facing firms looking to implement AI in the finance space, as well as the necessity of cloud data platforms for the successful deployment of AI.
AI Will Make A Big Impact On Finance
One area of finance in which AI is already making an impact is quantitative investing. For example, Man AHL, a London-based pioneer in the field of systematic quantitative investing, uses Apache Spark and Pure Storage’s FlashBlade to create and execute computer models that make investment decisions. “Roughly 50 quantitative researchers and more than 60 technologists collaborated to formulate, develop, and drive new investment models and strategies that can be executed by computer,” McMullan explains. “They adopted FlashBlade to deliver the massive storage throughput and scalability required to meet the firm’s most demanding simulation applications. Improvements of up to 20x in throughput for Spark workloads have been possible, allowing the firm to gain a substantial time-to-market advantage.”
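Man AHL’s actual models are of course proprietary, but the kind of rule-based strategy a systematic research team might backtest can be sketched in a few lines. Everything below is a hypothetical illustration – the moving-average crossover rule, the price series, and all function names are assumptions, and a production system would run logic like this over Spark-scale data rather than a short Python list.

```python
# Illustrative sketch of a systematic trading signal of the kind a
# quantitative research team might backtest at scale. The prices and
# the crossover rule are hypothetical, not Man AHL's actual models.

def moving_average(prices, window):
    """Trailing moving average over the last `window` prices."""
    return [
        sum(prices[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(prices))
    ]

def crossover_signals(prices, short=3, long=5):
    """Emit +1 (buy) when the short MA crosses above the long MA,
    -1 (sell) when it crosses below, and 0 otherwise."""
    short_ma = moving_average(prices, short)[long - short:]  # align lengths
    long_ma = moving_average(prices, long)
    signals = [0]  # no signal on the first aligned observation
    for prev_s, prev_l, cur_s, cur_l in zip(short_ma, long_ma,
                                            short_ma[1:], long_ma[1:]):
        if prev_s <= prev_l and cur_s > cur_l:
            signals.append(1)
        elif prev_s >= prev_l and cur_s < cur_l:
            signals.append(-1)
        else:
            signals.append(0)
    return signals

prices = [100, 101, 103, 102, 105, 107, 106, 104, 103, 101]
print(crossover_signals(prices))  # one entry per day once both MAs exist
```

The appeal of the platform described above is that thousands of variations of a rule like this can be simulated in parallel, which is exactly the workload that stresses storage throughput.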
This points to a broader capability McMullan highlights: AI will let financial organisations carry out the micro-analysis they need at a far higher rate, making it an invaluable asset. “It will allow financial organisations to tap into the veritable goldmine of data they are currently doing nothing with, as well as enhancing customer security and reducing fraud by quickly finding outliers and unexpected patterns of use and expenditure.”
“By enabling the automated analysis of data, and using it to create consistent and accurate models and predictions, companies will be able to anticipate changes in the market in better ways than ever before, allowing them to operate more efficiently and ultimately, increase profit.”
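The fraud-screening idea McMullan mentions – quickly finding outliers in patterns of expenditure – can be illustrated with a toy z-score check. The transaction amounts and the threshold below are invented for illustration; real fraud systems use far richer features and models than a single statistical rule.

```python
# Minimal sketch of outlier detection for fraud screening: flag
# transactions far from a customer's typical spend. The data and
# threshold are illustrative assumptions, not a production rule.
from statistics import mean, stdev

def flag_outliers(amounts, z_threshold=3.0):
    """Return indices of transactions whose z-score exceeds the threshold."""
    if len(amounts) < 2:
        return []
    mu, sigma = mean(amounts), stdev(amounts)
    if sigma == 0:
        return []  # no variation, nothing stands out
    return [i for i, a in enumerate(amounts)
            if abs(a - mu) / sigma > z_threshold]

history = [42.0, 38.5, 45.0, 40.2, 39.9, 41.1, 43.7, 950.0]
print(flag_outliers(history, z_threshold=2.0))  # flags the 950.0 payment
```

The point of the sketch is the shape of the problem, not the method: spotting one anomalous payment among millions in real time is what drives the data-throughput requirements discussed below.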
Big Challenges Ahead For Successful AI Data Strategy
The financial sector is witnessing an AI arms race right now, as firms rush to implement effective AI data pipeline strategies. However, they face a number of big challenges along the way. “AI has huge potential to help banks make this transition and thrive, but building an AI data pipeline isn’t easy,” McMullan explains. He argues that two of the biggest obstacles traditional financial services providers must contend with are the need to digitise existing business processes and the need to deliver secure personal data instantly to any type of mobile device, anywhere.
These are compounded by the varying demands that each stage of the pipeline places on the underlying storage architecture. These disparate requirements call for consistent, scalable performance – something that legacy storage systems, McMullan says, simply cannot deliver. “For legacy storage systems, this is an impossible design point to meet, forcing the data architects to introduce complexity that just slows down the pace of development.”
Hardware Is The Key – Cloud Storage Could Be The Answer
“To innovate and improve AI algorithms, storage must deliver uncompromised performance for all manner of access patterns – from small to large files, from random to sequential access patterns, from low to high concurrency, and with the ability to easily scale linearly and non-disruptively to grow capacity and performance.”
Another pitfall for financial organisations comes in the form of data storage bottlenecks. “Access to data is critical for those in the financial services industry but, as data sets grow larger and more complex, the ways in which companies choose to handle and utilise that data are becoming more and more important,” McMullan explains.
These changes to data, he argues, generate a pressing need for companies to update their approach to hardware. “Data is no longer larger, sequential, batched and fixed,” he explains. “Rather, it is poly-structured from multiple disparate sources so companies cannot continue to rely on legacy storage solutions which were designed to solve 1990s business problems.”
“High-performance, parallel, silicon-based storage is the way forward in terms of making the most out of this, as any bottleneck in data access will only lead to lost time and money.” One example is companies missing out on optimal currency exchange rates, which continue to represent a significant risk for any business with international clients or supply chains. McMullan argues that by being faster, more adaptive, and more versatile, “flash data storage not only reduces delay between the data storage and the individual using the data, but in most cases will eliminate a bottleneck altogether.”
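As a rough illustration of how such a bottleneck might be spotted in practice, the snippet below times a sequential read and reports throughput. The file size and temporary path are assumptions for the sketch; a serious storage benchmark would also defeat OS page caching and test the random and concurrent access patterns mentioned earlier, which this does not.

```python
# Rough sketch of measuring sequential read throughput on a storage
# path, to spot the kind of data-access bottleneck discussed above.
# Illustrative only: real benchmarks bypass the OS page cache.
import os
import tempfile
import time

def read_throughput_mb_s(path, chunk_size=1 << 20):
    """Sequentially read `path` in 1 MiB chunks; return MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return (total / (1024 * 1024)) / elapsed

# Create a small scratch file and measure reading it back.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(8 * 1024 * 1024))  # 8 MiB of random bytes
rate = read_throughput_mb_s(tmp.name)
print(f"{rate:.1f} MB/s")
os.unlink(tmp.name)
```

Running a probe like this against different tiers of storage makes the gap between legacy disk and parallel flash concrete in a single number.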
The answer lies in the cloud – but, unless planned for and managed properly, the cloud could only be a partial answer. “Naturally, using cloud infrastructure means that you’re not dealing with legacy systems. However, using cloud doesn’t automatically mean that your AI data pipeline will perform optimally,” says McMullan. “Those in financial services, much like in other industries, need to ensure that they carefully consider the different options available as many cloud infrastructure offerings are not purpose-built to host an AI data pipeline and can suffer from latency issues just as much as legacy storage can.”
“Although having a public cloud solution to manage all of this data initially seems like a good solution, it does come with a performance and reliability downgrade when compared to the best on-premises solutions as the footprint expands. For anything of this scale, a hybrid cloud solution is often a better fit.”
Look At Hybrid Cloud Solutions For Scalable Success, Argues Pure Storage
Alex argues that a hybrid cloud storage solution – one which leverages both cloud and on-premises storage platforms – will enable each element of a system to play to its strengths. He offers the example of an on-premises solution capable of ingesting aggregated public cloud data, which could deliver optimal performance and reduce the threat of a disruptive outage – something that, in fields like automated vehicles, could prove “devastating” in the future.
This solution, he argues, should ideally be supported by massively parallel silicon storage with the potential to provide the best performance, while also ensuring the regulatory requirements are still met. For an AI project to succeed it ultimately comes down to data sources, quality, and infrastructure. “A full-scale AI deployment must continuously collect, clean, transform, annotate, and store larger amounts of data. As new data is added, the infrastructure used needs to be scalable and flexible.”
This can be clearly illustrated by the datasets used to keep autonomous cars on the road, which enable them to analyse potential threats and then feed the information back to a central hub where other vehicles can learn from the experience. The problem, McMullan says, is that a car simply can’t store all this data – necessitating the hybrid approach. “While the autonomous vehicles of the near-future will make use of on-board flash storage, the cloud will play a huge part in the ways in which the vehicle accesses information regarding its surroundings and learns from other vehicles’ experiences,” he explains. “Using cloud computing to relay information such as maps, incident logging, and traffic will drive costs down and provide a more open and accessible bank of data.”
Pure Storage At The AI Summit NYC
At the AI Summit NYC, Pure Storage will be showing organisations how they can get the most out of their data for AI and machine learning projects. They’ll be sharing with attendees how technology like their FlashBlade solution is enabling some truly incredible AI and machine learning projects around the world, such as those at innovator Zenuity.