Financial Services AI in the Hybrid/Multi-Cloud: Harness Data Gravity with Hybrid MLOps
By David Schulman | 2022-09-28 | 12 min read
How do you balance the need for innovation and efficiency in FSI with the need for compliance and good governance? That's a critical question for the financial services and insurance industries to answer, because AI, ML, and the hybrid/multi-cloud are among the most important technology trends of our time.
They are already revolutionizing the way we live and work, and their potential is only just starting to be realized in FSI. According to a recent study by Bloomberg, the global market for artificial intelligence (AI) in financial services and insurance is anticipated to generate revenue of $84.3 billion, growing at a CAGR of 21.4% from 2021 to 2028. Currently, financial services firms use AI for competitive advantage by augmenting human decision-making, making it faster, more accurate, and more efficient.
However, firms hoping to capitalize on these two trends often find they need to balance their competitive drive to innovate with the need for governance and compliance. Too much innovation can lead to chaos, while too much governance can stifle innovation. The key is to find the right balance, with enough governance to ensure stability and accountability, but also enough innovation to drive model-driven competitive advantage.
A recent Ventana Research white paper, Top 5 AI Considerations for Chief Data & Analytics Executives, notes a similar challenge. While the benefits of AI/ML are broadly appreciated with “97% of financial services organizations report[ing] AI/ML is important or very important for detecting and preventing fraud,” eight in ten organizations across industries also noted the importance of governing AI/ML, highlighting the difficult equilibrium between innovation and governance facing data and analytics executives.
This isn’t necessarily black and white, as headlines often report that market leaders in their respective categories fail to find this balance. For example, in 2017 a data breach at Equifax led to a $700 million settlement, and more recently a “glitch” produced incorrect credit score reporting for consumers - something that data science industry experts speculate may have been caused by AI model ‘data drift,’ as reported by VentureBeat. Domino’s Thomas Robinson said there's a great need for good monitoring in critical fields:
“Playtime is over for data science. More specifically, for organizations that create products with models that are making decisions impacting people’s financial lives, health outcomes and privacy, it is now irresponsible for those models not to be paired with appropriate monitoring and controls.”
Data and analytics executives at financial services firms are in the challenging position of extracting value from data by proving out data science initiatives - all while balancing governance to make sure their firms don’t end up in the headlines for the wrong reasons.
The Rise of the Analytics Center of Excellence to Govern Data Science
To better govern data science in a hybrid cloud environment, many financial services and insurance companies are turning to a “Data Science Center of Excellence (CoE)” strategy to drive innovative best practices and governance of data science initiatives across the enterprise.
Effective governance requires a single-pane-of-glass view of all AI/ML activity, and the financial services firms creating a model-driven competitive advantage are implementing various forms of an “Analytics/AI/ML Center of Excellence (CoE)” to drive AI/ML adoption. DBRS deploys Domino as a central platform for all data project groups, including Finance, Research, and Model Governance. Similarly, S&P Global uses Domino to act as the “glue” underpinning programs to educate 17,000 distributed employees on data science.
CoE strategies are also common in insurance. Allstate’s Analytics Center of Excellence helps the company put models at the heart of its business, leveraging data as a transformational enterprise asset. New York Life’s Center for Data Science and Artificial Intelligence is also fully immersed in the core of the company’s business, enabling real-time, model-based decision making across the company. Topdanmark's Machine Learning Center of Excellence infuses models throughout operations for fraud detection, policy approvals, and risk management - all to improve the customer experience. SCOR’s Data Science Center of Excellence has significantly furthered the adoption of data science and helped business units develop models to address customer needs 75% faster than they previously could.
Managing the Proliferation of Distributed, Sensitive Data
Data is the driver of effective machine learning initiatives, and sensitive, distributed data across environments (multi-cloud, hybrid cloud and on-premises) introduces unique challenges for financial services data and analytics executives centralizing and governing AI/ML in CoE strategies.
A recent white paper from Ventana Research, Top 5 AI Considerations for Chief Data and Analytics Executives, notes that 73% of organizations report that disparate data sources and systems represent the greatest challenge when implementing data science governance policies, a challenge exacerbated by data scientists requiring large volumes of data from beyond the company firewall.
Financial services firms require data from branches, online, and mobile devices. Acquisitions and mergers also expand data footprints, with data often distributed across cloud regions, providers, and on-premises data sources. 32% of organizations report using more than 20 data sources, while 58% self-report as using “big data” - with petabyte-size databases becoming more common, according to the same Ventana Research white paper.
Additionally, a recent Wall Street Journal article noted that healthcare insurance firm Anthem generates 1.5 to 2 petabytes of synthetic data aimed at training models to detect fraud and delivering personalized service to members, highlighting the “bigness” of this distributed data. Ventana Research notes that “by 2025, more than three-quarters of enterprises will have data spread across multiple cloud providers and on-premises data centers, requiring investment in data management products that span multiple locations.”
This reflects similar patterns Domino has observed in our own clients, combining data sources with advanced machine learning techniques for a model-driven competitive advantage. At Coatue, the proliferation of alternative data sources and new computational techniques has created demand to identify signals that can inform investment decisions. Moody’s Analytics has established a competitive advantage by creating analytics based on unique financial data sets and applying them to solve clients’ business challenges.
Harnessing Data Gravity with a Hybrid/Multi-Cloud Strategy
When it comes to big data, especially big “distributed” data, data gravity is one of the most important concepts to understand. Data gravity is the tendency of big data to pull other data, systems, and services/applications towards it. Because AI and data science require a lot of data, it is important to have a strategy in place for collecting, storing, processing, and analyzing this data.
Performance is just one important data gravity consideration - model latency can be reduced by co-locating data and compute. While managed cloud databases offer some promise in managing this proliferation of data, ingress and egress costs can dramatically hinder data science efforts - extracting and analyzing a petabyte of data could cost as much as $50,000, according to Ventana Research.
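To make the scale of that figure concrete, here is a minimal back-of-the-envelope sketch. The per-gigabyte rate is an assumption chosen for illustration, not a quoted price from any provider or from Ventana Research, and the function name is hypothetical. At an assumed $0.05 per GB, moving one petabyte works out to roughly $50,000, which is the order of magnitude cited above.

```python
# Illustrative only: the per-GB rate below is an assumed figure,
# not a quoted price from any cloud provider.
ASSUMED_EGRESS_USD_PER_GB = 0.05
GB_PER_PB = 1_000_000  # decimal units: 1 PB = 10^6 GB


def egress_cost_usd(petabytes: float,
                    usd_per_gb: float = ASSUMED_EGRESS_USD_PER_GB) -> float:
    """Estimate the cost of moving `petabytes` of data out of a cloud."""
    return petabytes * GB_PER_PB * usd_per_gb


if __name__ == "__main__":
    for size_pb in (0.1, 1, 2):
        print(f"{size_pb} PB: ${egress_cost_usd(size_pb):,.0f}")
```

Actual pricing varies by provider, region, and volume tier, so the rate should be replaced with contract figures before using an estimate like this for planning.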
Beyond performance considerations, data residency and data sovereignty regulations further complicate data collection and processing, often locking data sets in a single geography, cloud, or cloud region. The Digital Operational Resilience Act (DORA) in the EU “aims to ensure that all participants in the financial system have the necessary safeguards in place to mitigate cyber-attacks and other risks,” with specific requirements around vendor concentration and vendor lock-in. More specifically for AI/ML, GDPR has provisions around automated decision-making and profiling, highlighting the importance of hybrid and multi-cloud configurations to ensure appropriate data management and geofencing. According to Ventana Research:
“By 2026, nearly all multinational organizations will invest in local data processing infrastructure and services to mitigate against the risks associated with data transfer.”
Taking into account data transfer costs, security, regulations, and performance, it often makes more sense to design AI/ML platforms for flexibility, offering choice to co-locate the compute behind data processing with data itself as opposed to transferring or replicating data sets across clouds or geographies. Put in the context of Analytics CoE strategies requiring a holistic, enterprise-wide view of data science initiatives for governance, data and analytics executives face difficult decisions in building out their data and analytics infrastructure.
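The trade-off described here can be sketched as a simple placement rule. This is a hypothetical illustration, not how any particular platform schedules work: the `Dataset` fields, the location labels, and the cost threshold are all assumptions. The rule runs compute next to the data whenever residency rules forbid moving it or the estimated transfer cost exceeds a budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    location: str          # where the data lives (hypothetical label)
    size_gb: float
    must_stay_local: bool  # True if residency rules forbid moving the data


def choose_compute_location(dataset: Dataset, preferred: str,
                            usd_per_gb: float, budget_usd: float) -> str:
    """Return where to run a job: at the preferred site or next to the data."""
    if dataset.location == preferred:
        return preferred
    # Residency or sovereignty rules: compute must go to the data.
    if dataset.must_stay_local:
        return dataset.location
    # Otherwise move the data only if the transfer fits the budget.
    transfer_cost = dataset.size_gb * usd_per_gb
    return preferred if transfer_cost <= budget_usd else dataset.location


if __name__ == "__main__":
    claims = Dataset("claims", "eu-onprem", 500_000, must_stay_local=True)
    print(choose_compute_location(claims, "us-cloud", 0.05, 1_000))
```

A real decision would also weigh latency, security controls, and available accelerators, but even this small rule shows why a platform needs the option to send compute to the data.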
Data Science Platform Considerations for Hybrid/Multi-Cloud MLOps
Financial services and insurance companies that are looking to adopt AI platforms should focus on platforms that support their hybrid enterprise IT strategy. This strategy involves balancing openness, agility, and flexible compute power in order to future-proof the business. Platforms that offer these features help businesses make decisions quickly, respond to changes in the market, and adapt to new technologies or legislation.
Domino Data Lab is at the frontier of hybrid support for AI workloads with our recent Nexus Hybrid Cloud Data Science Platform announcement. A true hybrid data science platform enables data scientists to access data, compute resources, and code in every environment where the company operates, in a secure and well-governed fashion. Our deep collaboration with NVIDIA and support for the broader data and analytics ecosystem give data and analytics executives the confidence they need to foster AI/ML innovation, while still providing the flexibility required for enterprise-wide governance.
Ventana Research emphasizes the importance of an open and flexible data science platform: “future-proofing your data science practice in the face of evolving hybrid strategies, ever-changing data science innovations; and maximizing value from purpose-built AI infrastructure, on-premises or in the cloud.”
To learn more about the top considerations for scaling AI in the enterprise, check out the new Ventana Research white paper commissioned by Domino and NVIDIA.
David Schulman is a data and analytics ecosystem enthusiast in Seattle, WA. As Head of Partner Marketing at Domino Data Lab, he works closely with other industry-leading ecosystem partners on joint solution development and go-to-market efforts. Prior to Domino, David led Technology Partner marketing at Tableau, and spent years as a consultant defining partner program strategy and execution for clients around the world.