LLMs help banks capitalize on their most valuable asset: Unstructured data
Mike Upchurch | 2025-08-28 | 9 min read

Banks are incredibly data-rich, from the structured data housed in their transactional systems and customer, marketing, and operational databases to the massive amounts of unstructured data spanning customer emails, PDFs, loan applications, and more. Until recently, banks have struggled to derive value from their unstructured data because of the limitations of traditional natural language processing (NLP) models and because such data can’t easily be queried, analyzed, and incorporated into business processes.
Gleaning value from unstructured data has long been an unfulfilled opportunity, despite the fact that it accounts for 80% of all data within financial institutions. In many cases, the cost to extract value has exceeded the expected return on investment because models simply could not derive meaning from unstructured formats at scale, until now.
How LLMs create new opportunities for financial institutions
The advent and rapid adoption of large language models (LLMs) have opened up a whole new world of possibilities for how banks leverage and make sense of their unstructured data. With LLMs, banks can uncover valuable insights and eliminate many of the manual analysis methods that hold them back.
For example, in commercial lending, key insights are often buried within borrowers’ financial documents, tax forms, loan documents, and corporate filings. Historically, banks would have analysts review these documents page by page and manually enter the data into relational databases to garner insights. But doing this is costly, prone to error, and difficult to scale. As a result, only a fraction of the data in loan documents is recorded and available for modeling. Attempts to automate insight gathering via topic modeling and categorization also yielded results of limited value.
Now, LLMs can rapidly analyze borrower documents to automate risk assessment, alert banks to early signs of default risk, and identify complex relationships and exposure across counterparties. This allows banks to improve risk management and loan pricing, reduce underwriting time from weeks to hours, and improve consistency and compliance at the same time. It’s clear that LLMs can help banks unlock tremendous value from their unstructured data, but in this highly regulated industry, barriers to adoption persist. Let’s examine what’s standing in the way of banks realizing the full value of their unstructured data, and what’s possible when those barriers are removed.
Know the roadblocks to tapping the value of unstructured data
One of the most significant barriers to deriving value from unstructured data is its fragmented nature. Unstructured data resides in silos spread across disparate legacy systems, file shares, email servers, and various other locations. This makes collecting and organizing it far more difficult than collecting and organizing structured data. Unified ingestion and retrieval are also a challenge, both because of this fragmentation and because the time financial institutions have spent organizing unstructured data pales in comparison to the work they have done on structured data.
To illustrate the difficulties this causes, imagine a bank responding to a lawsuit where the opposing attorney has asked for all of the plaintiff's interactions across the institution — in any form — related to the issue. For the bank to comply, it would need to compile and thoroughly comb through all of the plaintiff’s chat logs, emails, call transcripts and recordings, account data, web forms, photos (e.g., ID verification documents), PDFs, etc. Doing this manually and with traditional machine learning (ML) models requires a huge amount of time and resources, whereas LLMs can vastly improve accuracy and reduce time and costs.
Understand new types of risks prior to LLM adoption
While the potential of LLMs is enormous, banks are ambitious but cautious when leveraging them because they bring an entirely new class of risks. Risk and compliance teams need a thorough understanding of these new types of risk before approving adoption. In addition to data security and privacy concerns, banks worry LLMs could misinterpret critical documents or, worse, generate inaccurate or fabricated information if they hallucinate. Model risk management (MRM) is also more challenging with unstructured data and rapid model evolution. With structured data and stable models, shifts are easier to monitor; the complexity and variability of unstructured inputs and LLMs make it harder to detect drift or degradation in performance.
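To make the drift-monitoring concern concrete, here is a minimal sketch of one widely used stability check, the population stability index (PSI), applied to LLM outputs. It assumes the LLM assigns each document a categorical label (for example, a risk tier), which is an assumption for illustration and not something the article specifies. The function names, the sample labels, and the smoothing constant `eps` are all hypothetical.

```python
import math
from collections import Counter


def label_shares(labels, categories, eps=1e-6):
    """Return each category's share of `labels`, floored at eps to avoid log(0)."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return {c: max(counts[c] / total, eps) for c in categories}


def population_stability_index(baseline, current):
    """PSI = sum over categories of (actual - expected) * ln(actual / expected)."""
    categories = sorted(set(baseline) | set(current))
    expected = label_shares(baseline, categories)
    actual = label_shares(current, categories)
    return sum(
        (actual[c] - expected[c]) * math.log(actual[c] / expected[c])
        for c in categories
    )


# Hypothetical label distributions: validation period vs. a recent period.
baseline_labels = ["low"] * 70 + ["medium"] * 20 + ["high"] * 10
current_labels = ["low"] * 50 + ["medium"] * 30 + ["high"] * 20

psi = population_stability_index(baseline_labels, current_labels)
print(f"PSI = {psi:.3f}")
```

A commonly cited rule of thumb treats a PSI below 0.1 as stable, 0.1 to 0.25 as a moderate shift, and above 0.25 as a significant shift. A check like this only flags that the output mix has moved; it does not say whether the inputs or the model changed, so it complements human review and does not replace it.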
Fragmented ownership of unstructured data across a bank’s lines of business also hinders LLM adoption. Document-driven processes, such as loan origination, KYC, and fraud detection, involve multiple departments, each with its own tools, workflows, and priorities. This fragmentation makes it difficult to coordinate LLM-driven initiatives, standardize data access, and align on success metrics. Without clear ownership or a way to effectively collaborate cross-functionally, banks’ efforts to deploy LLMs will stall.
How to safely use LLMs to unlock the value of unstructured data
With LLMs and a unified model development, AIOps, and MRM platform, banks can quickly and easily traverse the vast landscape of their unstructured data and conduct cross-source reasoning. Unlike traditional models that work at the paragraph or sentence level (and therefore lack context), LLMs let banks discover connections that exist across disparate documents. In addition, LLMs with multimodal capability can support the analysis of audio, images, and video. With these abilities, LLMs can perform tasks like comparing terms across multiple contracts or finding inconsistencies in borrower filings, quickly and at scale.
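As a rough sketch of what comparing terms across contracts can look like downstream, the example below assumes an LLM extraction step (not shown) has already turned each contract into a dictionary of fields. The contract names, field names, and values are hypothetical, as is the `find_term_conflicts` helper. It simply reports every field that takes more than one distinct value across the documents.

```python
from collections import defaultdict


def find_term_conflicts(extracted_terms):
    """Return fields whose values differ across contracts.

    `extracted_terms` maps a contract ID to a dict of field -> value.
    The result maps each conflicting field to {value: [contract IDs]}.
    """
    values_by_field = defaultdict(lambda: defaultdict(list))
    for contract_id, terms in extracted_terms.items():
        for field, value in terms.items():
            values_by_field[field][value].append(contract_id)
    # Keep only the fields where more than one distinct value was seen.
    return {
        field: dict(values)
        for field, values in values_by_field.items()
        if len(values) > 1
    }


# Hypothetical output of an LLM extraction step over three documents.
extracted_terms = {
    "loan_agreement_A": {"governing_law": "New York", "max_leverage_ratio": 3.5},
    "loan_agreement_B": {"governing_law": "New York", "max_leverage_ratio": 4.0},
    "guaranty_C": {"governing_law": "Delaware"},
}

conflicts = find_term_conflicts(extracted_terms)
for field, values in conflicts.items():
    print(field, values)
```

Keeping the contract IDs alongside each value lets a reviewer trace a flagged inconsistency back to its source documents, which matters when the extraction itself may be wrong.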
LLMs can also uncover hidden or unexpected patterns in unstructured data. For example, an LLM might reveal that a bank needs to reassess the accessibility of its app for its elderly customers after analyzing thousands of support calls. Unlike older models that were limited to identifying obvious or expected themes, LLMs can pick up on subtle, evolving issues based on context and language. Banks can also leverage LLMs to find potential compliance issues as they arise. Without LLMs, there is often a lag in determining that a set of customer problems is not just an operational issue but a compliance issue. LLMs can identify these issues early, reducing exposure and minimizing delays in remediation.
Eliminate time-consuming prep work and minimize overall costs
LLMs can also bring conversational interfaces to a bank’s legacy documents to make them easy to navigate. Banks can create chatbots or AI assistants that let employees ask questions about documents, such as policy manuals or underwriting guidelines, without the need to reformat or structure them first. This eliminates time-consuming prep work, such as creating search indexes or entering data into relational databases, saving time and money. For instance, a bank employee could ask, “What’s the policy for commercial loan extensions from 2018?” and instantly get an answer from a 200-page PDF sitting untouched in a shared drive, rather than trying to locate that information manually.
Deploy LLMs safely and responsibly
LLMs are quickly becoming indispensable for banks to tap the value of an incredible but underutilized asset: unstructured data. While hurdles like regulatory complexity, model risk, and data fragmentation remain, they can be overcome with unified model development, AIOps, and MRM solutions that enable innovation, collaboration, scale, and governance across the solution lifecycle. Deploying LLMs safely and responsibly is a top priority for every bank that wishes to stay competitive, unlock new levels of efficiency, and gain deeper insights into its business. Doing so in a well-managed, efficient, scalable, and risk-controlled way is challenging but achievable.
Mike Upchurch is the Vice President of Strategy for Financial Services at Domino Data Lab, bringing over 25 years of expertise in analytics, ML/AI, business strategy, and technology. Previously, Mike held roles at Capital One as a product manager in their innovation lab and as a strategy and operations consultant in their Center for Machine Learning. Mike led strategy at Notch and in the mortgage lending group of Bank of America and was the co-founder of Fuzzy Logix. Prior to that he developed deep hands-on technical experience at The Hunter Group and PwC.


