The enterprise platform to build, deliver, and govern AI
Watch the 15-minute on-demand demo to get an overview of the Domino Enterprise AI Platform.
How to build enterprise AI governance that enforces itself
Most data science teams have a governance program. Few have governance that actually runs at deployment time. An AI governance framework is the combination of policies, technical controls, and organizational accountability structures that govern how models are built, validated, deployed, and monitored. The distinction that matters is whether those controls are enforced in the workflow, where models actually get built and shipped, or merely documented in a SharePoint folder.
This guide is written for data science leaders and risk officers who need to build or mature governance programs covering both traditional ML models and generative AI systems. Every organization faces real compliance pressure: financial services firms operate under SR 11-7 validation requirements and the 2026 SR 26-2 for AI; life sciences organizations need audit-grade traceability for FDA submissions; and regulated enterprises across sectors now contend with the EU AI Act's risk-based compliance requirements. What follows is a framework you can actually implement, not a conceptual model.
Most governance programs start with a policy document and end there. The document defines roles, establishes review committees, and references NIST AI RMF categories. Then it sits in a SharePoint folder while models get built and deployed outside any formal process.
The core failure mode is treating governance as a compliance exercise rather than an operational capability. Data science leaders who have shipped models at scale will recognize all three of the patterns that follow:
The practical fix: start with the controls you can enforce technically, including audit trails, model registration, and environment reproducibility. Then build the policy layer on top of what the platform can actually surface.
An enterprise AI governance framework has five functional areas: model lifecycle controls, risk tiering, access controls and data lineage, audit trails, and governance review gates. Organizations that try to govern AI without all five end up with gaps that surface during audits or in production failures.
Model lifecycle controls define the stages a model passes through from development to retirement, and what approval is required at each transition. The stages can include development, validation and testing, staging, production, monitoring, and deprecation. Each transition requires a documented decision: by whom, based on what evidence, and with what sign-off.
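As a rough illustration of the lifecycle controls described above, the sketch below models stage transitions as an explicit table and refuses any transition that lacks an approver or evidence. The stage names come from the paragraph above; the set of allowed transitions, the `TransitionDecision` record, and the `record_transition` function are hypothetical and not taken from any particular platform.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Stage names follow the paragraph above; which transitions are allowed
# is an assumption made for this sketch.
ALLOWED_TRANSITIONS = {
    "development": {"validation"},
    "validation": {"development", "staging"},
    "staging": {"validation", "production"},
    "production": {"monitoring"},
    "monitoring": {"development", "deprecation"},
    "deprecation": set(),
}


@dataclass(frozen=True)
class TransitionDecision:
    model_id: str
    from_stage: str
    to_stage: str
    approver: str    # by whom
    evidence: str    # based on what evidence
    decided_at: str  # when the sign-off was recorded (UTC, ISO 8601)


def record_transition(model_id, from_stage, to_stage, approver, evidence):
    """Return a documented decision, or raise if the transition is not allowed."""
    if to_stage not in ALLOWED_TRANSITIONS.get(from_stage, set()):
        raise ValueError(f"{from_stage} -> {to_stage} is not an allowed transition")
    if not approver or not evidence:
        raise ValueError("a transition needs both an approver and evidence")
    return TransitionDecision(
        model_id, from_stage, to_stage, approver, evidence,
        datetime.now(timezone.utc).isoformat(),
    )
```

In this sketch a model cannot skip from development straight to production, and every accepted transition leaves behind a record of who approved it and on what basis.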
The governance mechanism is a model registry. Every model that reaches production should exist as a registered artifact in a system that tracks its version, training data, evaluation metrics, and approval chain. Without a registry, governance questions ("which version of this model is in production?") require manual archaeology through notebooks, email threads, and Slack history. When a validator or auditor asks that question during a review cycle, the absence of a registry means days of reconstruction effort and the real possibility that the answer is wrong.
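A minimal in-memory sketch of such a registry is shown below. It is meant only to show the shape of the record (version, training data, evaluation metrics, approval chain) and how the "which version is in production?" question becomes a lookup. The class and field names are hypothetical; a real registry would be a persistent, access-controlled service.

```python
from dataclasses import dataclass, field


@dataclass
class RegisteredModel:
    name: str
    version: int
    training_data_ref: str  # pointer to the versioned training data
    metrics: dict           # evaluation metrics recorded at registration
    approvals: list = field(default_factory=list)  # approval chain, in order
    stage: str = "development"


class ModelRegistry:
    def __init__(self):
        self._entries = {}

    def register(self, model):
        """Add a model version; a given (name, version) can be registered once."""
        key = (model.name, model.version)
        if key in self._entries:
            raise ValueError(f"{model.name} v{model.version} is already registered")
        self._entries[key] = model

    def in_production(self, name):
        """Answer 'which version of this model is in production?'."""
        return sorted(
            m.version for m in self._entries.values()
            if m.name == name and m.stage == "production"
        )
```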
Risk tiering assigns models to categories that determine the rigor of their validation and monitoring requirements. A credit scoring model at a bank and a customer churn model for internal analytics have different exposure profiles and should not have the same review requirements.
Regulatory frameworks like the EU AI Act offer a useful reference point: they classify systems into risk tiers based on potential harm, with proportional requirements at each tier. In practice, regulatory categories do not map cleanly onto internal model portfolios, and most organizations build their own tiering rubric. The most workable approach uses a two-axis model: business impact (revenue, regulatory exposure, reputational risk) against model criticality (how bad if it fails or drifts).
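One way to make a two-axis rubric concrete is a small lookup matrix. The sketch below is illustrative only: the three levels per axis, the tier numbers (1 is the most rigorous), and the specific cell assignments are assumptions, and each organization would set its own.

```python
LEVELS = ("low", "medium", "high")

# (business_impact, model_criticality) -> tier. Tier 1 carries the most
# rigorous validation and monitoring. The cell values are assumptions.
TIER_MATRIX = {
    ("low", "low"): 3,    ("low", "medium"): 3,    ("low", "high"): 2,
    ("medium", "low"): 3, ("medium", "medium"): 2, ("medium", "high"): 1,
    ("high", "low"): 2,   ("high", "medium"): 1,   ("high", "high"): 1,
}


def risk_tier(business_impact, model_criticality):
    """Map the two axis ratings to a tier number using the matrix above."""
    for level in (business_impact, model_criticality):
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level!r}")
    return TIER_MATRIX[(business_impact, model_criticality)]
```

Under these assumed cell values, a credit scoring model rated high on both axes lands in tier 1, while an internal churn model rated low on both lands in tier 3.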
GenAI systems introduce a complication. The same foundation model can operate at different risk tiers depending on the use case. A summarization tool used internally is low-risk. The same model architecture deployed to generate patient-facing clinical summaries is high-risk. Risk tiering for GenAI needs to account for the deployment context and the population affected, not just the model architecture. In a model registry, this means that a single foundation model may appear at multiple risk tiers, each tied to a specific deployment and use-case definition. The model version and the deployment context together determine the applicable governance requirements.
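The idea that one foundation model can sit at several risk tiers can be sketched by keying the tier on the deployment rather than on the model. All names and values below (`Deployment`, `foundation-llm`, the use-case labels, the tiers) are hypothetical examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    model: str
    model_version: str
    use_case: str
    affected_population: str
    risk_tier: str


# Hypothetical registry entries: same model and version, different contexts.
DEPLOYMENTS = [
    Deployment("foundation-llm", "v1", "internal-summarization", "employees", "low"),
    Deployment("foundation-llm", "v1", "patient-clinical-summaries", "patients", "high"),
]


def tiers_for(model, model_version, deployments=DEPLOYMENTS):
    """Return the risk tier of every deployment of one model version."""
    return {
        d.use_case: d.risk_tier
        for d in deployments
        if d.model == model and d.model_version == model_version
    }
```

Here the governance requirements are looked up per use case, so the internal summarization tool and the patient-facing deployment are reviewed differently even though they share a model version.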
Access controls determine who can see, modify, or run which models and datasets. This matters for both security and compliance: GDPR and CCPA restrict which datasets can be used for which purposes, and SR 11-7 requires that model validators be organizationally independent from model developers.
Data lineage tracks which datasets were used to train and validate each model version. Without it, the audit questions that matter most cannot be answered: was validation data independent from training data? Did the training set include individuals who have submitted deletion requests? Domino captures data lineage automatically at execution time, linked to the model artifact rather than maintained as a separate process.
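The two audit questions above reduce to set operations once record identifiers are captured per model version. The sketch below assumes lineage is available as plain collections of record IDs; the function name and the shape of the result are hypothetical and do not describe how any specific platform stores lineage.

```python
def lineage_findings(train_ids, validation_ids, deleted_subject_ids):
    """Check validation independence and deletion-request exposure."""
    train = set(train_ids)
    validation = set(validation_ids)
    deleted = set(deleted_subject_ids)
    return {
        # Records present in both sets mean validation was not independent.
        "train_validation_overlap": sorted(train & validation),
        # Individuals with deletion requests who are still in the training set.
        "deleted_subjects_in_training": sorted(train & deleted),
    }
```

An empty list for both keys is the answer an auditor is looking for; anything else identifies exactly which records need attention.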
An audit trail is the recorded history of a model's development and deployment lifecycle. Reproducibility means that any result, whether a training run, evaluation, or deployment configuration, can be recreated exactly from the recorded inputs. These two capabilities are inseparable for regulated enterprise AI.
Reproducibility is the foundation of model validation. If a validator cannot independently recreate a model's training run from the documented inputs, the validation is incomplete. For FDA submissions in life sciences, reproducibility is a requirement; for SR 11-7 compliance in financial services, the same standard applies to model documentation. In practice, reproducibility fails most often because environment configurations were not pinned at runtime, library versions have since changed, or the training data snapshot was not formally versioned. These are engineering problems, and they require engineering solutions.
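One engineering approach to these failure modes is to compute a single fingerprint over the pinned inputs of a run, so a validator can tell at a glance whether a rerun used the same code, data, and environment. The sketch below is an assumption about how that could be done with the Python standard library; the function and parameter names are hypothetical.

```python
import hashlib
import json


def run_fingerprint(code_commit, data_snapshot_sha256, packages):
    """Hash the inputs that must match for a run to be reproducible.

    packages maps package name -> exact pinned version.
    """
    record = {
        "code_commit": code_commit,
        "data_snapshot_sha256": data_snapshot_sha256,
        # Sort so the fingerprint does not depend on dictionary order.
        "packages": sorted(f"{name}=={version}" for name, version in packages.items()),
    }
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

If a library version drifts or the data snapshot changes, the fingerprint changes, which makes an unpinned environment visible instead of silent.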
Domino's approach is that every experiment automatically captures code version, data version, environment configuration, and results. The audit trail is a byproduct of how work gets done, not a separate process layered on top.
Ownership of AI governance in a large organization is not always clear. Data science teams own model quality. Legal and compliance teams own regulatory exposure. IT owns infrastructure and security. None of these roles translate cleanly into "owns governance."
One structure that works in practice includes a governance function that is organizationally separate from the data science team but has direct authority over deployment decisions for high-risk models. This function defines the standards, reviews the evidence, and has authority to delay or block a model launch. This model is standard in financial services (model risk management groups operate this way) and is being adopted in life sciences as AI initiatives in drug development scale.
Cross-functional governance requires explicit RACI ownership at each lifecycle stage. Who is responsible for initial risk classification? Who approves the model for staging? Who has final authority to approve production deployment? These cannot be answered generically. They need to be defined at the program level and enforced by the platform.
Governance gates are required approval checkpoints at model lifecycle transitions. They operationalize the RACI by making certain stage transitions impossible without explicit sign-off. This is the mechanism that turns policy into enforcement.
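A gate of this kind can be sketched as a promotion function that raises unless every required role has signed off. The role names, the two gated transitions, and the `promote` function below are hypothetical; they stand in for whatever the program's RACI defines.

```python
# Which roles must sign off before each transition. Assumed for illustration.
REQUIRED_SIGN_OFFS = {
    ("validation", "staging"): {"independent_validator"},
    ("staging", "production"): {"independent_validator", "risk_officer"},
}


class GateBlocked(Exception):
    """Raised when a stage transition lacks the required sign-offs."""


def promote(model, to_stage, sign_offs):
    """Return the model at its new stage, or raise GateBlocked.

    model: dict with 'name' and 'stage'.
    sign_offs: dict mapping role -> name of the person who approved.
    """
    gate = (model["stage"], to_stage)
    if gate not in REQUIRED_SIGN_OFFS:
        raise GateBlocked(f"no gate defined for {gate[0]} -> {gate[1]}")
    signed = {role for role, approver in sign_offs.items() if approver}
    missing = REQUIRED_SIGN_OFFS[gate] - signed
    if missing:
        raise GateBlocked(f"missing sign-off from: {sorted(missing)}")
    return {**model, "stage": to_stage}
```

The design choice is that the transition itself is the enforcement point: there is no code path that moves a model to production without the sign-offs being present.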
There are four gates every enterprise AI governance program should implement:
Platforms like Domino implement these gates as workflow controls, not suggestions. To be promoted, a model must complete a registration checklist. In Domino, governance is built into the operational process.
A parallel problem is shadow AI: models and GenAI tools running inside the enterprise outside any formal governance process. The governance response is platform centralization. When the governed platform is easier to use than workarounds, most practitioners will use it. Domino's centralized infrastructure model makes shadow AI significantly harder to sustain at scale, because compute usage and model deployments can be audited regularly through the platform's registry.
Generative AI introduces risk categories not included in traditional ML governance models. In traditional ML, model outputs are deterministic (or near-deterministic) and bounded by the training objective. In generative AI, outputs are probabilistic, open-ended, and shaped by prompts that change at runtime. This means governing at the use-case level, not just the model level. The same foundation model can present very different risk profiles depending on who uses it and for what.
The risk categories specific to GenAI systems are:
Prompt injection is particularly consequential in agentic deployments, where the model executes actions rather than returning text. Adversarial inputs can redirect model behavior away from its intended use in ways that are difficult to detect after the fact.
Hallucination risk is highly use-case dependent. Factually incorrect outputs presented with apparent confidence are a manageable nuisance in an internal summarization tool and a serious liability in clinical, legal, or financial decisioning contexts.
Output unpredictability makes deterministic testing difficult. The same model can produce meaningfully different outputs for semantically similar prompts, which means coverage-based validation approaches from traditional ML do not transfer cleanly.
Data leakage via prompts is a controls problem. Sensitive data entered in prompts may be retained in API logs, model context windows, or fine-tuning pipelines without explicit controls in place.
These risks add to traditional ML governance requirements. A GenAI system still requires drift monitoring, access controls, and version tracking.
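For the prompt data-leakage risk described above, one common control is to redact obvious sensitive patterns before a prompt is written to any log. The sketch below is deliberately minimal and is an assumption, not a complete solution: it covers only email addresses and US Social Security number formats, and the placeholder tokens are hypothetical.

```python
import re

# Patterns are intentionally simple; real deployments need broader coverage.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(prompt):
    """Replace email addresses and SSN-shaped numbers with placeholders."""
    prompt = EMAIL.sub("[EMAIL]", prompt)
    prompt = SSN.sub("[SSN]", prompt)
    return prompt


def log_prompt(prompt, sink):
    """Append only the redacted form of the prompt to the log sink."""
    sink.append(redact(prompt))
```

Pattern-based redaction will miss free-text sensitive data, so it complements rather than replaces retention and access controls on prompt logs.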
| | Traditional ML | Generative AI | Agentic AI |
|---|---|---|---|
| Output type | Bounded (class, score, value) | Open-ended, prompt-shaped | Actions + outputs, multi-step |
| Primary failure modes | Model drift, data leakage, fairness violations | Hallucination, prompt injection, output variability | Action scope violations, compounding errors |
| Key governance controls | Model registry, validation, drift monitoring | Prompt logging, output monitoring, use-case risk tiering | Action scope definition, execution trace, rollback capability |
| Regulatory frameworks | SR 11-7, 21 CFR Part 11, EU AI Act | EU AI Act (high-risk classification), SR 26-2 | Emerging; no settled standard yet |
| Governance layer | Foundation | Extends ML governance | Extends GenAI governance |
Agentic AI systems are AI models that take sequences of actions, call external tools, and make decisions with limited human intervention. Governing them requires more than standard model lifecycle controls, because the action space is larger and consequences can compound across steps.
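Because agents act rather than just return text, two controls are often discussed together: an explicit action scope and an execution trace. The sketch below shows one possible shape for both, assuming each tool is a plain Python callable. The class name, action names, and trace fields are hypothetical.

```python
class ScopedAgentRunner:
    """Run agent actions against an allowlist and record every attempt."""

    def __init__(self, allowed_actions):
        self.allowed = frozenset(allowed_actions)
        self.trace = []  # one entry per attempted action, in order

    def execute(self, action, handler, **kwargs):
        permitted = action in self.allowed
        # Record the attempt before running it, so blocked calls are visible too.
        self.trace.append({
            "step": len(self.trace) + 1,
            "action": action,
            "args": kwargs,
            "permitted": permitted,
        })
        if not permitted:
            raise PermissionError(f"action {action!r} is outside the defined scope")
        return handler(**kwargs)
```

Logging the attempt before the permission check means the trace also captures out-of-scope requests, which is the evidence needed to investigate a suspected prompt injection after the fact.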
Four governance requirements apply specifically to agentic deployments:
Life sciences organizations face the most technically demanding AI governance requirements of any industry. FDA's 21 CFR Part 11 requires validated software with full audit trails for systems used in regulated processes. GxP compliance requires that the entire data lifecycle, from collection through analysis to reporting, be documented, reproducible, and auditable.
For AI models used in drug development, the governance requirements include documented and reproducible training and validation workflows, change control for any model update (a version change requires revalidation, not just redeployment), data provenance controls, and validation documentation sufficient for regulatory submission.
Domino's audit trail and reproducibility capabilities make validation documentation a reporting exercise. Every training run and evaluation is automatically logged and linked to the model artifact, so the evidence required for regulatory submission exists as a byproduct of how work gets done. Domino's life sciences platform is built around these requirements.
Financial services organizations have operated under formal model risk management requirements since the Federal Reserve's SR 11-7 guidance in 2011. SR 11-7 establishes three pillars: model development and implementation, independent model validation, and ongoing model governance. ML models present specific challenges here: conceptual soundness review requires explaining model logic to a validator, ongoing monitoring must include automated drift detection, and high-risk models in credit decisioning require regular audits against documented baselines. For a deeper foundation, Domino's model risk management solution covers the platform capabilities and regulatory requirements in detail.
The 2026 SR 26-2 guidance extends SR 11-7 to cover AI systems more explicitly, with heightened requirements for model explainability and human oversight in automated decisioning. "What changes with SR 26-2" covers the specific implications. Moody's, using Domino's platform, achieved a 4x increase in model validation frequency, a direct outcome of reproducible, well-documented workflows that eliminated the reconstruction effort that had previously consumed validation team bandwidth. When the documentation artifacts are generated at execution time, the validation team spends its time on substantive review. Domino's banking and financial services platform is built for these compliance requirements.
Public sector and defense organizations face AI governance requirements shaped by federal policy. The NIST AI RMF is the primary voluntary framework for federal agencies, and Executive Order 14110 on AI safety established additional requirements for agencies developing or procuring high-impact AI systems. For organizations like the U.S. Department of the Treasury, which uses AI models in financial surveillance, sanctions screening, and economic analysis, governance requirements include strict auditability, model explainability for regulatory decisions, and data security controls that extend to classified and sensitive financial data.
The deployment constraint that defines public sector AI governance is environment: models often run in on-premises, GovCloud, or air-gapped environments. Governance infrastructure has to work across hybrid deployments without creating separate audit processes for each environment. Domino is deployed in DoD IL5 environments and supports hybrid and on-premises deployments, giving agencies a single governed platform regardless of where compute runs. Domino's public sector platform covers the specific deployment and compliance requirements for government agencies.
An AI governance framework built entirely on policy documents and manual review processes has a structural flaw: it depends on practitioners choosing to follow the process every time, for every model. At scale, with dozens of data scientists and hundreds of models in production, that assumption fails. Model delivery is an engineering problem. Accountability and control are a governance problem. They require the same underlying infrastructure.
The common objection is that disciplined teams can implement governance without a dedicated platform. This is true for a small portfolio of high-visibility models with stable ownership. It becomes operationally unsustainable when the model count reaches the dozens, ownership turns over, and the pressure to ship accelerates. The question is not whether governance can be done by hand, but whether it stays consistent when it is.
Workflow-native governance changes the equation. The platform enforces documentation requirements before a model can be registered, so governance happens at the point of work. Audit trails are generated at execution time. Drift detection runs as a platform service, which means monitoring happens consistently across all deployed models rather than only for the ones someone remembered to instrument.
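The text above does not say which drift metric a platform service would use. As one widely used option, the sketch below computes the population stability index (PSI) between a baseline sample and a recent sample of a single numeric feature. The equal-width binning, the bin count, and the small floor used to avoid division by zero are assumptions of this sketch.

```python
import math


def population_stability_index(expected, actual, bins=10):
    """PSI between a baseline sample and a recent sample of one feature."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("baseline sample has no spread")
    width = (hi - lo) / bins

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range fall into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        floor = 1e-6  # avoids log(0) and division by zero for empty bins
        return [max(c / len(values), floor) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Running the same function on a schedule for every deployed model, with an alert threshold chosen by the governance function, is what turns drift detection from something a team remembers to do into something that happens consistently.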
Organizations with standardized governance workflows get models to production faster, because the review process is efficient and documentation artifacts are generated automatically. Moody's 4x model validation frequency is the outcome of a governed platform. Broader AI adoption becomes sustainable when governance scales with it.
For enterprises expanding into generative AI and agentic systems, the infrastructure question becomes more urgent. AI-powered applications built on GenAI require prompt logging, output monitoring, and action auditing that are impractical to implement manually across dozens of deployments.
That infrastructure has to be embedded at the platform layer before agentic use cases can scale. Domino's AI governance capabilities, including model registry, automated audit trails, environment reproducibility, and integrated monitoring, are infrastructure-enforced controls. They work because they are built into how work gets done. Understanding what governed infrastructure makes possible for AI-powered applications is the right starting point for any organization evaluating its MLOps and governance readiness.
An AI governance framework for enterprise ML and GenAI is the combination of policies, technical controls, and organizational accountability structures that govern how AI models are built, validated, deployed, and monitored. For ML systems, this covers the full model lifecycle from development through retirement. For GenAI, it additionally covers prompt management, output monitoring, and the specific risks of probabilistic, open-ended model outputs, including hallucination, prompt injection, and data leakage. An effective framework is enforced at the platform level across all models and teams, regardless of who built them or how they were deployed.
ML governance and GenAI governance share the same foundational requirements: model registration, audit trails, lifecycle controls, access management, and drift monitoring. They differ in the risk categories they address. ML models produce outputs from a defined class or value space; their failure modes, including model drift, data leakage, and fairness violations, are well understood and largely covered by established frameworks like SR 11-7 and 21 CFR Part 11. GenAI systems produce open-ended outputs shaped by runtime prompts, introducing risks including prompt injection, hallucination, output variability, and data leakage through prompts. GenAI governance adds a layer on top of ML governance. The same traceability and lifecycle controls apply; the risk assessment criteria and monitoring requirements must account for the probabilistic, context-dependent nature of generative outputs.
The NIST AI Risk Management Framework provides a voluntary structure for identifying, assessing, and managing AI risks across four functions: Govern, Map, Measure, and Manage. It is not a compliance requirement in most contexts, unlike SR 11-7 for financial services or 21 CFR Part 11 for life sciences, but it provides a structured vocabulary for building an AI risk management framework and maps well onto the EU AI Act's risk-based classification approach. For enterprises operating across multiple regulatory regimes, it is a useful design reference, particularly for its emphasis on trustworthy AI characteristics: accuracy, reliability, explainability, fairness, privacy, security, and accountability.

Danny W. Stout, Ph.D., is a seasoned data science and analytics leader with over two decades of experience driving enterprise AI and machine learning initiatives. He has held senior analytics and AI leadership roles across global organizations including Ernst & Young, Takeda, TIBCO, Quest, and Dell, spanning forecasting, pricing, analytics strategy, and data science consulting. His work emphasizes effectiveness over scale, focusing on governance, team alignment, and measurable outcomes as the determinants of successful AI adoption. Based in Charlton, MA, Danny holds a Ph.D. and combines technical leadership with practical insights that help organizations scale data science responsibly and effectively.