The enterprise AI event for data science & IT leaders
Join us at Rev, where innovators from leading organizations share how they're driving results across industries.
Operationalizing NVIDIA Nemotron's safety suite at enterprise scale
Agentic AI represents a fundamental shift in how organizations deploy artificial intelligence. Unlike traditional AI that responds to prompts, agentic systems pursue goals autonomously: they plan multi-step workflows, call tools to access data and systems, make decisions, and adapt based on outcomes. This autonomy creates tremendous business value but also introduces new risks around safety, reliability, and accountability.
The challenge facing enterprise AI teams is clear: how do you move agentic AI from promising pilots to production systems that meet enterprise governance requirements? NVIDIA Nemotron provides purpose-built models, safety controls, and evaluation tools. Domino provides the enterprise platform that operationalizes these components at scale with enforceable governance across the full AI lifecycle.
Together, Domino and NVIDIA enable organizations to deploy agentic AI systems that deliver measurable business results while maintaining the governance, security, and compliance controls that regulated enterprises require.
Organizations experimenting with agentic AI face a common set of obstacles:
Safety and reliability. Autonomous agents that call APIs, query databases, and make decisions need guardrails that prevent harmful outputs, stay within defined boundaries, and maintain consistent behavior across edge cases.
Governance at scale. Traditional governance approaches built for predictive models don't address agentic systems. Enterprises need policy frameworks that cover tool authorization, multi-agent coordination, human oversight triggers, and continuous monitoring.
Reproducibility and auditability. Regulated industries require complete lineage from data sources through model decisions to business outcomes. Agentic systems with multi-step reasoning and tool calls create complex audit trails that must remain accessible and interpretable.
On-premises deployment with data sovereignty. Many organizations cannot send sensitive data to external APIs. They need the full agentic AI stack running on infrastructure they control with consistent performance and no per-query costs.
Solving these challenges requires tight integration between AI safety components and enterprise MLOps infrastructure. NVIDIA Nemotron provides the safety building blocks. Domino provides the platform that makes them production-ready.
Domino recently demonstrated this integration in action at NVIDIA GTC, showcasing a multi-agent IT incident triage system that automates classification, diagnosis, and resolution workflows. Running entirely on NVIDIA Nemotron Nano 8B served through Domino's platform, the system illustrated three key capabilities that enterprise agentic AI requires. First, multi-agent coordination where specialized agents handle routing, diagnostic analysis, and resolution planning in sequence. Second, defense-in-depth safety using NVIDIA NeMo Guardrails for pattern-based filtering, Nemotron Safety Guard for detecting unsafe content and jailbreak attempts, and systematic validation via NeMo Evaluator and garak (Safety Auditor). Third, Domino's built-in governance where automated checks, deployment gates, and approval workflows enforce safety requirements at every stage from development to production.
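The sequential hand-off between routing, diagnostic, and resolution agents can be illustrated with a minimal sketch. Everything here is hypothetical: the `Incident` fields, the agent function names, and the keyword rules are placeholders standing in for calls to a served model, not the implementation shown at GTC.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Incident:
    """Shared state handed from one agent to the next (hypothetical schema)."""
    description: str
    category: str = ""
    diagnosis: str = ""
    resolution: str = ""
    trace: List[str] = field(default_factory=list)  # which agents ran, in order


def routing_agent(incident: Incident) -> Incident:
    # Stand-in for an LLM classification call: a keyword rule picks the category.
    text = incident.description.lower()
    incident.category = "network" if ("vpn" in text or "dns" in text) else "general"
    incident.trace.append("routing")
    return incident


def diagnostic_agent(incident: Incident) -> Incident:
    # Stand-in for diagnostic analysis; a real agent might call log or metrics tools.
    incident.diagnosis = f"suspected {incident.category} fault"
    incident.trace.append("diagnosis")
    return incident


def resolution_agent(incident: Incident) -> Incident:
    # Stand-in for resolution planning based on the diagnosis.
    incident.resolution = f"open {incident.category} runbook for: {incident.diagnosis}"
    incident.trace.append("resolution")
    return incident


# Agents run strictly in this order; each sees what the previous one wrote.
PIPELINE: List[Callable[[Incident], Incident]] = [
    routing_agent,
    diagnostic_agent,
    resolution_agent,
]


def triage(description: str) -> Incident:
    incident = Incident(description=description)
    for agent in PIPELINE:
        incident = agent(incident)
    return incident
```

Keeping the shared state in one object and recording each step in `trace` is a design choice that makes the run easy to audit afterwards.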

Attendees were particularly struck by Domino's unified audit trail, which captures everything from terminal logs and file views to agent tracing and production usage, providing complete lineage and the ability to prove exactly how a model behaved at any point in time. This reproducibility resonated strongly with enterprise teams in highly regulated industries such as financial services, public sector, and life sciences, which need defense-in-depth security at scale. Many appreciated Domino's technology-agnostic approach: teams bring their own infrastructure, models, code, and tools into a controlled, compliant environment rather than accepting vendor lock-in. Domino's focus on the hardest challenge in enterprise AI also generated excitement: getting models actually used by the business through governance, system-of-record capabilities, and IT controls. Without model adoption, there is no business value, and that is where the combination of NVIDIA Nemotron's safety components with Domino's governance platform delivers, particularly as security emerged as GTC's dominant theme.
These insights reinforced a key finding: production agentic AI demands integrated platform infrastructure, not point solutions. This reference implementation demonstrates how organizations can move from agentic AI experiments to production systems with enterprise-grade governance built in from the start. The architecture patterns and governance frameworks showcased at GTC apply across agentic AI use cases, from customer service automation to supply chain optimization.
NVIDIA Nemotron provides a complete toolkit for building, evaluating, and safeguarding agentic AI systems. Each component addresses a specific challenge in the agentic AI lifecycle.
NVIDIA Nemotron models are designed specifically for agentic AI use cases. Unlike general-purpose language models, Nemotron includes optimizations for tool calling, structured output generation, and the extended context windows necessary for multi-step reasoning. The GTC demonstration used Nemotron Nano 8B to show these capabilities in practice. Organizations can run production agentic AI workloads entirely on infrastructure they control in Domino, maintaining complete data sovereignty while achieving performance comparable to larger cloud-only models.
NVIDIA provides layered safety controls that work together to prevent harmful outputs and keep agents within defined boundaries.
NVIDIA NeMo Guardrails provides the orchestration layer for enforcing safety policies across agentic AI systems. The framework combines deterministic pattern-based rules with LLM-based semantic checks, creating a flexible but reliable control plane. The Colang framework enables declarative safety rules that execute with negligible added latency for high-confidence controls, while LLM-based semantic checks handle ambiguous cases that fixed patterns cannot anticipate. Runtime orchestration manages the complete safety pipeline: input validation, model inference, output filtering, and logging.
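The four pipeline stages (input validation, model inference, output filtering, logging) and the split between deterministic and semantic checks can be sketched in plain Python. This is not the NeMo Guardrails API or Colang syntax. The deny patterns, the `Verdict` type, and the `semantic_check` and `model` callables are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical deny patterns; a real deployment would declare its own rules.
DENY_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
]

BLOCKED_INPUT = "Request blocked by safety policy."
BLOCKED_OUTPUT = "Response withheld by safety policy."


@dataclass
class Verdict:
    allowed: bool
    stage: str   # which layer made the decision: "pattern" or "semantic"
    reason: str


def pattern_check(text: str) -> Verdict:
    """Deterministic layer: cheap regex rules for high-confidence cases."""
    for pattern in DENY_PATTERNS:
        if pattern.search(text):
            return Verdict(False, "pattern", f"matched {pattern.pattern!r}")
    return Verdict(True, "pattern", "no deny pattern matched")


def layered_check(text: str, semantic_check: Callable[[str], Verdict]) -> Verdict:
    """Run the deterministic layer first; consult the semantic layer only if it passes."""
    verdict = pattern_check(text)
    if not verdict.allowed:
        return verdict
    return semantic_check(text)


def run_pipeline(
    user_input: str,
    model: Callable[[str], str],
    semantic_check: Callable[[str], Verdict],
    log: List[Tuple[str, Verdict]],
) -> str:
    # 1. Input validation
    verdict = layered_check(user_input, semantic_check)
    log.append(("input", verdict))          # 4. Logging happens at every decision
    if not verdict.allowed:
        return BLOCKED_INPUT
    # 2. Model inference
    output = model(user_input)
    # 3. Output filtering
    verdict = layered_check(output, semantic_check)
    log.append(("output", verdict))
    if not verdict.allowed:
        return BLOCKED_OUTPUT
    return output
```

Running the pattern layer first means obviously disallowed text never incurs the cost of a semantic check, which is one plausible reason to order the layers this way.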
NVIDIA Nemotron Safety Guard is a purpose-built safety classifier separate from the main inference model. This separation allows organizations to update safety policies without retraining production models, and safety checks run in parallel with minimal latency impact. The model covers content safety, jailbreak detection, and topic control across multiple languages, identifying unsafe requests before they reach the main agent and blocking harmful outputs before they reach users.
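One way to picture a safety check running in parallel with inference is the sketch below, which uses only the Python standard library. The `classify` and `generate` callables are hypothetical stand-ins for calls to a separately hosted safety classifier and the main model. Neither reflects an actual Safety Guard or Domino client API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def guarded_generate(
    prompt: str,
    classify: Callable[[str], bool],   # assumed to return True when the prompt is safe
    generate: Callable[[str], str],    # assumed to return the main model's answer
) -> Optional[str]:
    """Run the safety classifier and the main model concurrently.

    The answer is released only if the classifier judged the prompt safe;
    otherwise it is discarded and None is returned.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        safety_future = pool.submit(classify, prompt)
        answer_future = pool.submit(generate, prompt)
        is_safe = safety_future.result()
        answer = answer_future.result()
    return answer if is_safe else None
```

In this arrangement the added latency is roughly the difference between the slower and faster call rather than their sum. The trade-off is that the main model still processes an unsafe prompt before its answer is discarded, so an agent whose tool calls have side effects would more plausibly run the classifier first and only then invoke the model.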
NeMo Guardrails configurations and Safety Guard deploy as versioned artifacts in Domino environments. Teams can test safety controls in development, promote them through staging, and deploy to production with the same governance controls that apply to models themselves. Safety Guard runs as a separate Domino-hosted endpoint that scales independently from inference workloads.
Rigorous evaluation ensures agentic AI systems meet quality and security standards before production deployment.
NVIDIA NeMo Evaluator provides systematic evaluation across multiple dimensions: accuracy, completeness, actionability, and policy compliance. The service supports both automated benchmarks and LLM-as-judge techniques for open-ended agentic outputs where traditional metrics fall short.
NVIDIA Safety Auditor (garak) provides systematic adversarial testing with carefully crafted prompts across multiple risk categories including prompt injection, jailbreak techniques, unsafe content generation, and information disclosure. Automated vulnerability scanning runs the complete test suite and produces detailed reports showing which attacks succeeded, which were blocked, and where safety controls need strengthening.
In Domino, NeMo Evaluator runs as a Job, with results stored in experiment tracking. Quality scores are stored alongside agent configurations, and governance policies read them automatically during approval workflows. Safety Auditor runs as a scripted check within Domino governance workflows, with audit results attaching automatically to governance bundles as evidence and deployment gates enforcing minimum security standards before production.
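A deployment gate that reads quality scores and adversarial-test results might look like the following sketch. The dictionary shapes, metric names, and thresholds are assumptions chosen for illustration. They do not reflect the actual output formats of NeMo Evaluator or garak, or Domino's policy schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GateResult:
    passed: bool
    failures: List[str] = field(default_factory=list)


def check_gate(
    eval_scores: Dict[str, float],             # e.g. {"accuracy": 0.91}
    audit_results: Dict[str, Dict[str, int]],  # e.g. {"jailbreak": {"attempts": 50, "succeeded": 0}}
    min_scores: Dict[str, float],              # minimum acceptable score per metric
    max_attack_success_rate: float,            # highest tolerated attack success rate per category
) -> GateResult:
    failures: List[str] = []

    # Quality: every required metric must be present and meet its threshold.
    for metric, threshold in min_scores.items():
        score = eval_scores.get(metric)
        if score is None:
            failures.append(f"{metric}: missing score")
        elif score < threshold:
            failures.append(f"{metric}: {score:.2f} below minimum {threshold:.2f}")

    # Security: no attack category may succeed more often than the tolerated rate.
    for category, counts in audit_results.items():
        attempts = counts["attempts"]
        rate = counts["succeeded"] / attempts if attempts else 0.0
        if rate > max_attack_success_rate:
            failures.append(f"{category}: attack success rate {rate:.2%} exceeds limit")

    return GateResult(passed=not failures, failures=failures)
```

Returning the list of failures, rather than a bare boolean, gives reviewers the evidence for why a deployment was blocked.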
NVIDIA Nemotron delivers best-in-class agentic AI components for inference, safety, and evaluation. Domino provides the enterprise platform infrastructure that operationalizes these components at scale, turning AI primitives into governed, production-ready systems.
Agentic AI safety depends on consistent, repeatable evaluation. When safety benchmarks produce different results across runs, teams lose confidence in their controls. When evaluation environments drift between development and production, passing tests in staging doesn't guarantee production safety.
Domino solves this with versioned, containerized environments that preserve complete configuration. Software dependencies, package versions, hardware specifications, datasets, and ground truth labels remain identical across every run. This means NeMo Evaluator assessments, Safety Auditor scans, and guardrails tests produce consistent results whether executed today or months later for audit. Organizations can trust that development testing accurately predicts production behavior.
Domino Jobs orchestrate the compute-intensive work of safety validation at scale. Organizations can run safety evaluation suites in parallel across multiple configurations, schedule recurring audits before each deployment, and scale evaluation infrastructure independently from production inference. Automatic resource allocation ensures safety evaluation never competes with production workloads while maintaining cost control.
Traditional AI governance relies on documentation, manual reviews, and trust that teams followed procedures. This approach breaks down for agentic AI systems where safety requirements are complex, interdependent, and must be verified programmatically.
Domino governance encodes safety requirements as automated policies with risk-based controls. Not all agentic AI systems carry the same risk. Agents with PII access face stricter requirements than informational agents. Low-risk agents move quickly with automated safety checks and single approver sign-off. High-risk agents face rigorous controls including external security audit, red-team testing, and executive approval. Organizations define what evidence each stage requires, which checks run automatically, and what approvals gate deployment.
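The risk-based tiers described above can be expressed as data plus a small check, as in this sketch. The tier names, evidence labels, and the single `accesses_pii` rule are simplifications taken from the examples in the text. The `AgentProfile` and `POLICIES` structures are hypothetical and are not Domino's policy format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Set


class RiskTier(Enum):
    LOW = "low"
    HIGH = "high"


@dataclass(frozen=True)
class AgentProfile:
    name: str
    accesses_pii: bool


# What each tier requires before deployment (labels are illustrative).
POLICIES: Dict[RiskTier, Dict[str, Set[str]]] = {
    RiskTier.LOW: {
        "evidence": {"automated_safety_checks"},
        "approvals": {"single_approver"},
    },
    RiskTier.HIGH: {
        "evidence": {"automated_safety_checks", "external_security_audit", "red_team_testing"},
        "approvals": {"executive"},
    },
}


def classify(agent: AgentProfile) -> RiskTier:
    # Simplified rule: access to PII alone puts an agent in the high-risk tier.
    return RiskTier.HIGH if agent.accesses_pii else RiskTier.LOW


def outstanding(agent: AgentProfile, evidence: Set[str], approvals: Set[str]) -> Dict[str, List[str]]:
    """Return the evidence and approvals still missing for this agent's tier."""
    policy = POLICIES[classify(agent)]
    return {
        "evidence": sorted(policy["evidence"] - evidence),
        "approvals": sorted(policy["approvals"] - approvals),
    }


def ready_to_deploy(agent: AgentProfile, evidence: Set[str], approvals: Set[str]) -> bool:
    missing = outstanding(agent, evidence, approvals)
    return not missing["evidence"] and not missing["approvals"]
```

Expressing requirements as sets makes the gap between what a tier demands and what a team has supplied a simple set difference.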
Automated enforcement ensures safety requirements become enforceable policy, not recommendations. Scripted checks execute Safety Auditor scans and attach results as bundle evidence automatically. Metrics checks read NeMo Evaluator scores directly from registered model metadata without manual data entry. Deployment gates enforce approval requirements at infrastructure level: GPU endpoints require safety approval, production apps require deployment readiness approval, high-risk agents require executive sign-off.
Complete audit trails capture every decision with full provenance. Domino logs every experiment run, model registration, safety check result, and deployment event. Every NVIDIA model endpoint call, NeMo Guardrails evaluation, and Safety Guard classification flows to a central audit store. Regulators can trace any production decision back to the specific model version, safety configuration, and approval chain that authorized it, satisfying enterprise compliance requirements without manual record-keeping.
Production agentic AI systems require observability beyond traditional model monitoring. Teams need visibility into which tools agents called, what intermediate reasoning steps occurred, where guardrails triggered, and how often safety controls intervened.
Domino tracing instruments every agent action, tool call, and safety check. Guardrail trigger events are logged with full context about what was blocked and why. Aggregated dashboards surface patterns: the most common guardrail triggers, Safety Guard block rates, tool call frequency, and latency distribution. Alert workflows notify teams when safety metrics degrade.
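The kind of aggregation behind such dashboards and alerts can be sketched with the standard library. The event schema (`rule`, `blocked`, `latency_ms`) and the baseline-plus-tolerance alert rule are assumptions for illustration, not Domino's trace format or alerting logic.

```python
from collections import Counter
from statistics import median
from typing import Any, Dict, List, Optional


def summarize(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Aggregate safety-check events into dashboard-style metrics.

    Each event is assumed to look like:
    {"rule": "jailbreak" or None, "blocked": bool, "latency_ms": number}
    """
    total = len(events)
    blocked = [e for e in events if e["blocked"]]
    triggers = Counter(e["rule"] for e in blocked)
    median_latency: Optional[float] = (
        median(e["latency_ms"] for e in events) if events else None
    )
    return {
        "total": total,
        "block_rate": len(blocked) / total if total else 0.0,
        "top_triggers": triggers.most_common(3),
        "median_latency_ms": median_latency,
    }


def should_alert(summary: Dict[str, Any], baseline_block_rate: float, tolerance: float) -> bool:
    """Alert when the block rate drifts from its baseline by more than the tolerance.

    Drift in either direction is treated as notable: a spike may indicate an
    attack, and a drop may indicate that a control has stopped firing.
    """
    return abs(summary["block_rate"] - baseline_block_rate) > tolerance
```

A short usage example, with made-up events:

```python
events = [
    {"rule": "jailbreak", "blocked": True, "latency_ms": 40},
    {"rule": None, "blocked": False, "latency_ms": 20},
    {"rule": "pii", "blocked": True, "latency_ms": 30},
    {"rule": "jailbreak", "blocked": True, "latency_ms": 50},
]
summary = summarize(events)
alert = should_alert(summary, baseline_block_rate=0.25, tolerance=0.1)
```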
Beyond observability, production operations require secure, centralized infrastructure management. Domino provides centralized endpoint management with token refresh, access controls, and version tracking for hosting multiple NVIDIA model endpoints. Teams configure vLLM endpoints serving NVIDIA models once, set permissions by user group and project, and let Domino handle authentication refresh without manual key rotation. This eliminates the operational burden of managing model endpoint credentials across development, staging, and production environments.
Organizations successfully deploying agentic AI with NVIDIA Nemotron and Domino follow three core principles:
Safety is a system property, not a model property. No single safety component prevents all risks. Effective agentic AI safety comes from defense-in-depth architecture where multiple controls work together. NVIDIA Nemotron provides the safety building blocks: pattern-based guardrails, content safety classification, and systematic testing. Domino orchestrates these components into a complete safety system where guardrails execute in the correct order, Safety Guard runs in parallel with minimal latency, and production monitoring detects when controls degrade.
Governance must be automated and enforceable. Manual governance processes don't scale to the complexity and iteration speed of agentic AI development. NVIDIA provides the safety evaluation tools that produce verifiable evidence of safety rather than self-reported claims. Domino makes governance enforceable: scripted checks run automatically, deployment gates block non-compliant systems at infrastructure level, approval workflows capture evidence with full audit trails, and monitoring alerts teams when production behavior drifts from safety baselines.
Local deployment enables innovation without compromise. Organizations don't need to choose between AI capability and data sovereignty. NVIDIA Nemotron models deliver production-grade performance in form factors suitable for on-premises deployment, providing complete data sovereignty with no external API dependencies, no per-query costs regardless of usage volume, and freedom to fine-tune and customize models for specific domains. Domino manages local NVIDIA model deployments with versioned configurations, access controls, usage monitoring, and automated scaling.
The gap between agentic AI pilots and production deployments is operational readiness, not technical capability. NVIDIA Nemotron provides the AI components. Domino provides the platform that makes them production-ready with enforceable governance, reproducible evaluation, and complete observability.
The best way to understand how Domino operationalizes NVIDIA Nemotron for enterprise agentic AI is to see the integration working end-to-end: from agent development through safety validation, governance approval, and production deployment.
Request a demo to explore how NVIDIA's safety suite integrates with Domino's governance platform, see reference architectures for common agentic AI use cases, and discuss how this approach adapts to your organization's specific requirements and risk profile.

The author is the Product Marketing Director for Data Science, AI, and ML at Domino Data Lab, where she drives go-to-market strategy and technical content for the platform. Over six years at Domino, she has worked across training, sales engineering, product, and customer success, building a deep understanding of what it actually takes to deploy AI in regulated industries. Before entering tech, she was a neuroscientist turned data scientist.