© 2026 Domino Data Lab, Inc. Made in San Francisco.

  • Ground Truth

    What is ground truth?

    In machine learning, the term ground truth refers to the reality you want to model with your supervised machine learning algorithm. Ground truth is also known as the target: the label used to train or validate the model on a labeled dataset. During inference, a classification model predicts a label, which can be compared with the ground truth label if one is available.
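
    The comparison between predicted labels and ground truth labels can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed evaluation method: the function name `accuracy` and the example labels are assumptions made up for this sketch.

```python
def accuracy(predicted, ground_truth):
    """Return the fraction of predictions that match the ground truth labels."""
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    if not ground_truth:
        raise ValueError("need at least one labeled example")
    matches = sum(p == t for p, t in zip(predicted, ground_truth))
    return matches / len(ground_truth)


# Hypothetical labels, invented for illustration only.
ground_truth = ["spam", "ham", "spam", "ham"]
predicted = ["spam", "ham", "ham", "ham"]

score = accuracy(predicted, ground_truth)
print(score)  # 3 of 4 predictions match, so this prints 0.75
```

    Accuracy is only one possible comparison; the same pairing of predictions with ground truth labels underlies other evaluation metrics as well.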

    Developing ground truth datasets often requires major tasks such as model design, data labeling, classifier design, and training and testing. Ground truth labels are mostly annotated manually by a group of annotators, and the annotations are then compared using different techniques to set the target labels for the dataset. Larger annotated datasets give supervised learning and deep learning algorithms more varied examples, which helps them learn better patterns.
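
    One common way to reconcile labels from several annotators is a majority vote. The sketch below assumes that technique; the text above does not say which technique is used, and the item IDs, labels, and the `majority_label` helper are hypothetical.

```python
from collections import Counter


def majority_label(votes):
    """Return the most common label, or None when the top labels tie."""
    counts = Counter(votes).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: leave unresolved so a human can review it
    return counts[0][0]


# Hypothetical annotations: each item was labeled by several annotators.
annotations = {
    "img_001": ["cat", "cat", "dog"],
    "img_002": ["dog", "cat"],
}

targets = {item: majority_label(votes) for item, votes in annotations.items()}
print(targets)  # {'img_001': 'cat', 'img_002': None}
```

    Returning `None` on a tie is a design choice for this sketch: it keeps disputed items out of the target labels until someone resolves the disagreement.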

    Defining a goal with your model

    It is the responsibility of humans to define the objective that the machine learning algorithm learns from the ground truth. In machine learning, the objective is always subjective. Decision-makers often disagree when setting it, because in most cases there are no hard-and-fast rules that define the objective or the ground truth label for every situation.

    All the individual attributes that can influence the predefined objective or target label are chosen as the feature set of the dataset. It is important to ensure that none of these features causes data leakage. Data leakage happens when a model learns a relationship between its target and some data that would not normally be available during inference. It can result in a model that performs very well on the training and validation data but fails on real-world data.
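
    A simple guard against one kind of leakage is to check that every training feature will also exist at inference time. The sketch below assumes you can list both sets of feature names; the feature names and the `unavailable_at_inference` helper are hypothetical, and the check does not catch subtler leakage such as features computed from the target.

```python
def unavailable_at_inference(training_features, inference_features):
    """Return training features that are missing at inference, sorted by name."""
    return sorted(set(training_features) - set(inference_features))


# Hypothetical feature names. "refund_issued" is only known after the
# outcome being predicted, so it would not exist at inference time.
training_features = ["account_age_days", "num_purchases", "refund_issued"]
inference_features = ["account_age_days", "num_purchases"]

leaky = unavailable_at_inference(training_features, inference_features)
print(leaky)  # ['refund_issued']
```

    Any feature flagged this way would be a candidate to drop before training.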

    Labeling ground truth data

    Once the training objectives are clearly defined, you need to get your data labeled accordingly. Several third-party platforms provide data labeling services. Labelbox allows users to invite team members and collaborate over workflows, and to import and export several different kinds of annotation formats. Other popular platforms include Scale AI and Clarifai, which are used for labeling computer vision, NLP, and audio data.
