Accelerate AI Development with Domino Feature Store
Tim Law | 2023-10-31 | 6 min read
Feature engineering is arguably one of the most critical and resource-intensive steps in the machine learning process. It can be plagued by inconsistent feature definitions, redundant feature building, lack of reusability, and difficulty deploying features in production. This adds time and cost to model development and serving.
Feature stores are an important innovation that addresses these issues. They have quickly become a central architectural component of the data fabric that supports both the development and operationalization of machine learning and deep learning models. Feature stores enable sharing and reuse among data science teams and deliver other important benefits, including faster model development and inference, feature consistency, versioning, and governance.
Reuse and Share Features
A feature store’s primary use is to serve features to models, either in real time during online prediction or offline during batch prediction or training. After validated raw data is pre-processed and converted into suitable features, a feature store can make those features available to both training and inference pipelines.
Domino’s new feature store is now generally available and enables data scientists to securely create, publish, and share features so they can be discovered and reused. The Domino feature store acts as a central hub for feature data and metadata across an ML project’s life cycle, including:
- Model iteration, training, and debugging
- Feature discovery and sharing
- Production serving for inference
It delivers feature cataloging, search, and reuse by integrating and building on the Feast open-source feature store library. Feast is a code-first feature store: feature definitions and the entities they describe are declared in Python files, which serve as the source of truth. Domino has integrated Feast into the core platform and added functionality for easier installation, cataloging, and search. It offers a streamlined experience for reviewing features and for sharing feature definitions and feature development code among users. It stores all the feature definition code in a Git repository.
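To make the code-first idea concrete, here is a minimal sketch of feature definitions declared in Python and published to a searchable catalog. It uses only the standard library and is not the Feast or Domino API; `FeatureDefinition`, `FeatureCatalog`, and the example feature names are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureDefinition:
    """Hypothetical stand-in for a feature definition kept in a Python file."""
    name: str
    entity: str        # key the feature is joined on, e.g. "customer_id"
    dtype: str
    description: str
    tags: tuple = ()


class FeatureCatalog:
    """Illustrative in-memory catalog; not Domino's or Feast's API."""

    def __init__(self):
        self._features = {}

    def publish(self, definition):
        # Refuse to silently replace a definition with a different one.
        existing = self._features.get(definition.name)
        if existing is not None and existing != definition:
            raise ValueError(f"conflicting definition for {definition.name!r}")
        self._features[definition.name] = definition

    def search(self, term):
        # Case-insensitive match on name, description, or tags.
        term = term.lower()
        return sorted(
            d.name
            for d in self._features.values()
            if term in d.name.lower()
            or term in d.description.lower()
            or any(term in t.lower() for t in d.tags)
        )


catalog = FeatureCatalog()
catalog.publish(FeatureDefinition(
    "avg_order_value_30d", "customer_id", "float",
    "Mean order value over the last 30 days", ("orders",)))
catalog.publish(FeatureDefinition(
    "days_since_last_login", "customer_id", "int",
    "Days since the most recent login", ("engagement",)))

print(catalog.search("order"))
```

Because the definitions are ordinary Python objects, they can live in a Git repository and be reviewed like any other code, which is the property the code-first approach relies on.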
Features computed in an existing feature engineering pipeline can be captured in the offline store and easily materialized in the online store. All credentialed users on a project can define and publish features for other Domino users to discover.
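The relationship between the offline and online stores can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not how Domino or Feast implements materialization: the row layout, the `materialize` and `get_online_features` helpers, and the sample values are all hypothetical. The idea shown is that the offline store keeps the full history, while the online store keeps only the latest value per entity for fast lookup at inference time.

```python
from datetime import datetime

# Offline store: full history of computed feature rows (hypothetical data).
offline_rows = [
    {"customer_id": 1, "event_time": datetime(2023, 10, 1), "avg_order_value_30d": 40.0},
    {"customer_id": 1, "event_time": datetime(2023, 10, 20), "avg_order_value_30d": 55.0},
    {"customer_id": 2, "event_time": datetime(2023, 10, 5), "avg_order_value_30d": 12.5},
]


def materialize(rows, entity_key, start, end):
    """Copy the latest row per entity within [start, end] into an online store."""
    online = {}
    for row in rows:
        if not (start <= row["event_time"] <= end):
            continue
        key = row[entity_key]
        current = online.get(key)
        if current is None or row["event_time"] > current["event_time"]:
            online[key] = row
    return online


def get_online_features(store, entity_id, feature_names):
    """Look up the current feature values for one entity; None if unknown."""
    row = store.get(entity_id, {})
    return {name: row.get(name) for name in feature_names}


online_store = materialize(
    offline_rows, "customer_id", datetime(2023, 10, 1), datetime(2023, 10, 31)
)
print(get_online_features(online_store, 1, ["avg_order_value_30d"]))
```

Training jobs would read from the historical rows, while a serving endpoint would call something like `get_online_features` with the entity key of the incoming request.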
Faster Development and Deployment
Feature stores can speed model development by enabling the sharing and reuse of features while also streamlining model deployments into production. They can also speed feature engineering and reduce redundant feature building by different departments.
Once features are published to the Domino Feature Store, they can be readily retrieved and provided to ML/DL models for training or serving. Domino’s feature store empowers users to share features across different machine-learning pipelines, speeding up the development of new models. It enables both reuse and collaboration among data scientists in the model development stage.
Because a feature store creates a single repository of semantically defined features, data scientists can ship those features into production without requiring engineering, making them available for training and serving models. This eliminates friction in deploying models to production at a technical level and also enables better collaboration between data scientists and machine learning engineers. Finally, the feature store also ensures that features are always available in production.
Governance and Consistency
Often, different departments across an enterprise define the same feature in different ways. The Domino Feature Store can ensure feature consistency across the enterprise, so that the correct feature data is used and carries a common definition throughout the organization. Adopting a single enterprise feature store helps to standardize feature definitions.
The feature store provides a structured way to manage and enforce data governance policies. This includes defining who can access and modify features, ensuring compliance with regulations, and maintaining data quality and feature consistency across the entire lifecycle. It ensures that features are consistent and up-to-date across all models and applications that use them. This helps prevent feature inconsistencies between training and serving, which can otherwise degrade model performance.
The feature store also includes versioning capabilities, allowing you to track feature changes over time. This is essential for auditing, debugging, and ensuring reproducibility in machine learning models.
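As a rough illustration of what tracking feature changes over time can look like, the sketch below keeps a version history per feature and assigns a new version only when the definition actually changes. It is an assumption-laden example, not the Domino implementation: since the definition code is stored in a Git repository, versioning in practice presumably builds on Git history. `VersionedFeatureRegistry` and the `window_days` parameter are hypothetical.

```python
import hashlib
import json


class VersionedFeatureRegistry:
    """Illustrative version history for feature definitions (hypothetical)."""

    def __init__(self):
        self._history = {}

    def register(self, name, definition):
        # Hash a canonical JSON form so identical definitions get the same digest.
        payload = json.dumps(definition, sort_keys=True)
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        versions = self._history.setdefault(name, [])
        if versions and versions[-1]["digest"] == digest:
            return versions[-1]["version"]  # unchanged: keep the current version
        version = len(versions) + 1
        versions.append({"version": version, "digest": digest, "definition": definition})
        return version

    def get(self, name, version=None):
        # Return the latest definition, or a specific earlier version for audits.
        versions = self._history[name]
        if version is None:
            return versions[-1]["definition"]
        return versions[version - 1]["definition"]


registry = VersionedFeatureRegistry()
v1 = registry.register("avg_order_value", {"window_days": 30})
v1_again = registry.register("avg_order_value", {"window_days": 30})
v2 = registry.register("avg_order_value", {"window_days": 7})
print(v1, v1_again, v2)
```

Pinning a model to a specific version, for example `registry.get("avg_order_value", version=1)`, is what makes a past training run reproducible after the definition has moved on.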
The Domino Feature Store gives IT an additional tool for managing pipelines and ensuring security. For data science leaders, it can provide important insights into feature usage and data quality, as well as a tool to document and share best practices for feature development.
And finally, another Domino platform feature further enhances governance. The Domino Data Source Audit Trail ensures that all actions and actors on data access and inputs into feature engineering are logged. That means that all the input data and all actions on the data can be traced back to the source data.
By reducing redundancy, ensuring seamless governance, and improving efficiency, feature stores can lead to cost savings in terms of both time and resources. They’ve become an indispensable component in the data fabric for data science. Learn more about the Domino Feature Store.
Tim helps translate AI and machine learning technology into business and social value. In a 20+ year career in data analytics and AI, Tim has led technology marketing initiatives for companies of all sizes, from startups to tech giants.