The Role of Model Governance in Machine Learning and Artificial Intelligence
By David Weedmark | 2021-08-06 | 8 min read
In the world of machine learning (ML) and artificial intelligence (AI), governance is a lifelong pursuit. All models require testing and auditing throughout their deployment and, because models are continually learning, there is always an element of risk that they will drift from their original standards. As such, model governance needs to be applied to each model for as long as it’s being used.
What Is Model Governance?
Model governance is a framework that determines how a company implements policies, controls access to models and tracks their activity. It’s similar to corporate governance, except that it applies specifically to ML/AI models and should be viewed as a subset of model risk management. Its purpose is to improve efficiency, mitigate risk, and reduce or eliminate the financial impact of any hazards that occur.
The primary focus of model governance involves tracking, testing and auditing. This includes:
- Model lineage, from data acquisition to model building
- Model versions in production, as they are updated based on new data
- Model health in production, tracked through model monitoring
- Model usage and basic functionality in production
- Model costs
Model governance not only reduces risk but also helps achieve fundamental business goals like production efficiency and profitability. Tracking model lineage, for example, lets a data science team reproduce models quickly and access the same data again later if needed. It also provides important information for audits and a record of who was involved in each stage of development, testing or deployment.
Purpose of Model Governance and Why It’s Important
There are two sides to model governance: risks and rewards. Both sides require that the company build visibility into the model process, so that it can track progress on model development, deployment and monitoring and notify the appropriate people when there is a deviation.
On the risk side, there are two elements to be concerned with. The first is the data the model uses. Confidential data needs to be kept secure and must not be misused. Misuse could violate the law, and in many business sectors, including finance and healthcare, compliance certifications mandate how data can be used. A company’s own policies and legal contracts may further restrict the use of data.
A second area of risk comes with the models themselves. Because AI/ML models are designed to learn and evolve, they can learn things they weren’t supposed to learn and develop inaccuracies, or biases, that will taint their decisions. Imagine, for example, a model designed to interpret MRI scans that learns to disregard anatomical anomalies, or a financial analysis model that doesn’t update current interest rates.
On the reward side, good governance can increase efficiency in the development, testing and deployment of models, freeing resources and revealing new opportunities to explore. In many companies, for example, data scientists tend to stay with a project long after testing and work on deployment themselves, rather than handing the workload off to developers or engineers. Governance can ensure that they provide adequate documentation, so that the handoff can actually happen.
How Model Governance Works
Before a model is mapped out or data sources identified, governance should be established for the project. This includes identifying who will be working on the project, what their roles will be, and what technical resources they need to access.
Unlike other types of governance, which can be set by an executive or board of directors, model governance needs to be a team effort. Representatives from each data science team should be involved, as well as key stakeholders:
- Legal: Someone from the legal team to establish governance guidelines for government or regulatory requirements, as well as for any aspects that fall within corporate governance, such as data usage requirements, nondisclosures, etc.
- Management: To oversee the efficient use of resources, including human resources, and ensure they are allocated well.
- Team Leaders: A representative from each team, such as a lead data scientist, engineer, developer or analyst. They are in the best position to ensure that procedures are being followed correctly, that monitoring is being done regularly, and that the teams are collaborating efficiently.
- Business unit: A representative from the business unit should be involved to ensure that business goals are being met, clients are being well served, costs are within expected boundaries, and unexplored business opportunities are addressed.
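Deciding up front who holds which role, and what each role may access, can be written down in a form that is easy to review. The sketch below is only an illustration: the role names, resource names and the `can_access` helper are all hypothetical, and a real project would rely on the access controls of its own platform.

```python
# Hypothetical mapping of project roles to the resources each may access.
# None of these names come from a real platform; they are placeholders.
PROJECT_ACCESS = {
    "lead_data_scientist": {"training_data", "model_registry", "experiment_logs"},
    "engineer": {"model_registry", "deployment_pipeline"},
    "legal": {"audit_log"},
    "business_unit": {"cost_reports", "audit_log"},
}


def can_access(role: str, resource: str) -> bool:
    """Return True only if the role was explicitly granted the resource."""
    # Unknown roles get an empty set, so access is denied by default.
    return resource in PROJECT_ACCESS.get(role, set())


print(can_access("engineer", "deployment_pipeline"))
print(can_access("legal", "training_data"))
```

Denying by default means that a role nobody thought to define gets no access until the governance group grants it.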
As the project begins, all sources of data should be logged. If a dataset is cleaned or otherwise modified, it should be given a version number. The same applies to the algorithms used. Each variation of a model should also be logged and saved so that it can be reused if needed. Fortunately, if you are using an enterprise MLOps platform, this is all done automatically, including a record of the specific tools that were used to develop a model and the people who did the work.
When a model is deployed to a production environment, model governance ensures that it is properly audited and tested to confirm that it’s operating as quickly and as accurately as expected, and that it’s not experiencing drift or otherwise underperforming. If the model does experience problems, it can be replaced with the last successful version that was deployed.
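For teams doing this by hand, the logging described above might look something like the following. This is a minimal sketch, not how any particular MLOps platform works: `LineageLog`, its fields and the sample data are hypothetical, and a content hash is assumed as the way to tell whether a dataset has changed.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LineageLog:
    """Hypothetical in-memory log of dataset versions (illustration only)."""
    entries: list = field(default_factory=list)

    def register(self, name: str, content: bytes, author: str, note: str = "") -> dict:
        digest = hashlib.sha256(content).hexdigest()
        # If this exact content was already logged, return the existing entry.
        for entry in self.entries:
            if entry["name"] == name and entry["sha256"] == digest:
                return entry
        # Otherwise assign the next version number for this dataset name.
        version = 1 + sum(1 for e in self.entries if e["name"] == name)
        entry = {
            "name": name,
            "version": version,
            "sha256": digest,
            "author": author,
            "note": note,
            "logged_at": datetime.now(timezone.utc).isoformat(),
        }
        self.entries.append(entry)
        return entry


log = LineageLog()
raw = log.register("claims", b"id,amount\n1,100\n2,\n", author="alice", note="raw export")
clean = log.register("claims", b"id,amount\n1,100\n", author="bob",
                     note="dropped rows with a missing amount")
print(raw["version"], clean["version"])
```

Because each entry records who made the change and why, the same log can answer an auditor's questions as well as a data scientist's.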
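Choosing the version to fall back on can be sketched as below, assuming each deployment is recorded with a flag saying whether it passed its audit. The `Deployment` record and the `last_successful` helper are hypothetical names used for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Deployment:
    version: str
    passed_audit: bool  # True if the version met its speed and accuracy checks


def last_successful(history: List[Deployment], failing: str) -> Optional[Deployment]:
    """Return the most recent deployment before `failing` that passed its audit.

    `history` is assumed to be ordered oldest first. Raises ValueError if
    `failing` is not in the history; returns None if nothing earlier passed.
    """
    versions = [d.version for d in history]
    cutoff = versions.index(failing)
    # Walk backwards from the failing version to the nearest audited-OK one.
    for deployment in reversed(history[:cutoff]):
        if deployment.passed_audit:
            return deployment
    return None


history = [
    Deployment("v1", True),
    Deployment("v2", True),
    Deployment("v3", False),
    Deployment("v4", False),
]
rollback_target = last_successful(history, failing="v4")
print(rollback_target)
```

Note that the helper skips over v3 as well: the version immediately before the failing one is not necessarily a safe target.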
Once a model is out in the real world, being used by the people it was designed to serve, it should be routinely monitored. A model experiencing drift, for example, may put out more data or less data than predicted, or it may overutilize CPU or GPU resources. Domino’s Enterprise MLOps Platform does this automatically and will alert a team leader when a model fails to meet, or exceeds, pre-set parameters.
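The idea of pre-set parameters can be illustrated with a simple bounds check. This is a sketch of the general technique only, not the Domino platform's API: the metric names, the limits and the `check_metrics` function are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Bounds:
    low: float
    high: float


# Hypothetical pre-set parameters; real limits would come from the governance team.
BOUNDS = {
    "predictions_per_minute": Bounds(50, 500),
    "gpu_utilization": Bounds(0.0, 0.85),
}


def check_metrics(metrics: dict) -> list:
    """Return one alert message for each metric that is missing or out of bounds."""
    alerts = []
    for name, bounds in BOUNDS.items():
        value = metrics.get(name)
        if value is None:
            alerts.append(f"{name}: no reading received")
        elif value < bounds.low:
            alerts.append(f"{name}: {value} is below the minimum {bounds.low}")
        elif value > bounds.high:
            alerts.append(f"{name}: {value} is above the maximum {bounds.high}")
    return alerts


alerts = check_metrics({"predictions_per_minute": 620, "gpu_utilization": 0.4})
for message in alerts:
    print(message)
```

In practice the returned messages would be routed to a team leader by email or chat rather than printed, and a missing reading is treated as an alert in its own right.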
Model Governance in ML and AI
Model governance is much easier and much more effective when it’s developed and managed within an enterprise MLOps platform. Domino’s MLOps data science platform begins with enterprise-grade security and credential propagation to control machine learning model access and to protect valuable data. It also provides a fully auditable environment, allowing key stakeholders to track projects back in time through a secure work history that meets any regulatory compliance requirements.
If your company has yet to apply model governance to your data science projects, or if you’re looking for ways to build governance systematically into future projects, take a few minutes to watch a demo of the Domino Enterprise MLOps Platform in action, or explore its governance tools for yourself with a free trial.
David Weedmark is a published author who has worked as a project manager, software developer and network security consultant.