3 Companies, 3 Ways to Structure Data Science
Domino Data Lab | 2021-03-24 | 8 min read
If there’s one thing that data science leaders likely agree on, it’s that there’s no “right” way to organize data science teams as you build out your enterprise strategy to accelerate model velocity.
Each model—centralized, distributed (also called federated) or hub-and-spoke—has its pros and cons, and companies have found success developing enterprise-grade data science capabilities using each approach.
The key, it turns out, is choosing the right model for your organization.
Data science leaders Matt Cornett (from Transamerica), Patrick Harrison (from a global financial intelligence company), and Brian Loyal (from Bayer Crop Science) offer insight into how different factors play into the decision of what organizational structure to use and when as you build out an enterprise data science strategy. Their comments were part of a webinar on Best Practices for Driving Outcomes with Data Science and are paraphrased below.
Typical data science organizational structures
Matt, Patrick and Brian highlighted different organizational structures, each of which they say has its pros and cons.
- The centralized model: Matt Cornett, who serves as Director of Data Science at Transamerica, leads a centralized team that reports to the company’s Chief Administrative Officer. Centralized teams like Transamerica’s bring the function together under a single leader to oversee talent, infrastructure and tooling, best practices, and knowledge sharing. One challenge these teams can face is the potential perception that they’re disconnected from business priorities. To combat this, Matt advises centralized teams to look for ways to stay as connected to the business as possible. For example, engagement managers on his team actively work to build relationships with business staff on the frontlines and expand their domain knowledge as they develop solutions.
- The distributed or federated model: Patrick Harrison, director of AI engineering at a global financial intelligence company, described a distributed team structure (sometimes called a federated model). Here, there’s no central data science team. Data scientists instead report directly to different business leaders across the enterprise. Federated teams typically have strong alignment with business processes and priorities and can more easily build deep domain expertise. But, as Patrick pointed out, data scientists can often feel isolated from their data science colleagues. It’s also more likely for data scientists working in different parts of the business to duplicate efforts as they work in isolation to solve similar problems. To combat these challenges, Patrick says, federated teams need to develop a data science support network and centralize work on a data science platform to connect data scientists across the organization and enable them to discuss common interests and challenges.
- The hybrid or hub-and-spoke model: Brian Loyal, Cloud Analytics Lead at Bayer Crop Science, discussed his company’s use of a hybrid model. Here, a Center of Excellence is responsible for establishing best practices, building a data science culture throughout the organization, and addressing cross-organizational challenges. Simultaneously, data science teams within different business lines work on specific operational challenges. As Brian pointed out, this approach can offer the best of both the centralized and federated approaches but requires significant coordination and agreement from all participants on what each area’s role should be.
How do organizations choose their structure?
Matt, Patrick and Brian shared seven key factors that they believe can help guide companies as they determine the right path. It’s important to note: leaders should consider these factors in totality. There’s as much art as science in the process and sometimes different factors will point leaders in different directions, making it necessary to carefully weigh the pros and cons of each choice.
We’ve broadly placed the different factors in two main categories:
Business variables
Factors such as number of employees, number of data scientists, and current analytics maturity weigh heavily in the choice of team structure. For example, Transamerica’s Matt Cornett recommends that organizations that are early in their analytics journey and have a growing team of data scientists consider centralizing data science at first, setting up infrastructure, peer review, and model governance practices before moving to a more federated model.
He also recommends taking into consideration current IT capabilities to help bring models to production. For example, in cases where IT doesn’t have the bandwidth to fully support data science, having a centralized team that can take on this role is critical.
Data science competencies
These include core mission, types of models under development, and innovation objective.
- Core mission. For example, Brian Loyal recommends that companies with a single mission for data science, such as optimizing a flagship product, consider a centralized team so data scientists are working in lockstep. However, if a company has a broader vision for data science, with different goals across different channels and divisions, it may make more sense to federate teams so data scientists are more closely aligned with each goal.
- Types of models under development. If data scientists will primarily build decision support models, develop dashboards, produce insights, and the like, Patrick Harrison advises using a distributed model to keep teams close to the business they’re supporting. However, in cases where the main focus is on embedding machine learning and IT into products and services, he suggests creating fewer, more centralized teams that have the resources and mandate to tackle these often cross-functional, large-scale projects.
- Innovation objective. Bayer Crop Science chose a hub-and-spoke model to advance work around computer vision tools for agriculture. Its Center of Excellence drives best practices in computer vision, which distributed teams can then apply to specific business problems.
As these leaders show, there are many factors in play, and flexibility is a must—both as you choose your initial path and over time as your organization’s use of data science matures. Taking a step back to think through your company’s current position and your enterprise strategy for data science, however, can help you better pick the organizational structure that will successfully advance your strategy.
Domino powers model-driven businesses with its leading Enterprise MLOps platform that accelerates the development and deployment of data science work while increasing collaboration and governance. More than 20 percent of the Fortune 100 count on Domino to help scale data science, turning it into a competitive advantage. Founded in 2013, Domino is backed by Sequoia Capital and other leading investors.