Advice for Aspiring Chief Data Scientists: The Problems You Solve
By Nick Kolegraff | 2017-10-24 | 10 min read
Nick Kolegraff is the founder of Whiteboard, a strategic innovation company focused on machine learning and AI. Previously, Kolegraff was the Chief Data Scientist at Rackspace and a Principal Data Scientist at Accenture. As a part of Domino’s “Data Science Leaders at Work” guest blogger series, Kolegraff provides advice for data scientists and data science managers to consider when, or if, they decide to take a “chief data scientist” role. This advice includes insights on the mindset you need to have, the types of problems you need to solve, and the people you need to hire. There are three posts total. This second post focuses on the potential problems you need to solve.
The problems you solve
You can’t execute if you haven’t figured out what you are executing. That sounds pretty simple but is actually hard to do at scale. There are a lot of interconnected pieces. Understanding how those pieces map back into the company, and what they impact, is critical. Getting clarity around the problems you are solving will help you execute.
There are two things to consider when organizing around problems to solve:
- what type of intelligence capability you are solving for / building
- how you are going to execute it from whiteboard to production
Yet how you break down these two things and talk through the solutions needs to be easily understood by those not familiar with the industry. Getting a handle on the types of problems you are solving will help you organize and develop a plan for the types of people you need to hire (see next week's post) and map your execution back into the company's goals.
Types of Intelligence Capabilities
When I think of “intelligence”, I break it down into three major types:
- Suggestive: “Hey, here are some things to consider. What do you think?”
  - Suggesting things a user might like based on inputs
  - Search and Recommendation style problems
- Recognition: “Hey, what is this thing?”
  - Identifying things based on inputs
  - Image, Voice, Text, Pattern style problems
- Decisive Action: “Do this thing”
  - Telling a user what to do based on inputs
  - Autonomous Systems, Automated Game Players, Bots style problems
These types dictate how error-prone you can be with your results when creating quick wins, and they also help dictate strategy when starting from the grassroots. For example, you might use a suggestive approach out of the gate and then use its results to help with a recognition problem later down the road. Each of these problem buckets sets you up with questions to ask. The answers to those questions will define how you tackle the problem, develop your execution strategy, and, most importantly, communicate effectively what you are trying to do.
How You Execute
Once you’ve identified the type of problem, and realized that what I laid out earlier is a very macro picture, you’ll want to dig further into things like market potential, scope, feasibility, cost, etc. But all of this boils down to the objective, solution, and value of what you are creating. Answering the following questions will help develop an execution strategy that fits into the overall picture. Each of these questions is covered in more detail elsewhere in this post.
- Are you trying to solve a Suggestive, Recognition, or Decisive Action problem?
- Are you Discovering, Establishing, Optimizing, or Maintaining?
- Is this an Abstract or Puzzle problem?
- What is the Organizational Lifecycle and Workflow?
Are You Discovering, Establishing, Optimizing, or Maintaining?
Understanding the dynamic and balance between these four objectives will help set the tone and develop a natural way to set expectations for what you are trying to accomplish. You want people to be on the same page with what a system is trying to accomplish. But you also want your teams to work together in a cohesive way and ensure everyone understands that this dynamic is critical. These four different objectives can set the tone early on for what you need to do and who you need to do it.
- Discovering is when you are looking through mass amounts of data to determine insights that can help a business make better decisions. You are exploring existing data in new ways.
- Establishing is when you need to branch into new categories and dimensions of products, markets, and techniques in a way that redefines your current worldview. You have come to understand that your current worldview needs to change, and/or you want to hedge and work on it in case it is a thing later on.
- Optimizing is when you have an existing system and want to find ways to enhance it and make it better. Maybe make it run faster, maybe make the predictions better, or perhaps make it run more efficiently, so you can conserve infrastructure costs.
- Maintaining is when you have an existing system, are OK with how it is operating, and just want to keep it online and running smoothly in the most efficient way possible. You might also want to make minor changes to enhance the system while supporting it round the clock.
The skills and capabilities behind the four objectives are very different, but all are equally needed. They are also entirely dependent on whether you are solving an abstract problem or a puzzle problem. The beauty of organizing in this way is that people can serve rotations in each of these objectives to build a feel for what the others do and why it is important. You also are not limited in the problem types you can solve. I view them as different handoffs during the life cycle of developing something.
Implications of abstract and puzzle problems on execution
Abstract problems in business are ambiguous. As a Chief Data Scientist, you typically only have a few opportunities (and sometimes just one opportunity) to solve an abstract problem, and you are making decisions under extreme uncertainty. Many of the problems you deal with in business will fit this category. You have to know when data is helping you make these decisions and when it is providing noise to fuel opinions (see the Wikipedia article on wicked problems).
An example of an abstract problem in business:
“[insert name], we want you to add some intelligence capabilities to our product lines that will give us an advantage over our competitors. Can you come up with some options and get us a working prototype next quarter?”
Welcome to abstract problems.
This is all the direction you get. If you are successful, you are adding big value. If you aren’t successful, it's your fault and you need to figure out what you did wrong.
Puzzle problems are still complex problems; they are just a different kind of solve. All the pieces exist, and you just need to assemble them to accomplish your objective. When I say pieces, I’m referring to the resources you have at your disposal to solve the problem. This could be anything from stale technology sitting around to relationships you build with other departments.
Organizational Lifecycle and Workflow
Organizational lifecycle and workflow are critical to how you execute (see the Wikipedia article on organizational life cycle). Current convention in business strategy will teach you how to play politics and influence the behavior of others around you to accomplish your agenda. I personally think this is b******t. Let people be who they are. Let them make decisions for themselves. At the end of the day, there are two human beings: one who wants X and another who wants Y. Put X and Y on the table and be honest about it. You will disagree. Find a Z that you can both agree on. Let your results speak, keep improving toward unquestionable results, and they will always win. But remember, a company is only as good as its people. If there isn’t cohesion among them, working toward a common goal, the system will break. Lifecycle and workflow go hand in hand when creating cohesion to execute effectively.
Workflow is how the company operates to maintain levels of quality and consistency. Companies have certain processes in place to control these things. Sometimes those processes don’t make sense and are cluttered with bureaucracy. This really starts to matter when you are establishing or optimizing. But recognize the importance of consistency and quality when it matters. When you are establishing, consistency and quality are less of a factor, provided you have set the proper expectations; you are just tracking toward making something ‘work’ and proving it is a ‘viable’ thing. Once you have something established, you move toward optimizing it, and this is when consistency and quality become imperative measures. If you are trying to establish something new and the company has a very waterfall mindset disguised as an agile framework, you are going to have a hard time with a Kanban approach to getting work done. Bottom line: it is your job to create an environment your teams can succeed in, and there is no silver bullet for this, just amazing people.
Domino Note: Nick Kolegraff’s final post in the “Advice for Aspiring Chief Data Scientists” series will be live on the Domino Data Science blog next week. The final post will cover the people you need to hire. If you missed Kolegraff’s first blog post that covers the mindset you need to have, visit here.
Nick Kolegraff is the founder of True Footage and Whiteboard, a strategic innovation company focused on machine learning and AI. Previously, Kolegraff was the Chief Data Scientist at Rackspace and a Principal Data Scientist at Accenture.