User stories: how Domino helps a data scientist create "unicorn-level deliverables"
By Nick Elprin | 2014-10-28 | 4 min read
We asked our users to tell us stories about how they're using Domino. This is what we heard from Laura Lorenz, a Data Scientist at StockUp.
I recently read "The 22 Skills of a Data Scientist" and I remember thinking that such an incredibly diverse skill set must exist only in a mythological unicorn, not in a person living and working in reality. However, many of the tasks that hit my JIRA board require just such a mythological unicorn, as impossible as it may be.
Since I am not such a unicorn, I have found it necessary to utilize services and frameworks that help me with the heavy lifting. Domino provides a growing set of tools that data scientists need to execute solutions to cross-discipline data problems. One that I have found extremely helpful is Launchers, which I will illustrate with two examples from my own experience.
Self-service, on-demand report generation
One of my jobs at StockUp is to provide intermittent data pulls and summary statistics for internal reporting or for external clients. These reports usually come in as one-time requests. Once in a while, however, I get related downstream requests for updates or for similar reports for a new department or client. In one case, I was in the middle of an intense sprint to finish a big web app when requests for regular updates to a data pull kept interrupting it because they were more urgent. This sort of disruption and distraction can be devastating to productivity.
Django is already a great framework for rapidly pulling together Python-based web applications. I initially considered adding an interface to our larger Django project so that stakeholders could run the report script on their own whenever their parameters changed. However, the work in our sprint wouldn't be deployed for another two weeks, and I'd need to define a form, display it in a template, and move my script into a view for them to interact with it.
The Launchers feature in Domino took care of all of that for me. All I had to do was upload my unchanged code to my Domino project and build a Launcher pointed at my script, with my desired parameters exposed. Domino handled all the front-end development I would otherwise have built in Django, as well as job queuing on its servers and emailing the results. Launchers also have a seamless interface that my stakeholders love.
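As an illustration of the kind of script that could sit behind such a form, here is a minimal sketch of a report script that exposes its parameters as command-line arguments. It assumes the parameters reach the script as ordinary command-line arguments; the argument names (`--client`, `--min-amount`), the `summarize` helper, and the inline sample data are all hypothetical stand-ins, not StockUp's actual report or Domino's API.

```python
import argparse
import csv
import io
import statistics

# Hypothetical sample data standing in for a real data pull.
SAMPLE_CSV = """client,amount
acme,10.0
acme,30.0
globex,5.0
"""


def parse_args(argv=None):
    """Parse the parameters a stakeholder would fill in (names are illustrative)."""
    parser = argparse.ArgumentParser(description="Summary report for one client")
    parser.add_argument("--client", default="acme", help="client to report on")
    parser.add_argument("--min-amount", type=float, default=0.0,
                        help="ignore rows with an amount below this value")
    return parser.parse_args(argv)


def summarize(rows, client, min_amount):
    """Return count, total, and mean of the matching amounts."""
    amounts = [
        float(row["amount"])
        for row in rows
        if row["client"] == client and float(row["amount"]) >= min_amount
    ]
    if not amounts:
        return {"client": client, "count": 0, "total": 0.0, "mean": None}
    return {
        "client": client,
        "count": len(amounts),
        "total": sum(amounts),
        "mean": statistics.mean(amounts),
    }


def main(argv=None):
    args = parse_args(argv)
    rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
    report = summarize(rows, args.client, args.min_amount)
    print(report)
    return report


if __name__ == "__main__":
    main()
```

Because every input arrives through `parse_args`, the same file can be run by hand (for example with `--client globex --min-amount 1`) or by whatever tool fills in those arguments on a stakeholder's behalf, which is what lets the script stay unchanged.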
Computationally intensive reports
Another benefit of Launchers is the ability to vary the underlying hardware that runs your script. I received another request to build a self-service reporting tool for a report that was particularly computationally intensive. Similar to the first case, I could plug in my script virtually unchanged, specify that this new Launcher should use high-power hardware (32 cores, 60GB of RAM), and I was all set.
Domino takes all the web development out of deploying simple interfaces to non-technical stakeholders for the scripts you don't want to be constantly re-running and delivering. You can convert your one-off scripts into web forms and productionize your data pulls so that non-technical stakeholders can run them by themselves, and you can deliver this much faster than you could by deploying your own web framework. Similarly, you can vary your hardware tier and use more powerful servers than you may have access to internally. Launchers, and other data science tools from Domino, can turn otherwise mythical projects into something that can actually be accomplished, often simply and quickly.
Nick Elprin is the CEO and co-founder of Domino Data Lab, provider of the open data science platform that powers model-driven enterprises such as Allstate, Bristol Myers Squibb, Dell and Lockheed Martin. Before starting Domino, Nick built tools for quantitative researchers at Bridgewater, one of the world's largest hedge funds. He has over a decade of experience working with data scientists at advanced enterprises. He holds a BA and MS in computer science from Harvard.