To Jupyter and Beyond
By Nick Elprin | 2015-09-01 | 4 min read
TL;DR: Domino now supports Jupyter with R, Python, and Julia kernels, as well as terminal access. It renders ipynb files in the browser, so you can more easily share, compare, and discuss notebooks. It also lets you run or schedule notebooks as batch jobs, which makes notebooks a great reporting tool.
Despite the increasing popularity of tools for interactive, exploratory analysis, we frequently hear data scientists lament how time-consuming and annoying it can be to install and configure them, and how they don't have good tools to share and collaborate on notebooks. From day one, we've been building Domino to address both of these problems: making it easy for data scientists to "just get up and running," while facilitating collaboration and sharing among teams.
To that end, our latest release includes “one-click” access to three great tools for interactive data science work: Jupyter Notebook, which provides access to R, Python, Julia and a shell; Rodeo, a new IDE for interactive analysis in Python; and Beaker Notebooks, a powerful multi-language notebook platform.
Over the next few weeks, I’ll describe each of these in more detail. Today, I’ll describe some recent improvements we've added that make Jupyter even more powerful on Domino, especially for collaborative team workflows. Full documentation about using Jupyter in Domino is on our help site.
The Basics: Running Jupyter
Domino lets you start a Jupyter notebook server on any type of hardware with one click. You can control what packages are installed (though we have a lot installed by default) and your notebook server will have access to all files in your project.
Domino will start Jupyter on a machine with your selected hardware and copy your project files there. When the server is ready, you’ll see a link to open it:
Clicking the “Open session” button will take you into the Jupyter UI.
Our default Jupyter installation has kernels for Python 2.7, R, and Julia. It also supports creating Terminal sessions. We support customizations to the Jupyter installation, so you can use whatever kernels you want.
Viewing Notebook Files
It can take a minute to spin up a server, and in many cases, it’s important to be able to quickly get a view of what notebooks are available.
Domino now renders .ipynb files in your project directly through the web UI, so you can see the contents of a notebook without running a whole server. The image below shows what happens if you simply browse to view an ipynb file — there is no Jupyter server running here.
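To see why a notebook can be displayed without a running server, recall that an .ipynb file is plain JSON. The sketch below is not how Domino renders notebooks; it is a minimal illustration, using only the Python standard library, of reading a notebook's cells statically. The function name `summarize_notebook` and the example notebook are made up for this illustration, and the code assumes the nbformat 4 layout (a top-level "cells" list).

```python
import json


def summarize_notebook(notebook_json):
    """Return (cell_type, first_line) for each cell of an nbformat 4 notebook."""
    nb = json.loads(notebook_json)
    summary = []
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # nbformat allows "source" to be a string or a list of strings
        if isinstance(source, list):
            source = "".join(source)
        lines = source.splitlines()
        first_line = lines[0] if lines else ""
        summary.append((cell.get("cell_type", "unknown"), first_line))
    return summary


# A tiny hand-written notebook, used here only as illustrative input
EXAMPLE_NOTEBOOK = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Sales report\n", "Quarterly numbers"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": "import pandas as pd\ndf = pd.DataFrame()"},
    ],
})

for cell_type, first_line in summarize_notebook(EXAMPLE_NOTEBOOK):
    print(f"{cell_type:10} {first_line}")
```

No kernel is started at any point: the cell contents and any saved outputs are already stored in the file, which is what makes static viewing cheap.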
This lets you turn your Domino projects into a powerful notebook gallery to share with your colleagues. If someone sees a notebook they like, they can spin it up with one click on a running server. Or they can fork your project to make their own changes.
Comparison and Commenting
Domino already provides powerful collaboration tools for data science work, such as comparing results between experiments and facilitating discussion. Now these features work great with ipynb files, too. For example, you can compare two different sessions you worked in and see the differences between the two versions of your notebook.
And like any other file or result, you can leave comments about notebooks, which will be shared with your colleagues.
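The notebook comparison described above can be approximated locally. The following is a rough sketch, not Domino's comparison feature: it diffs only the cell sources of two notebook versions with Python's standard `difflib` module and ignores outputs and metadata. The helper names `cell_sources` and `diff_notebooks`, and the labels `session_1` and `session_2`, are hypothetical.

```python
import difflib


def cell_sources(notebook):
    """Return each cell's source as a single string."""
    sources = []
    for cell in notebook.get("cells", []):
        src = cell.get("source", "")
        # "source" may be a string or a list of strings
        sources.append("".join(src) if isinstance(src, list) else src)
    return sources


def diff_notebooks(old, new):
    """Return a unified diff (list of lines) of two parsed notebooks' sources."""
    old_lines = "\n".join(cell_sources(old)).splitlines()
    new_lines = "\n".join(cell_sources(new)).splitlines()
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile="session_1", tofile="session_2", lineterm="",
    ))


# Two illustrative versions of the same notebook (already parsed from JSON)
old_nb = {"cells": [{"cell_type": "code", "source": "x = 1"},
                    {"cell_type": "code", "source": "print(x)"}]}
new_nb = {"cells": [{"cell_type": "code", "source": "x = 2"},
                    {"cell_type": "code", "source": "print(x)"}]}

for line in diff_notebooks(old_nb, new_nb):
    print(line)
```

Diffing sources rather than the raw JSON keeps the output readable, since raw .ipynb files also change whenever execution counts or embedded outputs change.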
Using Notebooks for Reporting
In addition to running notebooks interactively and viewing them statically, Domino lets you run notebook files as batch jobs: we'll execute the notebook and save the result as HTML, which we'll host on the web so your colleagues can see it. To run a notebook as a batch job, you can either (a) click the "run" button next to the notebook in the files view, or (b) use a normal "domino run" command from your CLI and specify the notebook name (e.g., "domino run foo.ipynb").
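For readers who want to try the same idea outside Domino, Jupyter's own `nbconvert` tool can execute a notebook and write the result as HTML. The sketch below is an assumption about a comparable local workflow, not a description of what "domino run" does internally. The function names and the `reports` output directory are hypothetical, and `run_notebook_as_report` requires `jupyter` and `nbconvert` to be installed.

```python
import subprocess
from pathlib import Path


def build_nbconvert_command(notebook, output_dir="reports"):
    """Build the command line that executes a notebook and renders it to HTML."""
    return [
        "jupyter", "nbconvert",
        "--to", "html",
        "--execute",                      # run all cells before converting
        f"--output-dir={output_dir}",     # where the HTML file is written
        str(notebook),
    ]


def run_notebook_as_report(notebook, output_dir="reports"):
    """Execute the notebook and return the expected path of the HTML report."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if execution or conversion fails
    subprocess.run(build_nbconvert_command(notebook, output_dir), check=True)
    return Path(output_dir) / (Path(notebook).stem + ".html")


if __name__ == "__main__":
    # Only print the command here; call run_notebook_as_report to execute it
    print(" ".join(build_nbconvert_command("foo.ipynb")))
```

Keeping command construction separate from execution makes the command easy to inspect or hand to a scheduler such as cron.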
Alternatively, you can set your notebooks to run on a schedule, so that the executed, rendered notebook can be sent out as a report.
Nick Elprin is the CEO and co-founder of Domino Data Lab, provider of the open data science platform that powers model-driven enterprises such as Allstate, Bristol Myers Squibb, Dell and Lockheed Martin. Before starting Domino, Nick built tools for quantitative researchers at Bridgewater, one of the world's largest hedge funds. He has over a decade of experience working with data scientists at advanced enterprises. He holds a BA and MS in computer science from Harvard.