Better interactive data science with Beaker and Rodeo

Nick Elprin | 2019-09-19 | 4 min read

Domino has offered support for IPython/Jupyter Notebook for a while, but we recently added support for two newer, up-and-coming tools for interactive data science: Beaker Notebook and Rodeo. This post gives a brief overview of each tool and describes how to use them on Domino.

Power tools at your fingertips

The motivation behind Domino is twofold: to make data scientists more productive by letting them focus on their analysis without worrying about infrastructure and configuration, and to facilitate collaboration and sharing among teams by keeping work organized and tracked in a central place.

To that end, Domino now lets you spin up Rodeo or Beaker sessions on big machines, and keeps your files and notebooks stored centrally so it's easier to track, share, and comment on them.

Beaker

Beaker Notebook is a notebook application from the team at Two Sigma Open Source, in some ways similar to Jupyter/IPython notebooks. But in addition to supporting inline code, documentation, and visualization in many different languages, Beaker lets you mix languages. That's right: one notebook can mix code from any of the languages it supports, and Beaker's slick interop capabilities seamlessly translate data between languages. This even works for DataFrames and more complex types.

Beaker Jupyter notebook application

There's a lot going on under the hood to make that work — it's pretty magical.
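As a rough illustration of the general idea (not Beaker's actual implementation, which this post does not describe), data can move between languages by passing through a language-neutral representation. The sketch below round-trips a small table through JSON in plain Python; the function names `to_neutral` and `from_neutral` and the sample values are hypothetical.

```python
import json


def to_neutral(columns, rows):
    """Encode a small table as a JSON string that any language could parse."""
    return json.dumps({"type": "table", "columns": columns, "rows": rows})


def from_neutral(payload):
    """Decode the JSON string back into a list of row dictionaries."""
    table = json.loads(payload)
    return [dict(zip(table["columns"], row)) for row in table["rows"]]


# Hypothetical sample data: one side encodes, the other side decodes.
payload = to_neutral(["name", "score"], [["a", 1.5], ["b", 2.0]])
records = from_neutral(payload)
```

In a multi-language notebook, the encoding side and the decoding side would live in different languages; here both are in Python only to keep the sketch self-contained.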

This makes Beaker the ultimate weapon for those who believe in "using the best tool for the job": a single analytical workflow can use Python for data prep, R for sophisticated statistical analysis, and HTML with D3 or LaTeX for beautiful visualization and presentation.

Beaker supports R, Python, Julia, Scala, HTML, JavaScript, Node.js, LaTeX, Java, Groovy, Clojure, Ruby, and Kdb, although right now Domino's support for Beaker only includes a few of those. Let us know if you want to see others!

You can watch a video of one of Beaker's creators speaking about it at SciPy 2015. You can also play with Beaker yourself, without any installation or setup, on Domino. You can create your own projects to do this.

1. Start a Beaker session by clicking on the "Notebook" menu on your "Runs" dashboard.

Opening a Beaker session in Domino

2. When the server is ready, click the "Open session" button in the right pane.

Opening session in Beaker

3. Create a new notebook, import one of Beaker's examples, or use the file menu to browse to "/mnt" and choose one of the files in our project (viz.bkr or interop.bkr).

Creating new notebook and choosing file in Beaker

The viz.bkr notebook in the project shows an example that uses Python to compute a graph, and then HTML/D3/Javascript to visualize it in the Notebook.
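To give a feel for this kind of hand-off, here is a minimal sketch of the Python side: it turns a list of edges into the nodes/links structure commonly fed to D3 force-directed graphs, serialized as JSON. It does not reproduce the contents of viz.bkr; the function name `edges_to_d3` and the sample edges are assumptions made for illustration.

```python
import json


def edges_to_d3(edges):
    """Convert (source, target) name pairs into a nodes/links dictionary.

    Links refer to nodes by their position in the "nodes" list.
    """
    names = sorted({name for edge in edges for name in edge})
    index = {name: i for i, name in enumerate(names)}
    return {
        "nodes": [{"name": name} for name in names],
        "links": [{"source": index[s], "target": index[t]} for s, t in edges],
    }


# Hypothetical three-node graph.
graph = edges_to_d3([("a", "b"), ("b", "c"), ("a", "c")])
graph_json = json.dumps(graph)  # string that JavaScript code could parse
```

The JavaScript side would then parse `graph_json` and pass the nodes and links to a D3 layout for rendering.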

The interop.bkr notebook shows some nice examples of Beaker's flexibility for translating data between languages.

Rodeo

Rodeo is an open source Python IDE from the folks at yHat. It answers the question, "is there anything like RStudio for Python?"

Rodeo is just that: it's a web-based IDE for editing Python files that gives you a code editor along with a plot viewer and a file browser in one interface. Unlike Python editors designed for building large software systems, Rodeo is tailored for doing data science in Python — especially with its built-in plot viewer.

Rodeo IDE

You can read more about our support for Rodeo on our help site.

Nick Elprin is the CEO and co-founder of Domino Data Lab, provider of the open data science platform that powers model-driven enterprises such as Allstate, Bristol Myers Squibb, Dell and Lockheed Martin. Before starting Domino, Nick built tools for quantitative researchers at Bridgewater, one of the world's largest hedge funds. He has over a decade of experience working with data scientists at advanced enterprises. He holds a BA and MS in computer science from Harvard.