Domino Data Science Blog

Eduardo Ariño de la Rubia

Eduardo Ariño de la Rubia is a lifelong technologist with a passion for data science who thrives on effectively communicating data-driven insights throughout an organization. A student of negotiation, conflict resolution, and peace building, Ed is focused on building tools that help humans work with humans to create insights for humans.

Data Science

A quick benchmark of hashtable implementations in R

UPDATE: I am humbled and thankful to have had so much feedback on this post! It started out as a quick-and-dirty benchmark, but I had some great feedback from Reddit, comments on this post, and even from Hadley himself! This post now has some updates. The major update is that R's new.env(hash=TRUE) actually provides the fastest hash table if your keys are always going to be valid R symbols! This is one of the things I really love about the data science community and the data science process. Iteration and peer review are key to great results!

By Eduardo Ariño de la Rubia · 8 min read

Data Science

Improving Zillow's Zestimate with 36 Lines of Code

Zillow and Kaggle recently started a $1 million competition to improve the Zestimate. We used H2O’s AutoML to generate a solution.

By Eduardo Ariño de la Rubia · 3 min read

Data Science

Horizontal Scaling for Parallel Experimentation

The amount of time data scientists spend waiting for experiment results is the difference between making incremental improvements and making significant advances. With parallel experimentation, data scientists can run more experiments faster, leaving more time to try novel and unorthodox approaches—the kind that leads to exponential improvements and discoveries.

By Eduardo Ariño de la Rubia · 6 min read

Data Science

Multicore Data Science with R and Python

This post shows a number of different packages and approaches for leveraging parallel processing with R and Python.

By Eduardo Ariño de la Rubia · 16 min read

Product Updates

Git Integration in Domino

We recently released new functionality that provides first-class integration between Domino and Git. This post describes the new feature and our perspective on the unique requirements of version control in data science workflows, as distinct from software engineering workflows.

By Eduardo Ariño de la Rubia · 5 min read

Data Science

The Cost of Doing Data Science on Laptops

At the heart of the data science process are the resource-intensive tasks of modeling and validation. During these tasks, data scientists will try and discard thousands of temporary models to find the optimal configuration. Even for small data sets, this could take hours to process.

By Eduardo Ariño de la Rubia · 6 min read
