Subject archive for "machine-learning-engineer"

Data Science

Evaluating Ray: Distributed Python for Massive Scalability

Dean Wampler provides a distilled overview of Ray, an open-source system for scaling Python applications from single machines to large clusters. If you are interested in additional insights, register for the upcoming Ray Summit.

By Dean Wampler | 14 min read

Data Science

Evaluating Generative Adversarial Networks (GANs)

This article provides concise insights into GANs to help data scientists and researchers assess whether to investigate GANs further. If you are interested in a tutorial as well as hands-on code examples within a Domino project, then consider attending the upcoming webinar, “Generative Adversarial Networks: A Distilled Tutorial”.

By Domino | 6 min read


On Being Model-driven: Metrics and Monitoring

This article covers a couple of key Machine Learning (ML) vital signs to consider when tracking ML models in production to ensure model reliability, consistency and performance in the future. Many thanks to Don Miner for collaborating with Domino on this article. For additional vital signs and insight beyond what is provided in this article, attend the webinar.

By Ann Spencer | 7 min read

Data Science

Model Interpretability with TCAV (Testing with Concept Activation Vectors)

This Domino Data Science Field Note provides distilled insights and excerpts from Been Kim’s recent MLConf 2018 talk and research on Testing with Concept Activation Vectors (TCAV), an interpretability method that allows researchers to understand and quantitatively measure the high-level concepts their neural network models use for prediction, “even if the concept was not part of the training”. For additional insights not covered in this blog post, please refer to the MLConf 2018 video, the ICML 2018 video, and the paper.

By Domino | 6 min read

Machine Learning

Creating Multi-language Pipelines with Apache Spark or Avoid Having to Rewrite spaCy into Java

In this guest post, Holden Karau, Apache Spark Committer, provides insights on how to create multi-language pipelines with Apache Spark and avoid rewriting spaCy into Java. She has already written a complementary blog post for Domino on using spaCy to process text data. Karau is a Developer Advocate at Google as well as a co-author of High Performance Spark and Learning Spark. She also has a repository of her talks, code reviews, and code sessions on Twitch and YouTube.

By Holden Karau | 5 min read

Data Science

Data Science vs Engineering: Tension Points

This blog post provides highlights and a full written transcript from the panel, “Data Science Versus Engineering: Does It Really Have To Be This Way?” with Amy Heineike, Paco Nathan, and Pete Warden at Domino HQ. Topics discussed include the current state of collaboration around building and deploying models, tension points that potentially arise, as well as practical advice on how to address these tension points.

By Ann Spencer | 99 min read
