Deep learning with DL4J and Domino
By Nick Elprin, 2014-10-21, 3 min read
The screenshot above shows several different "Runs" of the dl4j code on Domino, each one executing a different dl4j example. The selected run is the multi-core MNIST example, running on a 4-core machine.
Deep Learning for Java
Deep learning is a popular sub-field of machine learning that has proved effective at learning abstract representations in data sets that are typically only "interpretable" by humans. Examples include image processing (face recognition, image search), audio classification, and text analysis.
Deeplearning4j is a "commercial-grade, open-source deep-learning library ... meant to be used in business environments, rather than as a research tool." Since we are building Domino to address the same commercial-grade analytical use cases, we couldn't wait to learn more about this library.
So we were excited to attend the SF Data Mining Meetup last night, where dl4j creator Adam Gibson spoke about deep learning. One thing he made very clear, over and over, is that deep learning is not a panacea, and it's not a good solution for all problems. Specifically, his advice was to use it for media (image, video, audio, text) and for time-series analysis (e.g., sensor data feeds). When one person in the audience asked about using it for fraud detection, a use case where inspection of the classification logic is critical, Adam literally said something like, "I don't even want to see you here." =) We appreciated the honesty and directness.
As great as it was learning more about dl4j, we were even more excited to get dl4j up and running on Domino =)
Although many Domino users work in Python, R, and other scripting languages, Domino is at its core an arbitrary code executor, so it can run code in basically any language you want to use. Getting a Java example up and running was a piece of cake.
Why this is useful
dl4j is powerful and flexible, but not all data scientists and machine learning practitioners are software engineers and infrastructure experts. Domino saves you the hassle of infrastructure setup and configuration. For example, you don't have to worry about setting up an AWS machine with all the libraries you need, or about getting all your Maven dependencies right to compile the dl4j example code.
By using Domino, all that setup is handled for you, and you can run your code on any hardware you want with one click. At the same time, Domino tracks every run of your code, including your results, so you can reproduce past work and share it with other people. Or you can package your model into a self-service webform so non-technical stakeholders can use it.
Nick Elprin is the CEO and co-founder of Domino Data Lab, provider of the open data science platform that powers model-driven enterprises such as Allstate, Bristol Myers Squibb, Dell and Lockheed Martin. Before starting Domino, Nick built tools for quantitative researchers at Bridgewater, one of the world's largest hedge funds. He has over a decade of experience working with data scientists at advanced enterprises. He holds a BA and MS in computer science from Harvard.