Data Science Interview: Jason Dolatshahi, Head Data Scientist at Reonomy
By Anna Anisin | 2015-08-25 | 7 min read
I recently caught up with Jason Dolatshahi, Lead Data Scientist at Reonomy.
Jason, tell us a bit about your background and how you became interested in data science.
I always liked learning and solving problems, so I studied mathematics and physics. I used to work in financial marketing and digital advertising, and now I work at Reonomy. I’m originally from San Diego, and I now live in Brooklyn.
How did you get interested in Data Science and Machine Learning?
At one point in my life, I had a job that was kind of boring, so I made up my mind to learn Python, which helped me build financial models. That’s how I discovered data science and machine learning. The field seemed to involve a lot of cool stuff that I didn’t know much about, so I was hooked.
Do you remember when you realized the power of data?
The first time I learned about it was in college; the first time I saw it for myself was on a trading desk. At the time I worked in proprietary trading: I detected and derived trading signals from financial markets, gathered data, and executed strategies based on these insights.
What kind of work are you doing at Reonomy?
We are a startup in the commercial real estate market. A big part of our business is to create value by using data intelligently. Our product is a research platform for investors, lenders, brokers, and other market participants. Like at many other small companies, the data science work includes a strong dose of backend engineering.
What has been the most surprising insight or development you have found?
The most fundamental physical model I have found so far is the harmonic oscillator, for example the motion of a pendulum. I like to paraphrase a famous physicist, who said that “the career of a young theoretical physicist consists of treating the harmonic oscillator at varying degrees of abstraction.” Similarly, the career of a data scientist consists of learning the fact that “80% of your problem-solving takes place before modeling” at varying degrees of abstraction.
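For readers unfamiliar with the reference, the pendulum example can be written out as a short sketch. This is the standard textbook small-angle treatment, not something Jason spelled out in the interview; the symbols (angle θ, gravitational acceleration g, pendulum length ℓ, angular frequency ω, amplitude θ₀, phase φ) are introduced here purely for illustration.

```latex
% Pendulum of length \ell: exact equation of motion for the angle \theta(t)
\ddot{\theta} + \frac{g}{\ell}\,\sin\theta = 0

% Small-angle approximation (\sin\theta \approx \theta) gives the harmonic oscillator
\ddot{\theta} + \omega^{2}\,\theta = 0, \qquad \omega = \sqrt{\frac{g}{\ell}}

% General solution: oscillation with amplitude \theta_0 and phase \varphi
\theta(t) = \theta_{0}\,\cos(\omega t + \varphi)
```

The same linear equation turns up whenever a system is nudged slightly away from a stable equilibrium, which is why the model recurs at so many levels of abstraction.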
Where do you think commercial real estate and data science and ML are going?
I think that commercial real estate will become much more data-driven. Data science will also be applied in a wider and wider range of domains. For example, I think demand will increase in both the private and public sectors for people with quantitative problem-solving skills. Medicine is the number one key area, where I expect to see a number of important breakthroughs. The cost of sequencing a genome is now on the order of $1,000, and that is less than 20 years after the first human genome project.
What are some of your methods (if any) when hiring data scientists? What’s most important in a candidate to be successful?
I am interested in seeing a candidate apply quantitative reasoning. I also like to get a sense of their attitude toward things they don’t know. But the most important skill for me is statistical intuition. To assess that, I give candidates a practical task to take home and present the results later. This shows me whether the candidate has the required data and communication skills. I am hiring for Reonomy now; if you think you have what it takes, let's chat.
How did you apply your knowledge when teaching data science at General Assembly? Any tips for student practitioners?
Future data scientists should focus on fundamental skills and statistical reasoning. They should try to understand techniques in simple terms, for example why they work the way they do. Don’t be intimidated by mathematical notation, programming languages, or pieces of technology; each of these is just a tool, a means to an end. Data science is a practical discipline built around tradeoffs. You need to recognize these tradeoffs in your own work and make informed choices about them. This includes, but is not limited to, the context of building predictive models. And last but not least: “Think deeply of simple things,” to borrow another famous quote.
You have such a diverse data science background, can you share some of the most interesting work you’ve ever done? Or a memorable/meaningful project or two?
I built a backend data processing architecture at a mobile advertising startup, based on Python, Redis, and Amazon EMR. It lived on in production even after we were acquired. Then I created visualizations at a large advertising corporation, which got a lot of traction and were eventually adopted for a variety of business uses. More recently, I used Elasticsearch to create a feature that allows Reonomy’s users to get a holistic picture of ownership for a given person or company. This is a big project, because it streamlines workflows that many market participants have to undertake manually with groups of people, and it reduces time to insight by orders of magnitude.
Is there something in your career that you’re most proud of so far?
The things I like best about data science are the opportunities to learn new things and to be creative. I’ve really enjoyed the work I’ve done where it’s led to amazing results and I also really enjoy teaching and helping people to see through complexity.
What personal/professional projects have you been working on this year, and why/how are they interesting to you?
I am currently focusing on functional programming, or “programming for mathematicians.” One of my favorite subjects is the application of machine learning to biological and medical questions.
What machine learning methods have you found or do you envision being most helpful? What are your favorite tools/applications to work with?
In general, simpler is better. The tools I use on a daily basis are Python, pandas, tmux, vim, and Unix. I also use scikit-learn.
Jason - Thank you so much for your time! Really enjoyed learning more about what you are achieving at Reonomy.
An American entrepreneur, Anna Anisin was named a Tech Industry Insider by CNN three years in a row. Anisin was appointed CEO of 4Sync and VP of US Ops at 4Shared, the fastest-growing cloud storage provider of 2012. After stepping down from 4Sync, Anna co-founded Passare, the number one collaboration software in the funeral industry. After exiting Passare, Anna joined the founding team at Domino Data Lab to help build the most powerful enterprise data science management platform on the market. Currently, Anna runs a boutique B2B marketing firm, Formulated.by, and is the founder of the leading data science community and event series, DataScience.Salon.