Empowering individuals to participate in better health outcomes
Evidation transforms research into production-grade models in as little as eight weeks with Enterprise MLOps
Data Science at Evidation
Evidation is on a mission to empower everyone to participate in better health outcomes. To achieve this, the company looks at patients’ lived experience beyond healthcare facilities. It partners with individuals, life sciences companies, and healthcare organizations to better understand health and disease in everyday life and connect individuals with ground-breaking research and health programs.
The Evidation platform continuously measures the health of individuals (at their behest and with their consent) using patient-generated health data (PGHD) they share from apps and wearable technologies, such as smartphones, activity trackers, and smartwatches.
Evidation's network of users includes more than 4 million individuals who are highly motivated to participate in research, sharing millions of data points daily. With this engaged cohort of individuals, clinical trials run on its platform have recruited qualifying patients ten times faster than traditional trials with an average response rate of 67%.
“We’ve shown how research can activate patient engagement across the 100-plus studies we’ve run on our platform,” said Biz Phillips, a Data Science Manager at Evidation. “Now we’re using data science to bring our work to the next level, creating a continuous feedback loop that provides individuals with health insights from this rich, variable data and drives behavioral change.”
Evidation’s latest heart health program, co-developed with the American College of Cardiology, is one example. The program enables individuals with heart conditions to continuously monitor data relevant to their cardiovascular health, such as activity and symptom information, so they can identify worsening symptoms and access personalized content, resources, and tools for better managing their heart health.
To bring individuals new insights faster, Evidation has dramatically streamlined how it manages, develops, deploys, and monitors new models. Leveraging Domino® Enterprise MLOps capabilities and a Snowflake® cloud-based data platform, the company can now turn research into production-grade models in as little as eight weeks.
Challenge
As Evidation began expanding the insights and capabilities it provided participating individuals, it needed to empower its data science teams to innovate rapidly and to scale. “We recognized that our existing approach was getting in the way,” said Phillips.
“Our homegrown data science system was a challenge to maintain, and the paid platform we used briefly, a unified data analytics platform, was too reliant on Apache Spark™ and couldn’t provide the support, security, or flexibility our data engineers, data scientists, and ML engineers needed.”
The team set its sights on four crucial capabilities:
- Seamless data access so teams could rapidly iterate and innovate.
- Openness and flexibility so data scientists could experiment with the latest tools and compute frameworks.
- Collaboration and reproducibility so teams could build off past work.
- Efficient access to technical resources. “Because we work with very high-frequency data, effectively scaling resources is a big need,” said Piyusha Gade, a health data scientist at Evidation. “Some users only need a small instance on a local machine, while others might need to access GPU resources.”
Solution
Evidation's data science, security, and DevOps teams conducted an in-depth proof of concept and selected Domino and Snowflake to meet the company's infrastructure and security requirements.
"Domino and Snowflake empower our data science team to connect to patient-generated health data (PGHD) seamlessly and prototype rapidly, all within a secure environment," said Gade. "They are integral in our work to create a feedback loop where data is turned into insights and these insights then drive behavioral changes that promote better health outcomes."
Snowflake's cloud-based data warehousing platform serves as Evidation's centralized data repository and feature store, supplying data both for delivery to clients and for analysis. "Now all of our data sources are centrally provisioned and easily accessible," said Phillips. "As a result, our data scientists, data engineers, and QA engineers have access to continuous, curated, and tested data and can share the same dataset across projects."
The team then uses Domino to query the features stored on Snowflake, select datasets to train models, and streamline efforts across the model lifecycle, including:
- Model management. Domino automatically captures all artifacts for a given project, including datasets, code, tools, and packages, so teams can easily track data lineage and build on existing work. "We measure success by both the quality of our work and the reproducibility of the results," said Gade. "With Domino, we can now build reproducible workflows that reduce the time to spin up new projects and seamlessly sync our work to GitHub."
- Model development with easy, secure, permission-based access to data and the latest tools, libraries, resources, compute infrastructure (including NVIDIA GPUs), and distributed compute frameworks (including Ray and Dask, which Evidation is interested in using). "Domino does a lot of the DevOps work for us, which has accelerated our development speed," said Phillips.
- Model deployment, simplifying the development of production pipelines. "We can use a model orchestration tool on Domino that enables us to get new and retrained models into production in a fraction of the time that we could with our previous platform," said Gade.
- Model monitoring to detect data and model drift once models are in production. "Domino provides a great dashboard to easily visualize and monitor data drift and model performance," said Gade. "We can view metrics across different models and over time to confirm the machine learning predictions we're sharing haven't lost their predictive accuracy."
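For readers unfamiliar with data drift, the idea behind the monitoring step can be illustrated with a small, generic sketch. It is not Domino's implementation or Evidation's code. It assumes a single numeric feature and uses the population stability index (PSI), one common drift statistic, to compare the values a model was trained on with the values it sees in production. The function name, bin count, and sample data are all hypothetical.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Return the PSI of `current` relative to `reference` for one feature.

    Bins are equal-width and derived from the reference sample only.
    """
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread; PSI is undefined")
    width = (hi - lo) / bins

    def bin_fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the reference range fall in the edge bins.
            idx = max(0, min(bins - 1, int((v - lo) / width)))
            counts[idx] += 1
        total = len(values)
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / total, eps) for c in counts]

    ref_frac = bin_fractions(reference)
    cur_frac = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


# Hypothetical data: a feature at training time and the same feature after a shift.
reference = [i / 100 for i in range(1000)]
shifted = [x + 3 for x in reference]

print(population_stability_index(reference, reference))  # identical samples give 0.0
print(population_stability_index(reference, shifted))    # a shifted sample gives a larger value
```

A PSI of zero means the two samples fall into the bins in identical proportions, and larger values mean the production data has moved further from the training data. A widely quoted rule of thumb treats values above roughly 0.25 as a significant shift, but any alert threshold is a judgment call for the team that owns the model.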
All of this is supported by strong security. "We take the security of our data seriously," said Phillips. "Domino and Snowflake have enabled us to work in a security-rich environment, with the centralized provisioning of data and resources and the ability to maintain all projects and data on our network."
The Domino Effect
"We can now spend more time on research and innovation, which enables us to serve more clients and provide more insights," said Gade.
Evidation is on the path to dramatically scale the insights it can provide individuals. Key advances include:
- Delivering new capabilities in as little as eight weeks. Evidation rapidly designed and deployed new models for an epidemiological surveillance system based on research conducted with approximately 340,000 individuals over three flu seasons. The system can detect when participants have the flu, ask them to confirm that they are experiencing symptoms, and prompt them to join an in-clinic study in their area.
- Reducing onboarding times from weeks to less than a day. "Onboarding new team members previously took weeks, but now they can be working and productive the day they start," said Gade.
- Increasing team productivity, saving days and weeks of time. Teams can access data in seconds instead of days and spend less time managing infrastructure. They can trace code and data back to their source in less than a day rather than spending weeks searching for this information. They can access leading NVIDIA GPU accelerated infrastructure immediately rather than waiting for IT to provision systems.
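Tracing code and data back to their source depends on recording exactly which files went into a piece of work. The sketch below shows that general idea in miniature and is not a description of how Domino captures artifacts. It assumes a project is a set of local files and fingerprints each one with SHA-256, so that a later run can be checked against the recorded manifest. All function and file names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(files: Iterable[Path]) -> Dict[str, str]:
    """Map each file path to the SHA-256 fingerprint of its contents."""
    return {str(p): sha256_of_file(p) for p in sorted(files)}


def changed_files(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """List paths that were added, removed, or modified between two manifests."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

Saving the manifest next to a model's results, for example as JSON, lets a teammate confirm later that they are rerunning the analysis on the same code and data, or see at a glance which inputs have changed.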
Resources
Industry
Healthcare, Life Sciences
Location
Headquarters: San Mateo, California
Use Cases
Clinical research
Health programs
Impact
Delivery of new production-grade models in as little as 8 weeks
Reducing onboarding times from weeks to less than a day
Increasing team productivity and scale to serve more clients
Users
40 data scientists, data engineers, and machine learning engineers support research design, analysis, and modeling for both internal and external research initiatives
Solution Components
Data Science Platform: Domino
Data Platform: Snowflake®
Data Science Tools: Apache Spark™, DBT™ platform, Julia, Ploomber, Python, R
Server Infrastructure: Amazon Web Services™, NVIDIA® GPUs
Visualization/BI tools: Amazon QuickSight™, Tableau®