Subject archive for "machine-learning," page 6

Data Science

Fitting gaussian process models in Python

A common applied statistics task involves building regression models to characterize non-linear relationships between variables. It is possible to fit such models by assuming a particular non-linear functional form, such as a sinusoidal, exponential, or polynomial function, to describe one variable's response to the variation in another. Unless this relationship is obvious from the outset, however, it involves possibly extensive model selection procedures to ensure the most appropriate model is retained. Alternatively, a non-parametric approach can be adopted by defining a set of knots across the variable space and use a spline or kernel regression to describe arbitrary non-linear relationships. However, knot layout procedures are somewhat ad hoc and can also involve variable selection. A third alternative is to adopt a Bayesian non-parametric strategy, and directly model the unknown underlying function. For this, we can employ Gaussian process models.

By Chris Fonnesbeck27 min read

Code

Getting Started with OpenCV

In this article we talk about the foundations of Computer vision, the history and capabilities of the OpenCV framework, and how to make your first steps in accessing and visualising images with Python and OpenCV.

By Dr Behzad Javaheri13 min read

Machine Learning

Speeding up Machine Learning with parallel C/C++ code execution via Spark

The C programming language was introduced over 50 years ago and it has consistently occupied the most used programming languages list ever since. With the introduction of the C++ extension in 1985 and the addition of classes and objects, the C/C++ pair keep a central role in the development of all major operating systems, databases, and performance critical applications in general. Because of its efficiency, C/C++ underpin a large number of machine learning libraries (e.g. TensorFlow, Caffe, CNTK) and widely used tools (e.g. MATLAB, SAS). C++ may not be the first thing that springs to mind when thinking about Machine Learning and Big Data, but it is omnipresent everywhere in the field where lightning fast computations are needed - from Google's Bigtable and GFS to pretty much everything GPU related (CUDA, OCCA, openCL etc.)

By Nikolay Manchev12 min read

Machine Learning

Semi-uniform strategies for solving K-armed bandits

In a previous blog post we introduced the K-armed bandit problem - a simple example of allocation of a limited set of resources over time and under uncertainty. We saw how a stochastic bandit behaves and demonstrated that pulling arms at random yields rewards close to the expectation of the reward distribution.

By Nikolay Manchev6 min read

Perspective

Increasing model velocity for complex models by leveraging hybrid pipelines, parallelization and GPU acceleration

Data science is facing an overwhelming demand for CPU cycles as scientists try to work with datasets that are growing in complexity faster than Moore’s Law can keep up. Considering the need to iterate and retrain quickly, model complexity has been outpacing available compute resources and CPUs for several years, and the problem is growing quickly. The data science industry will need to embrace parallelization and GPU processing to efficiently utilize increasingly complex datasets.

By Nikolay Manchev10 min read

Subscribe to the Domino Newsletter

Receive data science tips and tutorials from leading Data Science leaders, right to your inbox.

*

By submitting this form you agree to receive communications from Domino related to products and services in accordance with Domino's privacy policy and may opt-out at anytime.