Hyperparameter Tuning
What is hyperparameter tuning?
A hyperparameter is a configuration value, such as a learning rate or the maximum depth of a decision tree, that controls the learning process and cannot be estimated from the training data. Hyperparameters are set externally before model training starts. Hyperparameter tuning is the process of finding the set of hyperparameters that gives the best performance for a given machine learning algorithm.
Choosing hyperparameters
Choosing the optimal set of hyperparameters requires an in-depth understanding of the nature and scale of each hyperparameter. Hyperparameter tuning can be done either manually or with automated methods. Before tuning begins, a robust evaluation criterion has to be chosen so that model performance under each set of hyperparameters can be compared fairly. K-fold cross-validation is a common choice for this evaluation.
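As a rough illustration of using k-fold cross-validation as the evaluation criterion, the sketch below scores three values of one hyperparameter on toy data. The model (a one-dimensional ridge regression without intercept), the hyperparameter name `alpha`, the helper names, and the data are all assumptions made for this example; they do not come from any particular library.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def fit_ridge_slope(xs, ys, alpha):
    """Closed-form slope of a 1-D ridge regression without intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def cross_val_mse(xs, ys, alpha, k=5):
    """Mean validation MSE over k folds for one value of alpha."""
    fold_errors = []
    for train, validation in k_fold_splits(len(xs), k):
        slope = fit_ridge_slope([xs[i] for i in train], [ys[i] for i in train], alpha)
        errors = [(ys[i] - slope * xs[i]) ** 2 for i in validation]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Toy data (assumed for illustration): y is roughly 2 * x plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(1, 51)]
ys = [2 * x + rng.gauss(0, 0.5) for x in xs]

# Every candidate value is scored on the same folds, so the scores are comparable.
for alpha in (0.0, 1.0, 100.0):
    print(f"alpha={alpha}: CV MSE = {cross_val_mse(xs, ys, alpha):.4f}")
```

Because every candidate is evaluated on the same folds, differences in the score reflect the hyperparameter rather than a lucky train/validation split.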
There are two main approaches to hyperparameter tuning:
- Manual hyperparameter tuning: Manual hyperparameter tuning involves trying different sets of hyperparameters by hand, through trial and error. The results of each trial are tracked and used as feedback to arrive at the combination of hyperparameters that yields the highest model performance.
- Automated hyperparameter tuning: In automated hyperparameter tuning, an algorithm searches for the optimal set of hyperparameters. The user defines either a set of hyperparameter combinations or a range for each hyperparameter, and the tuning algorithm runs the trials to find the best set of hyperparameters for the model.
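The two ways of defining the search space in automated tuning (a set of combinations, or a range per hyperparameter) correspond roughly to grid search and random search. The sketch below shows both in plain Python. The function `validation_score` is a hypothetical stand-in for "train the model and return a validation score", and the hyperparameter names `learning_rate` and `depth` are assumptions chosen for illustration.

```python
import itertools
import random


def validation_score(learning_rate, depth):
    """Hypothetical stand-in for training a model and scoring it.

    Higher is better; the peak is at learning_rate=0.1, depth=6.
    """
    return -((learning_rate - 0.1) ** 2) * 100 - ((depth - 6) ** 2) * 0.01


def grid_search(grid):
    """Try every combination of the values listed for each hyperparameter."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = validation_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def random_search(n_trials, seed=0):
    """Sample each hyperparameter from its range for a fixed number of trials."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            # Log-uniform sample between 0.001 and 1.
            "learning_rate": 10 ** rng.uniform(-3, 0),
            # Integer between 2 and 10, both ends included.
            "depth": rng.randint(2, 10),
        }
        score = validation_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


grid = {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 6, 10]}
print("grid search:  ", grid_search(grid))
print("random search:", random_search(n_trials=20))
```

Grid search costs one trial per combination, so its cost grows multiplicatively with each added hyperparameter, while random search runs for whatever trial budget is given. Neither method uses the results of earlier trials to decide what to try next.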
Tools for hyperparameter tuning
Several algorithms have been designed for hyperparameter tuning, and many tools now incorporate them to help users find a good set of hyperparameters in as little time as possible. Some widely used tools are described below.
- Scikit-learn: Scikit-learn implements the grid search and random search algorithms (as GridSearchCV and RandomizedSearchCV), which offer a simple way to tune hyperparameters. These methods do not use the results of earlier trials to guide later ones, so they are not the most efficient, but they can yield satisfactory results.
- Optuna: Optuna is a framework-agnostic automatic hyperparameter tuning package for Python. It works with popular machine learning frameworks and uses efficient sampling and pruning algorithms to find a good set of hyperparameters.
- Ray Tune: Ray Tune is an open-source Python library for experimentation and hyperparameter tuning, built on the Ray framework. It provides different hyperparameter tuning algorithms, such as population-based training (PBT) and Bayesian optimization, and supports distributed computing. It also works with machine learning and deep learning frameworks such as Hugging Face, XGBoost, and PyTorch.
- Hyperopt: Hyperopt is another hyperparameter tuning package, designed for distributed and asynchronous optimization. It implements algorithms such as random search and the Tree-structured Parzen Estimator (TPE).
- Bayesian optimization: This software package implements Bayesian optimization for hyperparameter tuning. Unlike exhaustive approaches such as grid search, Bayesian optimization builds a posterior distribution over the function being optimized and uses it to decide which sets of parameters to probe next. This allows for targeted testing of promising regions of the parameter space and reduces the number of trials needed to identify a good set of hyperparameters.