Scikit Learn - Bayesian Ridge Regression
Bayesian regression provides a natural mechanism to cope with insufficient or poorly distributed data by formulating linear regression using probability distributions rather than point estimates. The output or response y is assumed to be drawn from a probability distribution rather than estimated as a single value.
Mathematically, to obtain a fully probabilistic model, the response y is assumed to be Gaussian distributed around $Xw$ as follows −

$$p\left(y\mid X,w,\alpha\right)=N\left(y\mid Xw,\alpha\right)$$

One of the most useful types of Bayesian regression is Bayesian Ridge regression, which estimates a probabilistic model of the regression problem. Here the prior for the coefficient w is given by a spherical Gaussian as follows −

$$p\left(w\mid \lambda\right)=N\left(w\mid 0,\lambda^{-1}I_{p}\right)$$

The resulting model is called Bayesian Ridge Regression, and in scikit-learn the sklearn.linear_model.BayesianRidge module is used for Bayesian Ridge Regression.
Parameters
The following table lists the parameters used by the BayesianRidge module −
| Sr.No | Parameter & Description |
|---|---|
| 1 | **n_iter** − int, optional. The maximum number of iterations. The default value is 300; a user-defined value must be greater than or equal to 1. (Recent scikit-learn versions rename this parameter to max_iter.) |
| 2 | **fit_intercept** − Boolean, optional, default=True. Decides whether to calculate the intercept for this model. If set to False, no intercept is used in the calculation. |
| 3 | **tol** − float, optional, default=1e-3. The precision of the solution; the algorithm stops once w has converged. |
| 4 | **alpha_1** − float, optional, default=1e-6. The 1st hyperparameter: a shape parameter for the Gamma distribution prior over the alpha parameter. |
| 5 | **alpha_2** − float, optional, default=1e-6. The 2nd hyperparameter: an inverse scale parameter for the Gamma distribution prior over the alpha parameter. |
| 6 | **lambda_1** − float, optional, default=1e-6. The 1st hyperparameter: a shape parameter for the Gamma distribution prior over the lambda parameter. |
| 7 | **lambda_2** − float, optional, default=1e-6. The 2nd hyperparameter: an inverse scale parameter for the Gamma distribution prior over the lambda parameter. |
| 8 | **copy_X** − Boolean, optional, default=True. If True (the default), X is copied; if set to False, X may be overwritten. |
| 9 | **compute_score** − Boolean, optional, default=False. If set to True, the log marginal likelihood is computed at each iteration of the optimisation. |
| 10 | **verbose** − Boolean, optional, default=False. If set to True, verbose mode is enabled while fitting the model. |
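As a quick illustration of these parameters, the sketch below constructs a BayesianRidge estimator with non-default priors and tolerance. The tiny dataset is made up for illustration, and the chosen hyperparameter values are arbitrary, not recommendations −

```python
from sklearn import linear_model

# Toy data for illustration only
X = [[0, 0], [1, 1], [2, 2], [3, 3]]
Y = [0, 1, 2, 3]

model = linear_model.BayesianRidge(
    tol=1e-4,           # stop once w has converged to this precision
    alpha_1=1e-5,       # shape parameter of the Gamma prior over alpha
    alpha_2=1e-5,       # inverse scale parameter of the same prior
    lambda_1=1e-5,      # shape parameter of the Gamma prior over lambda
    lambda_2=1e-5,      # inverse scale parameter of the same prior
    compute_score=True  # record the log marginal likelihood per iteration
)
model.fit(X, Y)
print(model.get_params()["alpha_1"])
```

The Gamma priors over alpha and lambda are what distinguish this estimator from ordinary ridge regression: both precisions are estimated from the data during fitting instead of being fixed in advance.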
Attributes
The following table lists the attributes of the BayesianRidge module −
| Sr.No | Attribute & Description |
|---|---|
| 1 | **coef_** − array, shape = (n_features,). The weight vector. |
| 2 | **intercept_** − float. The independent term in the decision function. |
| 3 | **alpha_** − float. The estimated precision of the noise. |
| 4 | **lambda_** − float. The estimated precision of the weights. |
| 5 | **n_iter_** − int. The actual number of iterations taken by the algorithm to reach the stopping criterion. |
| 6 | **sigma_** − array, shape = (n_features, n_features). The estimated variance-covariance matrix of the weights. |
| 7 | **scores_** − array, shape = (n_iter_+1,). The value of the log marginal likelihood at each iteration of the optimisation. The array starts with the value obtained for the initial values of $\alpha$ and $\lambda$, and ends with the value obtained for the estimated $\alpha$ and $\lambda$. |
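Once a model is fitted, these attributes can be inspected directly. The sketch below, on a small made-up dataset, enables compute_score so that scores_ is populated and then prints a few of the fitted attributes −

```python
from sklearn import linear_model

# Toy data for illustration only
X = [[0, 0], [1, 1], [2, 2], [3, 3]]
Y = [0, 1, 2, 3]

model = linear_model.BayesianRidge(compute_score=True)
model.fit(X, Y)

print(model.coef_.shape)    # one weight per feature
print(model.sigma_.shape)   # (n_features, n_features) covariance of the weights
print(model.alpha_)         # estimated precision of the noise
print(model.lambda_)        # estimated precision of the weights
print(model.n_iter_)        # iterations until the stopping criterion was met
print(len(model.scores_))   # n_iter_ + 1 log marginal likelihood values
```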
Implementation Example
The following Python script provides a simple example of fitting a Bayesian Ridge Regression model using the sklearn BayesianRidge module.

```python
from sklearn import linear_model

X = [[0, 0], [1, 1], [2, 2], [3, 3]]
Y = [0, 1, 2, 3]
BayReg = linear_model.BayesianRidge()
BayReg.fit(X, Y)
```
Output
```
BayesianRidge(alpha_1=1e-06, alpha_2=1e-06, compute_score=False, copy_X=True,
   fit_intercept=True, lambda_1=1e-06, lambda_2=1e-06, n_iter=300,
   normalize=False, tol=0.001, verbose=False)
```
From the above output, we can check the model's parameters used in the calculation.
Example
Now, once fitted, the model can predict new values as follows −
```python
BayReg.predict([[1, 1]])
```
Output
```
array([1.00000007])
```
Example
Similarly, we can access the coefficient w of the model as follows −
```python
BayReg.coef_
```
Output
```
array([0.49999993, 0.49999993])
```
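Because the model is probabilistic, predict can also return the standard deviation of the predictive distribution via its return_std parameter. The sketch below repeats the fit on the same toy data and queries both the mean prediction and its uncertainty −

```python
from sklearn import linear_model

X = [[0, 0], [1, 1], [2, 2], [3, 3]]
Y = [0, 1, 2, 3]

model = linear_model.BayesianRidge()
model.fit(X, Y)

# return_std=True yields the standard deviation of the predictive
# distribution alongside the mean prediction
mean, std = model.predict([[1, 1]], return_std=True)
print(mean[0])  # close to 1, as in the output above
print(std[0])   # predictive uncertainty at this point
```

This per-point uncertainty estimate is the practical payoff of the Bayesian formulation described at the start of this chapter: the response is a distribution, not a single value.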