My talk is based on a SciPy 2022 talk¶
- Title: Counting on Poisson Regression with scikit-learn
- Speaker: Thomas J. Fan
- The original talk is on GitHub: thomasjpfan/scipy-2022-poisson
- Video
- Slides
Contents¶
- Short summary of the SciPy 2022 talk
- Generalized Linear Models
- Poisson Regression, Poisson loss and Deviance
- Scikit-learn tools for Poisson Regression
- Case Study: "London Bike Sharing" data set
- Data preprocessing
- Evaluation
- PoissonRegressor
- RandomForestRegressor
- HistGradientBoostingRegressor
- ZeroInflatedRegressor (scikit-lego)
- Short review: Poisson Regression examples from scikit-learn and Murphy's Probabilistic ML
Generalized Linear Models¶
Notation¶
- $\mathbf x \in \mathbf R^d$: independent (explanatory) variables
- $t \in \mathbf R$: dependent variable or target (a random variable)
- By convention, $\mathbf x$ is treated as a $(d+1)$-dimensional vector: the $d$ independent variables with a constant $1$ prepended, so that $w_0$ acts as the bias term.
- $\mathbf w\in \mathbf R^{d+1}$: weight vector with $ \mathbf w=(w_0,..., w_d)^T$
- $y(\mathbf x, \mathbf w)$: output of model or predicted value for $\mathbf x$
Generalized Linear Models¶
- In the linear regression model, the conditional distribution $p(t|\mathbf x)$ is assumed to be Normal, and $\mathbb E[t|\mathbf x]$ is modeled by $\mathbf w ^T \mathbf x$.
- GLM generalizes linear regression by allowing
- the linear model to be related to the target variable via a link function
- the magnitude of the variance of each measurement to be a function of its predicted value
- In a GLM, the target (random) variable $t$ with conditional distribution $p(t|\mathbf x)$ is assumed to be generated from a particular distribution in an exponential family, with
$$ \mathbb E[t|\mathbf x] = \mu = g^{-1}(\mathbf w ^T \mathbf x) $$
- where $g$ is the link function and $g^{-1}$ is called the mean function.
Remark of GLM¶
- Not to be confused with the linear basis function model, i.e. $\mathbb E[t|\mathbf x] = \mathbf w ^T \Phi(\mathbf x)$ for some vector of basis functions $\Phi(\cdot)$.
- The model output $y(\mathbf x, \mathbf w) = g^{-1}(\mathbf w ^T \mathbf x)$ is, in general, not linear w.r.t. $\mathbf w$.
- Generally, a GLM does not have a closed-form solution.
- Exponential family(e.g. Normal, Binomial, Poisson, Gamma)
$$ p(\mathbf x | \boldsymbol \eta) = v(\mathbf x)h(\boldsymbol \eta) \exp \{ \boldsymbol \eta ^T \mathbf u(\mathbf x) \}, \quad \boldsymbol \eta: \text{natural parameter} $$
- where $\mathbf u(\mathbf x)$ is a (vector) function of $\mathbf x$.
GLM components¶
The GLM consists of three elements:
- Particular probability distribution for modeling $p(t|\mathbf x)$
- Linear predictor $\mathbf w ^T \mathbf x$
- Link function $g$ s.t.
$$ \mathbb E[t|\mathbf x] = \mu = g^{-1}(\mathbf w ^T \mathbf x) $$
Linear Regression¶
- Target $t \sim \mathcal N(t|\mu, \sigma^2)$, $-\infty < t < \infty$
$$ p(t|\mathbf x, \mathbf w, \sigma) = \mathcal N(t|\mu, \sigma^2), \quad \sigma >0 $$
- Estimate $\mu$ by the linear predictor $\mathbf w^T \mathbf x$.
$$\begin{align*} p(t_n|\mathbf x_n, \mathbf w, \sigma) & = \mathcal N(t_n|\mathbf w^T \mathbf x_n, \sigma^2)\\ &=\frac{1}{\sqrt{2\pi \sigma^2}} \exp \left( -\frac{1}{2\sigma^2}(t_n - \mathbf w^T \mathbf x_n)^2 \right) \end{align*}$$
- Log-likelihood
$$ \log p(t_n|\mathbf x_n, \mathbf w, \sigma) = -\frac{1}{2\sigma^2}(t_n - \mathbf w^T \mathbf x_n)^2 - \frac{1}{2}\log (2\pi \sigma^2) $$
Logistic Regression¶
- Target $t \sim \text{Ber}(t|\mu)$, $t = 0$ or $1$
$$ \text{Ber}(t|\mu) = \mu ^t (1-\mu)^{1-t}, \quad 0<\mu<1 $$
- Estimate $\mu$ as $\sigma(\mathbf w^T \mathbf x)$ where $\sigma(\cdot )$ is a sigmoid function.
$$ p(t_n |\mathbf x_n, \mathbf w) = \text{Ber}(t_n|\sigma(\mathbf w^T \mathbf x_n)) = \sigma(\mathbf w^T \mathbf x_n)^{t_n} (1- \sigma(\mathbf w^T \mathbf x_n))^{1-t_n} $$
- Log-likelihood
$$ \log p(t_n |\mathbf x_n, \mathbf w) = t_n \log \sigma(\mathbf w^T \mathbf x_n) + (1-t_n)\log(1- \sigma(\mathbf w^T \mathbf x_n)) $$
Poisson Regression!¶
Agriculture¶
Risk modeling¶
Modeling Count data or Frequency distribution¶
- Differential expression analysis for RNAseq using Poisson mixed models
- Nucleic Acids Res. 2017 Jun 20; 45(11): e106.
- A Two-Stage Poisson Model for Testing RNA-Seq Data
- De Gruyter. May 16, 2011
- Poisson Regression Analysis of the Mortality among a Cohort of World War II Nuclear Industry Workers
- Radiation Research Society. Vol. 123, No. 2 (Aug., 1990). 138-152
- Negative-Binomial and quasi-poisson regressions between COVID-19, mobility and environment in São Paulo, Brazil
- Elsevier Environmental Research. 2022 Mar; 204: 112369.
Poisson Regression¶
- Regression technique modeling count or frequency data
- The target variable $t$ follows a Poisson distribution and $\log \mathbb E[t|\mathbf x]$ can be modeled by a linear combination of unknown parameter $\mathbf w$, i.e. $\log \mathbb E[t|\mathbf x, \mathbf w] = \mathbf w^T \mathbf x$ or $\mathbb E[t|\mathbf x, \mathbf w] = \exp(\mathbf w^T \mathbf x)$.
- $t \sim \text{Poi}(t|\mu)$, $t \in \{0,1,...\}$
$$ \text{Poi}(t|\mu) = e^{-\mu}\frac{\mu^t}{t!},\, \mu>0 $$
- Estimate $\mu$ as $\exp(\mathbf w^T \mathbf x)$.
$$ p(t_n |\mathbf x_n, \mathbf w) = \text{Poi}(t_n|\underbrace{\exp(\mathbf w^T \mathbf x_n)}_{=:y_n})=e^{-y_n}\frac{y_n^{t_n}}{t_n!} $$
Poisson distribution¶
- Gives the probability of a given number of events occurring in a fixed interval of time, assuming these events occur with a known constant mean rate and independently of the time since the last event.
- Parametrized by a positive real number $\mu$ (the expected value of $t$)
$$ p(t|\mu) = \text{Poi}(t|\mu)=e^{-\mu} \frac{\mu ^t}{t!}, \quad t=0,1,2,... $$
- $\mathbb E[t|\mu] = \text{Var} [t|\mu ] = \mu$
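As a quick numerical check of $\mathbb E[t|\mu] = \text{Var}[t|\mu] = \mu$, here is a minimal sketch using scipy.stats (the value of $\mu$ is arbitrary):
from scipy.stats import poisson

mu = 3.5
dist = poisson(mu)
print(dist.mean(), dist.var())  # both equal mu = 3.5
print(dist.pmf(2))              # P(t=2) = exp(-3.5) * 3.5**2 / 2!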
Poisson Regression¶
- Observation $\mathcal D = \{ (\mathbf x_n, t_n)_{n=1,2,...,N}\}$
- By the method of maximum likelihood, we find the weight vector $\mathbf w$ that makes the probability of the observed data as large as possible. The log-likelihood is
$$ \sum_{n=1}^N [ t_n \log(y_n) - y_n - \log(t_n !)] = \sum_{n=1}^N [ t_n \mathbf w^T \mathbf x_n - e^{\mathbf w^T \mathbf x_n} - \log(t_n !)] $$
- Since $\log(t_n!)$ does not depend on $\mathbf w$, this is equivalent to maximizing the log-likelihood
$$ L(\mathbf w):= \sum_{n=1}^N [ t_n \log(y_n) - y_n] =\sum_{n=1}^N [ t_n \mathbf w^T \mathbf x_n - e^{\mathbf w^T \mathbf x_n} ] $$
- $-L(\mathbf w)$ is a convex function. But the equation $ \nabla_{\mathbf w}L(\mathbf w)=0$ has no closed-form solution.
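Since no closed form exists, $\mathbf w$ must be found numerically. A minimal sketch of maximizing $L(\mathbf w)$ with scipy.optimize.minimize on synthetic data (unregularized, for illustration only):
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
N = 500
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, 3))])  # prepend the constant 1
w_true = np.array([0.5, 0.3, -0.2, 0.1])
t = rng.poisson(np.exp(X @ w_true))

def neg_L(w):
    # -L(w) = -sum_n [ t_n w^T x_n - exp(w^T x_n) ]; convex in w
    eta = X @ w
    return np.exp(eta).sum() - t @ eta

res = minimize(neg_L, x0=np.zeros(4), method="L-BFGS-B")
print(res.x)  # close to w_true for large N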
How to quantify performance¶
- MSE, MAE
- Poisson deviance
$$ D(\mathbf t, \mathbf y):= \sum_n d(t_n, y_n) = \sum_n 2 \left [\log p(t_n | y_n^*) - \log p(t_n | y_n) \right ] $$
where $\mathbf t = (t_1,t_2,...,t_N)^T$, $\mathbf y = (y_1,y_2,...,y_N)^T$ and $y_n$ is the predicted value for $\mathbf x_n$ and $y_n^*$ is the optimal parameter estimated by fitting the model just to the true output $t_n$.
In the case of Poisson regression, we have $y_n = \exp(\mathbf w ^T \mathbf x_n)$ and $y_n^* = t_n$. Hence
$$\begin{align*} D(\mathbf t, \mathbf y) &= 2\sum_n \left [ (t_n \log(t_n)-t_n-\log(t_n!)) - (t_n \log(y_n)-y_n-\log(t_n!)) \right ]\\ & = 2\sum_n \left [ t_n \log \frac{t_n}{y_n} + y_n - t_n \right] \end{align*}$$
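The last expression is what sklearn.metrics.mean_poisson_deviance computes, averaged over samples; a small check with made-up values:
import numpy as np
from sklearn.metrics import mean_poisson_deviance

t = np.array([3.0, 0.0, 7.0, 2.0])  # observed counts
y = np.array([2.5, 0.4, 6.0, 2.0])  # predicted means, must be > 0

# unit deviance d(t, y) = 2*[t*log(t/y) - t + y], with t*log(t/y) := 0 at t = 0
with np.errstate(divide="ignore", invalid="ignore"):
    term = np.where(t > 0, t * np.log(t / y), 0.0)
d = 2 * (term - t + y)
print(d.mean(), mean_poisson_deviance(t, y))  # identical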
Remark of Poisson deviance¶
- $\mathbf t = \mathbf y$ $\Leftrightarrow$ $D(\mathbf t, \mathbf y) = 0$
- Maximize log-likelihood $\Leftrightarrow$ Minimize Poisson deviance
Deviance¶
- Metric that elicits predicted expectation values of regression targets
Problem | Target Distribution | Target domain | Unit deviance $d(t,y)$ |
---|---|---|---|
Regression | Normal | $t\in (-\infty, \infty)$ | $(t-y)^2$ |
Binary classification | Bernoulli | $t\in\{0,1\}$ | $2 \left [ t\log\frac{t}{y}+(1-t)\log\frac{1-t}{1-y} \right ]$ |
Regression with count data | Poisson | $t\in[0, \infty)$ | $2\left [ t\log\frac{t}{y}-t+y \right ]$ |
Poisson Regression with Scikit-learn¶
PoissonRegressor¶
from sklearn.linear_model import PoissonRegressor
reg = PoissonRegressor()
RandomForestRegressor¶
from sklearn.ensemble import RandomForestRegressor
reg = RandomForestRegressor(criterion="poisson")
HistGradientBoostingRegressor¶
from sklearn.ensemble import HistGradientBoostingRegressor
reg = HistGradientBoostingRegressor(loss="poisson")
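All three estimators share the usual fit/predict API; a minimal sketch on synthetic count data (sizes and coefficients are arbitrary):
import numpy as np
from sklearn.linear_model import PoissonRegressor
from sklearn.ensemble import RandomForestRegressor, HistGradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(1000, 4))
t = rng.poisson(np.exp(1.0 + X @ np.array([0.5, -0.3, 0.2, 0.1])))

for reg in (PoissonRegressor(),
            RandomForestRegressor(criterion="poisson", random_state=0),
            HistGradientBoostingRegressor(loss="poisson", random_state=0)):
    reg.fit(X, t)
    print(type(reg).__name__, reg.predict(X[:3]))  # predictions are nonnegative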
Minimization problem¶
scipy.optimize.minimize is used with L-BFGS-B.
$$ \text{min}_{\mathbf w} \frac{1}{2N}\sum_{n=1}^N d(t_n,y_n)+\frac{\alpha}{2} \| \mathbf w \|_2^2 $$
- where $\alpha$ is the $L2$ regularization penalty and $d(t,y)=2\left [ t\log\frac{t}{y}-t+y \right ]$ is the Poisson deviance.
- Cholesky based Newton solver: PR #23314
- Newton-LSMR: PR #23507 (New in version 1.2.)
Regularization by Default!¶
PoissonRegressor(alpha=1.0)
RandomForestRegressor¶
class sklearn.ensemble.RandomForestRegressor(n_estimators=100, *, criterion='squared_error', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=1.0, max_leaf_nodes=None, min_impurity_decrease=0.0, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, ccp_alpha=0.0, max_samples=None)
- criterion : {“squared_error”, “absolute_error”, “friedman_mse”, “poisson”}, default=”squared_error”
- criterion="poisson" (new in version 1.0)
Mathematical formulation of Random forest¶
- $\mathcal D = \{ (\mathbf x_n, t_n) \}_{n=1}^{N}$, $\mathbf x_n \in \mathbf R^d$, $t_n \in \mathbf R$
- A decision tree recursively partitions the input (feature) space so that samples with the same labels or similar target values are grouped together.
- $Q_m$: the data at the node $m$ with $n_m$ samples
- a candidate split $\theta=(j, s_m)$ consists of a feature index $j \in \{1,...,d\}$ and a threshold $s_m \in \mathbf R$
- $\theta$ splits $Q_m$ into $Q_m^{left}(\theta)$ and $Q_m^{right}(\theta)$ as
$$ Q_m^{left}(\theta):=\{ (\mathbf x, t) | x_j \leq s_m\},\, \text{where } \mathbf x = (x_1,x_2,...,x_d)^T $$
$$ Q_m^{right}(\theta):= Q_m \setminus Q_m^{left}(\theta) $$
$$ G(Q_m, \theta):= \frac{n_m^{left}}{n_m}H(Q_m^{left}(\theta)) + \frac{n_m^{right}}{n_m}H(Q_m^{right}(\theta)) $$
- where $H$ is called an impurity function.
- Select parameters $\theta$ that minimize impurity $G$
$$ \theta ^* = \text{argmin}_{\theta} G(Q_m, \theta) $$
Regression criteria of RandomForestRegressor¶
criterion="squared_error"
$$ \bar{t}_m:= \frac{1}{n_m} \sum_{t\in Q_m} t $$
$$ H(Q_m):= \frac{1}{n_m}\sum_{t\in Q_m} (t- \bar{t}_m)^2 $$
criterion="poisson"
$$ H(Q_m):=\frac{1}{n_m}\sum_{t\in Q_m}\left ( t \log\frac{t}{\bar{t}_m} - t + \bar{t}_m \right ) = \frac{1}{2n_m} \sum_{t\in Q_m} d(t, \bar{t}_m) $$
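A sketch of both impurities computed on one node's targets in plain NumPy (the values are arbitrary):
import numpy as np

t_node = np.array([0.0, 1.0, 1.0, 2.0, 4.0, 9.0])  # targets in node m
t_bar = t_node.mean()

h_mse = np.mean((t_node - t_bar) ** 2)  # criterion="squared_error"

# criterion="poisson": half the mean unit deviance around the node mean,
# with the convention t*log(t/t_bar) := 0 when t = 0
with np.errstate(divide="ignore", invalid="ignore"):
    term = np.where(t_node > 0, t_node * np.log(t_node / t_bar), 0.0)
h_poisson = np.mean(term - t_node + t_bar)

print(h_mse, h_poisson)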
HistGradientBoostingRegressor¶
class sklearn.ensemble.HistGradientBoostingRegressor(loss='squared_error', *, quantile=None, learning_rate=0.1, max_iter=100, max_leaf_nodes=31, max_depth=None, min_samples_leaf=20, l2_regularization=0.0, max_bins=255, categorical_features=None, monotonic_cst=None, interaction_cst=None, warm_start=False, early_stopping='auto', scoring='loss', validation_fraction=0.1, n_iter_no_change=10, tol=1e-07, verbose=0, random_state=None)
Histogram-based Gradient Boosting Regression Tree (new in version 0.21)
loss : {‘squared_error’, ‘absolute_error’, ‘gamma’, ‘poisson’, ‘quantile’}, default=’squared_error’
London Bike Sharing Dataset 🚲¶
cnt | t1 | hum | wind_speed | weather_code | is_holiday | is_weekend | season | hour |
---|---|---|---|---|---|---|---|---|
182 | 3 | 93 | 6 | 3 | 0 | 1 | 3 | 0 |
138 | 3 | 93 | 5 | 1 | 0 | 1 | 3 | 1 |
134 | 2.5 | 96.5 | 0 | 1 | 0 | 1 | 3 | 2 |
72 | 2 | 100 | 0 | 1 | 0 | 1 | 3 | 3 |
47 | 2 | 93 | 6.5 | 1 | 0 | 1 | 3 | 4 |
- Historical data for bike sharing in London, 'Powered by TfL Open Data' (from 1/1/2015 to 31/12/2016)
- About data
Metadata¶
- cnt - the count of new bike shares
- t1 - real temperature in C
- hum - humidity in percentage
- wind_speed - wind speed in km/h
- weather_code - category of the weather
- is_holiday - boolean field - 1 holiday / 0 non holiday
- is_weekend - boolean field - 1 if the day is weekend
- season - category field, meteorological seasons: 0-spring; 1-summer; 2-fall; 3-winter.
- weather_code category description: 1 = Clear; mostly clear but may have haze/fog/patches of fog/fog in vicinity; 2 = scattered clouds / few clouds; 3 = Broken clouds; 4 = Cloudy; 7 = Rain / light rain shower / light rain; 10 = rain with thunderstorm; 26 = snowfall; 94 = Freezing fog
Data preprocessing¶
SplineTransformer¶
from sklearn.preprocessing import SplineTransformer
SplineTransformer(n_knots=13, extrapolation="periodic")
- One way to add nonlinear terms, instead of pure polynomials of the features, is to generate spline basis functions for each feature with the SplineTransformer.
- Splines are piecewise polynomials, parametrized by their polynomial degree and the positions of the knots. The SplineTransformer implements a B-spline basis.
- Generate univariate B-spline bases for features.
- Generates a new feature matrix consisting of n_splines = n_knots + degree - 1 (n_knots - 1 for extrapolation="periodic") spline basis functions (B-splines) of polynomial order degree for each feature. Read more
Spline Transformer¶
Hour | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.17 | 0.67 | 0.17 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 0.02 | 0.48 | 0.48 | 0.02 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
$\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |
22 | 0.67 | 0.17 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.17 |
23 | 0.48 | 0.48 | 0.02 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.02 |
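The table above can be reproduced directly. With the default degree=3 and extrapolation="periodic", n_knots=13 yields n_knots - 1 = 12 basis functions:
import numpy as np
from sklearn.preprocessing import SplineTransformer

hours = np.arange(24).reshape(-1, 1)
spline = SplineTransformer(n_knots=13, extrapolation="periodic")
features = spline.fit_transform(hours)
print(features.shape)        # (24, 12): one row per hour, 12 B-spline bases
print(features[0].round(2))  # the row for hour 0, cf. the table above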
ColumnTransformer¶
- Applies transformers to columns of an array or pandas DataFrame.
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, MinMaxScaler

categorical = ["is_holiday", "weather_code", "is_weekend", "season"]
preprocessor = ColumnTransformer([
    ("cyclic_hour", SplineTransformer(n_knots=13, extrapolation="periodic"), ["hour"]),
    ("categorical", OneHotEncoder(handle_unknown="ignore"), categorical),
], remainder=MinMaxScaler())
preprocessor
ColumnTransformer(remainder=MinMaxScaler(), transformers=[('cyclic_hour', SplineTransformer(extrapolation='periodic', n_knots=13), ['hour']), ('categorical', OneHotEncoder(handle_unknown='ignore'), ['is_holiday', 'weather_code', 'is_weekend', 'season'])])
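A quick shape check of the fitted preprocessor, assuming X is the bike-sharing DataFrame with the columns listed in the metadata:
# X: bike-sharing DataFrame (hour, t1, hum, wind_speed, weather_code, ...)
Xt = preprocessor.fit_transform(X)
# 12 spline bases for "hour" + one-hot columns for the 4 categorical
# features + min-max scaled remainder (t1, hum, wind_speed)
print(Xt.shape)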
Evaluation: Cross validation¶
TimeSeriesSplit¶
from sklearn.model_selection import TimeSeriesSplit
cv = TimeSeriesSplit(n_splits=50, max_train_size=10_000)
- Time Series cross-validator
- Training set size: i * n_samples // (n_splits + 1) + n_samples % (n_splits + 1) in the i-th split
- Test set size: n_samples // (n_splits + 1) by default, where n_samples is the number of samples.
- Read more
TimeSeriesSplit¶
- $N$ can be uniquely decomposed as
$$ N = dq + r $$
- $d=$ n_splits + 1 (divisor)
- $q=N//d$ (quotient)
- $r=N\%d$ (remainder)
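A sketch making the decomposition concrete for a small $N$ (the numbers are illustrative):
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

N, n_splits = 17, 4  # d = 5, q = 17 // 5 = 3, r = 17 % 5 = 2
cv_demo = TimeSeriesSplit(n_splits=n_splits)
for i, (train, test) in enumerate(cv_demo.split(np.arange(N)), start=1):
    print(i, len(train), len(test))
# train sizes i*q + r = 5, 8, 11, 14; test size q = 3 in every split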
Cross validation¶
sklearn.model_selection.cross_validate(estimator, X, y=None, *, groups=None, scoring=None, cv=None, n_jobs=None, verbose=0, fit_params=None, pre_dispatch='2*n_jobs', return_train_score=False, return_estimator=False, return_indices=False, error_score=nan)
- Evaluate metric(s) by cross-validation and also record fit/score times.
- Read more
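A sketch of the evaluation loop, reusing the cv splitter from above and assuming X and y are the bike-sharing features and counts; the estimator here is the PoissonRegressor pipeline defined below, and both scorer strings are scikit-learn built-ins:
from sklearn.model_selection import cross_validate

scores = cross_validate(
    poisson, X, y, cv=cv,  # e.g. the pipeline from the PoissonRegressor slide
    scoring=["neg_root_mean_squared_error", "neg_mean_poisson_deviance"],
)
print(-scores["test_neg_root_mean_squared_error"].mean())  # RMSE
print(-scores["test_neg_mean_poisson_deviance"].mean())    # Poisson deviance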
Scikit-learn tools for Poisson Regression¶
Ridge¶
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
ridge = make_pipeline(preprocessor, Ridge())
$$ \text{min}_{\mathbf w} \frac{1}{2N}\sum_{n=1}^N d(t_n,y_n)+\frac{\alpha}{2} \| \mathbf w \|_2^2 $$
- where $d(t,y)=(t-y)^2$, $y_n = \mathbf w ^T \mathbf x_n$ (Ridge uses the identity link, not $\exp$) and $\alpha$ is the $L2$ regularization penalty.
- Test result
RMSE | Poisson Deviance | |
---|---|---|
Ridge | 594.25 | 272.48 |
PoissonRegressor¶
from sklearn.linear_model import PoissonRegressor
poisson = make_pipeline(preprocessor, PoissonRegressor(max_iter=300))
$$ \text{min}_{\mathbf w} \frac{1}{2N}\sum_{n=1}^N d(t_n,y_n)+\frac{\alpha}{2} \| \mathbf w \|_2^2 $$
- where $d(t,y)=2\left [ t\log\frac{t}{y}-t+y \right ]$, $y_n = \exp(\mathbf w ^T \mathbf x_n)$ and $\alpha$ is the $L2$ regularization penalty.
- Test result
RMSE | Poisson Deviance | |
---|---|---|
PoissonRegressor | 527.773 | 193.017 |
Ridge | 594.25 | 272.48 |
PoissonRegressor vs Ridge¶
- Results: errors
PoissonRegressor vs Ridge¶
- Distributions
Calibration¶
PoissonRegressor vs Ridge¶
- Calibration
RandomForestRegressor with Poisson¶
from sklearn.ensemble import RandomForestRegressor
rf_poisson = RandomForestRegressor(criterion="poisson", random_state=0, min_samples_leaf=40)
rf = RandomForestRegressor(random_state=0, min_samples_leaf=40)
- Test result
RMSE | Poisson Deviance | |
---|---|---|
RF (Poisson) | 306.235 | 67.788 |
RF | 401.012 | 114.321 |
PoissonRegressor | 527.773 | 193.017 |
Ridge | 594.25 | 272.48 |
Random Forest¶
- Results: errors
Random Forest¶
- Distributions
Random Forest¶
- Calibration
HistGradientBoosting with Poisson¶
from sklearn.ensemble import HistGradientBoostingRegressor
hist_poisson = HistGradientBoostingRegressor(random_state=0, loss="poisson", max_depth=4)
hist = HistGradientBoostingRegressor(random_state=0, max_depth=4)
- Test result
RMSE | Poisson Deviance | |
---|---|---|
HGRF (Poisson) | 292.367 | 60.666 |
HGRF | 309.892 | 74.369 |
RF (Poisson) | 306.235 | 67.788 |
RF | 401.012 | 114.321 |
PoissonRegressor | 527.773 | 193.017 |
Ridge | 594.25 | 272.48 |
Histogram-based Gradient Boosting Regression¶
- Results: errors
Histogram-based Gradient Boosting Regression¶
- Distributions
Histogram-based Gradient Boosting Regression¶
- Calibration
Prediction for a test set¶
- The last TimeSeriesSplit fold
- # of train data: 10_000
- # of test data: 341
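A sketch of extracting that final fold, assuming X and y are the pandas features/target and cv is the TimeSeriesSplit from above:
train_idx, test_idx = list(cv.split(X))[-1]  # the last of the 50 splits
X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]
print(len(train_idx), len(test_idx))  # 10_000 (max_train_size) and 341 here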
Zero-Inflated Poisson Regression¶
ZeroInflatedRegressor¶
from sklego.meta import ZeroInflatedRegressor
from sklearn.ensemble import HistGradientBoostingClassifier

poisson_zero = ZeroInflatedRegressor(
    classifier=HistGradientBoostingClassifier(random_state=0),
    regressor=HistGradientBoostingRegressor(loss="poisson"))
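- How it works: the classifier first predicts whether a sample's target is zero; the regressor is trained only on the samples with non-zero targets, and its output is used only where the classifier predicts a non-zero count. This suits targets with many exact zeros, such as hourly bike counts at night.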
- Test result
RMSE | Poisson Deviance | |
---|---|---|
ZHGRF (Poisson) | 283.730 | 54.449 |
HGRF (Poisson) | 292.367 | 60.666 |
HGRF | 309.892 | 74.369 |
RF (Poisson) | 306.235 | 67.788 |
RF | 401.012 | 114.321 |
PoissonRegressor | 527.773 | 193.017 |
Ridge | 594.25 | 272.48 |
ZeroInflatedRegressor¶
- Result: errors
ZeroInflatedRegressor¶
- Distributions