Modeling counts 1

Poisson and Negative Binomial

Aleš Vomáčka

Counts are Pretty Common

Number of absences in school per year
Number of minutes on the internet per day
Number of purchases per year
Number of days since last job

Poisson Distribution

There are several options for modeling counts. The old-school choice is the Poisson distribution.
It assumes that events happen independently at a constant rate over a fixed amount of time.
An important implication is that the Poisson distribution assumes the mean and the variance are equal (often problematic in practice).
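The mean-equals-variance property can be checked numerically. A minimal sketch, assuming a hypothetical rate of 4 events per time window (the rate is made up for illustration):

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the expected count is lam."""
    # Work on the log scale so large k does not overflow the factorial.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

lam = 4.0            # hypothetical rate, not taken from the slides
ks = range(0, 200)   # truncate the infinite sum; the tail beyond 200 is negligible here

mean = sum(k * poisson_pmf(k, lam) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print(f"mean = {mean:.6f}, variance = {variance:.6f}")
```

Both values come out equal to the rate, up to floating-point error.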

Poisson Regression

\[ Y \sim Poisson(\lambda), \quad log(\lambda) = \beta \cdot X \]

We are using a log link. Why?
Counts are bounded below at zero but have no upper bound, and the log link keeps the predicted mean positive.
E.g. predicting school absences with math scores and training program type:
\[ absences \sim Poisson(\lambda), \quad log(\lambda) = \beta_0 + \beta_1 \cdot math + \beta_2 \cdot program \]

Interpreting Poisson models

Coefficients are in logs. Example:
- A 1 unit increase in math score is associated with -0.01 log days of absence.
- Students in the vocational program have on average -1.28 log days of absence compared to the ones in the general program.

	Coefficient
(Intercept)	2.65
Program: General	-
Program: Academic	-0.44
Program: Vocational	-1.28
Math score	-0.01
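Log-scale coefficients become predicted counts once the log link is inverted. A minimal sketch using the coefficients from the table; the math score of 50 and the function name are assumptions made for illustration, not values from the original data:

```python
import math

# Coefficients copied from the table above (log scale).
INTERCEPT = 2.65
BETA_MATH = -0.01
PROGRAM_EFFECT = {"general": 0.0, "academic": -0.44, "vocational": -1.28}  # general = reference

def predicted_absences(math_score: float, program: str) -> float:
    """Expected days of absence: exponentiate the linear predictor (inverse log link)."""
    log_mean = INTERCEPT + BETA_MATH * math_score + PROGRAM_EFFECT[program]
    return math.exp(log_mean)

math_score = 50  # hypothetical student
for program in PROGRAM_EFFECT:
    print(program, round(predicted_absences(math_score, program), 2))
```

Whatever math score is plugged in, the vocational prediction is always the general prediction multiplied by exp(-1.28), which is why the exponentiated coefficients are easier to read than the raw ones.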

Interpreting Poisson models - The Better Way

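We can exponentiate the coefficients to get multiplicative changes (rate ratios) in natural units.
- A 1 unit increase in math score multiplies the expected days of absence by 0.99, i.e. a 1% decrease.
- Students in the vocational program have 0.28 times the expected absences of those in the general program, i.e. 72% fewer.
Reading the raw coefficient directly as a percentage change (-0.01 as roughly -1%) is an approximation that only works for small changes, i.e. when the exponentiated coefficients are close to 1 (e.g. between 0.9 and 1.1).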

	Exp(Coefficient)
(Intercept)	14.18
Program: General	-
Program: Academic	0.64
Program: Vocational	0.28
Math score	0.99
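The exponentiated table can be reproduced from the log-scale coefficients. A minimal sketch; the variable names are illustrative, and the intercept comes out slightly below 14.18, presumably because the coefficients shown on the slides are rounded:

```python
import math

# Log-scale coefficients from the earlier table.
coefficients = {
    "(Intercept)": 2.65,
    "Program: Academic": -0.44,
    "Program: Vocational": -1.28,
    "Math score": -0.01,
}

# exp(coefficient) is the multiplicative change in the expected count.
rate_ratios = {name: math.exp(b) for name, b in coefficients.items()}

for name, ratio in rate_ratios.items():
    if name == "(Intercept)":
        print(f"{name:22s} exp(b) = {ratio:.2f}  (baseline expected count)")
    else:
        percent_change = (ratio - 1) * 100
        print(f"{name:22s} exp(b) = {ratio:.2f}  ({percent_change:+.0f}% change)")
```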

Interpreting Poisson models - The Best(?) Way

You can also use (average) marginal effects the way you are used to.
- On average, students in the vocational program have 7.5 fewer days of absence than those in the general program.
- On average, a one point increase in the math test is associated with 0.04 fewer days of absence.

	Average Marginal Effect
Academic - General	-3.69
Vocational - General	-7.49
Math score	-0.04
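Average marginal effects are computed on the count scale, one student at a time, and then averaged. A minimal sketch of the mechanics, assuming the slide coefficients and a handful of hypothetical students; the data behind the slides are not available here, so the numbers will not match the table above:

```python
import math

# Coefficients from the slides (log scale); "general" is the reference program.
INTERCEPT = 2.65
BETA_MATH = -0.01
PROGRAM_EFFECT = {"general": 0.0, "academic": -0.44, "vocational": -1.28}

# Hypothetical students as (math score, program); NOT the data behind the slides.
students = [(35, "general"), (42, "vocational"), (48, "academic"),
            (55, "general"), (61, "academic"), (70, "vocational")]

def expected_absences(math_score: float, program: str) -> float:
    """Predicted count: exponentiate the linear predictor."""
    return math.exp(INTERCEPT + BETA_MATH * math_score + PROGRAM_EFFECT[program])

# AME of vocational vs general: switch everyone's program, average the difference.
ame_vocational = sum(
    expected_absences(m, "vocational") - expected_absences(m, "general")
    for m, _ in students
) / len(students)

# AME of math score: the slope on the count scale is beta_math * mu for each student.
ame_math = sum(BETA_MATH * expected_absences(m, p) for m, p in students) / len(students)

print(f"AME vocational vs general: {ame_vocational:.2f} days")
print(f"AME math score: {ame_math:.3f} days per point")
```

Because the effect on the count scale depends on each student's predicted count, the average depends on who is in the sample, unlike the exponentiated coefficients.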

Poisson model visually

Questions?

Over- and Under-dispersion

Poisson regression assumes the mean and variance are equal.
If not, inference suffers: under overdispersion the standard errors are typically too small, and the point estimates can be affected too!
In practice, variance is usually higher than the mean (overdispersion). Rarely, it’s lower (underdispersion).
(A) solution: use the negative binomial distribution instead.

Negative Binomial distribution

Similar to Poisson, but mean and variance are decoupled: the amount of overdispersion is estimated from the data. (The standard negative binomial cannot represent underdispersion.)
If the mean and variance are actually the same, both models give the same results.

	Poisson	Negative Binomial
(Intercept)	2.65	2.62
Program: General	-	-
Program: Academic	-0.44	-0.44
Program: Vocational	-1.28	-1.28
Math score	-0.01	-0.01
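Why the two columns agree so closely can be seen from the mean-variance relationship. In one common parameterisation (often called NB2; the dispersion symbol \(\theta\) is an assumption here, not notation from these slides):

\[ E(Y) = \mu, \qquad Var(Y) = \mu + \frac{\mu^2}{\theta} \]

As \(\theta\) grows, the extra term \(\mu^2 / \theta\) shrinks towards zero and the variance approaches the mean, which is the Poisson case.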

Poisson vs Negative Binomial

Which one to use? Just use the negative binomial.
The results will be the same or better.

So why learn about Poisson distribution?

It Was Poisson All Along

You have a contingency table, how do you test for a relationship between the two variables?

	Elementary	Highschool (without diploma)	Highschool (with diploma)	University
Male	85	407	384	201
Female	105	396	619	272

That’s right, a chi-squared test.
Result: \(\chi^2\) test p value = 3.23e-06
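The test can be computed by hand from the table. A minimal sketch in plain Python; the helper `chi2_sf_df3` is a hypothetical name, and it uses a closed form that is valid only for 3 degrees of freedom, which is what a 2 x 4 table has:

```python
import math

# Observed counts from the contingency table (columns: education levels).
observed = [
    [85, 407, 384, 201],   # Male
    [105, 396, 619, 272],  # Female
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson statistic: sum of (observed - expected)^2 / expected,
# with expected counts computed under independence.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(col_totals) - 1)

def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability of a chi-squared variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_sf_df3(chi2)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3g}")
```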

It Was Poisson All Along

Now imagine we transformed our data into long format:

Gender	Education	Freq
Male	Elementary	85
Female	Elementary	105
Male	Highschool (without diploma)	407
Female	Highschool (without diploma)	396
Male	Highschool (with diploma)	384
Female	Highschool (with diploma)	619
Male	University	201
Female	University	272
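The reshaping itself is mechanical: each cell of the table becomes one row. A minimal sketch in plain Python (the variable names are illustrative):

```python
educations = [
    "Elementary",
    "Highschool (without diploma)",
    "Highschool (with diploma)",
    "University",
]

# Wide format: one row of counts per gender, one column per education level.
wide = {
    "Male": [85, 407, 384, 201],
    "Female": [105, 396, 619, 272],
}

# Long format: one (Gender, Education, Freq) record per table cell.
long_rows = [
    (gender, education, wide[gender][j])
    for j, education in enumerate(educations)
    for gender in wide
]

for record in long_rows:
    print(record)
```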

And fitted a Poisson model with an interaction:

\[ Freq \sim Poisson(\lambda), \quad log(\lambda) = Gender + Education + Gender \cdot Education \]

It Was Poisson All Along

We can then test whether adding the interaction led to a statistically significant improvement in model fit.

Term	Df	Deviance	Resid. Df	Resid. Dev	Pr(>Chi)
NULL	NA	NA	7	765.366	NA
Education	3	696.833	4	68.533	1.02e-150
Gender	1	40.298	3	28.235	2.18e-10
Education:Gender	3	28.235	0	0.000	3.24e-06

The p value from the Poisson model is 3.24e-06; the \(\chi^2\) one was 3.23e-06. They are (practically) the same!
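The interaction row of the deviance table can be reproduced without fitting anything. A minimal sketch, relying on two facts: for a two-way table, the Poisson model with main effects only has fitted values row total x column total / N, and the model with the interaction fits the table perfectly (residual deviance 0). The drop in deviance is therefore the deviance of the main-effects model. The helper `chi2_sf_df3` is a hypothetical name and is valid only for 3 degrees of freedom:

```python
import math

observed = [
    [85, 407, 384, 201],   # Male
    [105, 396, 619, 272],  # Female
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Poisson deviance of the main-effects model: 2 * sum(obs * log(obs / fitted)).
# The (obs - fitted) term of the general formula sums to zero here.
deviance = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        fitted = row_totals[i] * col_totals[j] / n
        deviance += 2 * obs * math.log(obs / fitted)

def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability of a chi-squared variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_sf_df3(deviance)
print(f"deviance = {deviance:.3f}, p = {p_value:.3g}")
```

The only difference from the Pearson statistic is how the gap between observed and fitted counts is measured: a log ratio here, a squared difference there.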
Why?

It Was Poisson All Along

The \(\chi^2\) test is an approximate way to compute the Poisson regression test of the interaction (invented before Poisson regression even existed)!
Useful to know for two reasons:
- It’s cool.
- It gives us a good way to model categorical data.

Unlike the \(\chi^2\) test, Poisson regression:
- Can control for numerical variables
- Can include more than 2 categorical variables
- Is a good way to model categorical data in general

Questions?

Assumptions for Poisson/Negative Binomial

The same old story.
Linearity between the log of the expected counts and the predictors.
The conditional distribution follows a Poisson/Negative Binomial distribution.
- For Poisson, this means no over/underdispersion.

Questions?

InteRmezzo!