Poisson regression

In statistics, Poisson regression is a form of regression analysis used to model count data and contingency tables. Poisson regression assumes the response variable "Y" has a Poisson distribution, and assumes the logarithm of its expected value can be modelled by a linear combination of unknown parameters. A Poisson regression model is sometimes known as a log-linear model, especially when used to model contingency tables.

In the simplest case with a single independent variable "x", the model takes the form:

:$\log(\operatorname{E}(Y)) = a + bx.$

If $Y_i$ are independent observations with corresponding values $x_i$ of the predictor variable, then "a" and "b" can be estimated by maximum likelihood if the number of distinct "x" values is at least 2. The maximum-likelihood estimates lack a closed-form expression and must be found by numerical methods.
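As an illustration of the numerical step, the following is a minimal sketch of Newton-Raphson iteration for the single-predictor model, using only the Python standard library. The function name `fit_poisson`, the starting values, and the data are assumptions made for this example, not part of any particular package. With fewer than two distinct "x" values the 2×2 information matrix is singular and the division fails, which matches the condition stated above.

```python
import math

def fit_poisson(xs, ys, iterations=50, tol=1e-10):
    """Estimate (a, b) in log(E(Y)) = a + b*x by Newton-Raphson."""
    # Start from the intercept-only fit: a = log(mean of y), b = 0.
    a = math.log(sum(ys) / len(ys))
    b = 0.0
    for _ in range(iterations):
        mus = [math.exp(a + b * x) for x in xs]
        # Score vector: derivatives of the log-likelihood w.r.t. a and b.
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum(x * (y - m) for x, y, m in zip(xs, ys, mus))
        # Information matrix (negative Hessian of the log-likelihood).
        h00 = sum(mus)
        h01 = sum(x * m for x, m in zip(xs, mus))
        h11 = sum(x * x * m for x, m in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# Hypothetical data, for illustration only.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 4, 9, 15]
a_hat, b_hat = fit_poisson(xs, ys)
```

At the maximum-likelihood estimates the fitted means reproduce the observed total count and the observed sum of x·y, which gives a simple check that the iteration has converged.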

Poisson regression models are generalized linear models with the logarithm as the (canonical) link function and the Poisson distribution as the assumed probability distribution of the response.

Poisson regression in practice

Poisson regression is appropriate when the dependent variable is a count, for instance of events such as the arrival of a telephone call at a call centre. The events must be independent in the sense that the arrival of one call will not make another more or less likely, but the probability per unit time of events is understood to be related to covariates such as time of day.

"Exposure" and offset

Poisson regression is also appropriate for rate data, where the rate is a count of events occurring to a particular unit of observation, divided by some measure of that unit's "exposure". For example, biologists may count the number of tree species in a forest, and the rate would be the number of species per square kilometre. Demographers may model death rates in geographic areas as the count of deaths divided by person-years. More generally, event rates can be calculated as events per unit time, which allows the observation window to vary for each unit. In these examples, exposure is respectively unit area, person-years and unit time. In Poisson regression this is handled as an offset: the logarithm of the exposure variable enters on the right-hand side of the equation, but with its coefficient constrained to 1 rather than estimated.

:$\log(\operatorname{E}(Y)) = \log(\mbox{exposure}) + a + bx$

which implies

:$\log(\operatorname{E}(Y)) - \log(\mbox{exposure}) = \log\left(\frac{\operatorname{E}(Y)}{\mbox{exposure}}\right) = a + bx$
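The effect of the offset can be seen in a small sketch. The coefficient values, exposures and counts below are hypothetical, and the helper name `expected_count` is an assumption for this example. The first part shows that doubling the exposure doubles the expected count while leaving the rate unchanged; the second uses the intercept-only case, where the maximum-likelihood estimate happens to have a closed form, namely the logarithm of total events divided by total exposure.

```python
import math

def expected_count(exposure, x, a, b):
    """E(Y) under log(E(Y)) = log(exposure) + a + b*x."""
    return math.exp(math.log(exposure) + a + b * x)

# Illustrative coefficients, not estimated from any data.
a, b = -2.0, 0.5
e1 = expected_count(10.0, 1.0, a, b)
e2 = expected_count(20.0, 1.0, a, b)
rate1 = e1 / 10.0   # expected events per unit of exposure
rate2 = e2 / 20.0   # same rate, despite twice the exposure

# Intercept-only model with an offset: hypothetical deaths and person-years.
deaths = [3, 7, 12]
person_years = [1000.0, 2500.0, 4000.0]
a_only = math.log(sum(deaths) / sum(person_years))
pooled_rate = math.exp(a_only)  # deaths per person-year
```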

Overdispersion

A characteristic of the Poisson distribution is that its mean is equal to its variance. In certain circumstances, it will be found that the observed variance is greater than the mean; this is known as overdispersion and indicates that the model is not appropriate. A common reason is the omission of relevant explanatory variables.
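One common informal check, sketched below, is the Pearson dispersion statistic: the sum of squared residuals scaled by the fitted means, divided by the residual degrees of freedom. Values well above 1 suggest overdispersion. The function name `pearson_dispersion` and the counts are assumptions for this example, and the fitted means here come from an intercept-only model (every fitted mean equals the sample mean).

```python
def pearson_dispersion(ys, mus, n_params):
    """Pearson chi-square divided by residual degrees of freedom."""
    chi2 = sum((y - m) ** 2 / m for y, m in zip(ys, mus))
    return chi2 / (len(ys) - n_params)

# Hypothetical counts, for illustration only.
counts = [0, 0, 1, 1, 2, 3, 5, 8, 12, 18]
mean = sum(counts) / len(counts)

# Intercept-only fit: one estimated parameter, all fitted means equal.
dispersion = pearson_dispersion(counts, [mean] * len(counts), n_params=1)
```

For these counts the statistic is far above 1, so a plain Poisson model with a single mean would understate the variability.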

Another common problem with Poisson regression is excess zeros: if there are two processes at work, one determining whether there are zero events or any events, and a Poisson process determining how many events there are, there will be more zeros than a Poisson regression would predict. An example would be the distribution of cigarettes smoked in an hour by members of a group where some individuals are non-smokers.
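A rough way to see excess zeros is to compare the observed share of zeros with the share a Poisson distribution with the same mean would predict, exp(−mean). The sketch below does this for hypothetical data in the spirit of the smoking example; the function name `zero_excess` and the counts are assumptions for illustration, and the comparison uses a single overall mean rather than a fitted regression.

```python
import math

def zero_excess(counts):
    """Return (observed zero fraction, Poisson-predicted zero fraction)."""
    n = len(counts)
    mean = sum(counts) / n
    observed = sum(1 for c in counts if c == 0) / n
    expected = math.exp(-mean)  # P(Y = 0) for a Poisson with this mean
    return observed, expected

# Hypothetical cigarettes smoked in an hour: 12 non-smokers (always 0)
# followed by 8 smokers.
counts = [0] * 12 + [0, 1, 1, 2, 2, 3, 1, 2]
observed, expected = zero_excess(counts)
```

Here the observed fraction of zeros exceeds the Poisson prediction, which is the pattern described above.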

Other generalized linear models such as the negative binomial model may function better in these cases.

Use in survival analysis

Algorithms and software for Poisson regression are sometimes used as a computational shortcut in survival analysis: see proportional hazards models.

Implementations

Some statistics packages, such as gretl or EViews, include implementations of Poisson regression.

Wikimedia Foundation. 2010.
