Negative multinomial distribution

notation: \textrm{NM}(k_0,\,p)
parameters: k_0 \in \mathbb{N}_0, the number of failures before the experiment is stopped,
p \in \mathbb{R}^m, the m-vector of “success” probabilities,
p_0 = 1 - (p_1 + \ldots + p_m), the probability of a “failure”.
support: k_i \in \{0,1,2,\ldots\}, 1\leq i\leq m
pmf: \Gamma\!\left(\sum_{i=0}^m{k_i}\right)\frac{p_0^{k_0}}{\Gamma(k_0)} \prod_{i=1}^m{\frac{p_i^{k_i}}{k_i!}},
where Γ(x) is the Gamma function.
mean:  \tfrac{k_0}{p_0}\,p
variance:  \tfrac{k_0}{p_0^2}\,pp' + \tfrac{k_0}{p_0}\,\operatorname{diag}(p)
cf: \bigg(\frac{p_0}{1 - p'e^{it}}\bigg)^{\!k_0}

In probability theory and statistics, the negative multinomial distribution is a generalization of the negative binomial distribution (NB(r, p)) to more than two outcomes.[1]

Suppose we have an experiment that generates m+1≥2 possible outcomes, {X0,…,Xm}, each occurring with non-negative probabilities {p0,…,pm} respectively. If sampling proceeded until n observations were made, then {X0,…,Xm} would have been multinomially distributed. However, if the experiment is stopped once X0 reaches the predetermined value k0, then the distribution of the m-tuple {X1,…,Xm} is negative multinomial.
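The stopping rule described above can be sketched as a small simulation. This is only an illustration: the function name, the seed, and the parameter values (k0 = 10, p = (0.2, 0.1, 0.2)) are assumptions chosen for the example, not part of any library.

```python
import random

def sample_negative_multinomial(k0, p, rng):
    """Draw one vector (X1, ..., Xm) by running trials until outcome 0 has occurred k0 times."""
    p0 = 1.0 - sum(p)
    weights = [p0] + list(p)              # index 0 is the "failure" outcome X0
    outcomes = list(range(len(weights)))
    counts = [0] * len(weights)
    while counts[0] < k0:
        i = rng.choices(outcomes, weights=weights)[0]
        counts[i] += 1
    return counts[1:]                     # drop X0, which equals k0 when sampling stops

rng = random.Random(0)                    # fixed seed, chosen arbitrarily
k0, p = 10, [0.2, 0.1, 0.2]               # hypothetical parameter values
draws = [sample_negative_multinomial(k0, p, rng) for _ in range(10000)]
sample_means = [sum(d[j] for d in draws) / len(draws) for j in range(len(p))]
print(sample_means)                       # expected to be near k0 * p_i / p0 = 4, 2, 4
```

With enough draws, the sample means should settle close to k0·p_i/p0, the mean given in the summary box.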

Negative multinomial distribution example

The table below shows an example of 400 melanoma (skin cancer) patients, with the type and site of the cancer recorded for each subject.

Type \ Site                     Head and Neck  Trunk  Extremities  Totals
Hutchinson's melanomic freckle             22      2           10      34
Superficial                                16     54          115     185
Nodular                                    19     33           73     125
Indeterminant                              11     17           28      56
Column Totals                              68    106          226     400

The sites (locations) of the cancer may be independent, but there may be positive dependencies among the types of cancer at a given site. For example, localized exposure to radiation implies that an elevated level of one type of cancer at a given location may indicate a higher level of another cancer type at the same location. The negative multinomial distribution may be used to model the cancer rates at each site and to help measure some of the dependencies among cancer types within each location.

If x_{i,j} denote the cancer rates for each site (0\leq i \leq 2) and each type of cancer (0\leq j \leq 3), then for a fixed site (i_0) the cancer rates are independent negative multinomial distributed random variables. That is, for each column index (site) the column-vector X has the following distribution:

X = \{X_1, X_2, X_3\} \sim NM(k_0, \{p_1, p_2, p_3\}).

Different columns in the table (sites) are considered to be different instances of the negative multinomially distributed random vector X. Then we have the following estimates of expected counts (frequencies of cancer):

\hat{\mu}_{i,j} = \frac{x_{i,.}\times x_{.,j}}{x_{.,.}}
x_{i,.} = \sum_{j=0}^{3}{x_{i,j}}
x_{.,j} = \sum_{i=0}^{2}{x_{i,j}}
x_{.,.} = \sum_{i=0}^{2}\sum_{j=0}^{3}{{x_{i,j}}}
Example (Head and Neck site, Hutchinson's melanomic freckle type): \hat{\mu}_{0,0} = \frac{x_{0,.}\times x_{.,0}}{x_{.,.}}=\frac{68\times 34}{400}=5.78
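As a minimal sketch, the expected counts can be computed directly from the marginal totals of the table. The variable names below are illustrative; the counts are those of the melanoma table above.

```python
# Counts from the melanoma table: keys are cancer types, list entries are sites.
sites = ["Head and Neck", "Trunk", "Extremities"]
counts = {
    "Hutchinson's melanomic freckle": [22, 2, 10],
    "Superficial": [16, 54, 115],
    "Nodular": [19, 33, 73],
    "Indeterminant": [11, 17, 28],
}

type_totals = {t: sum(row) for t, row in counts.items()}
site_totals = [sum(row[i] for row in counts.values()) for i in range(len(sites))]
grand_total = sum(type_totals.values())

# Expected count for each (type, site) cell: type total * site total / grand total.
expected = {
    t: [type_totals[t] * site_totals[i] / grand_total for i in range(len(sites))]
    for t in counts
}
print(expected["Hutchinson's melanomic freckle"][0])   # the 34 * 68 / 400 cell
```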

For the first site (Head and Neck, i=0), suppose that X=\left \{X_1=5, X_2=1, X_3=5\right \} and X \sim NM(k_0 = 10, \{p_1 = 0.2, p_2 = 0.1, p_3 = 0.2\}). Then:

p_0 = 1 - \sum_{i=1}^3{p_i}=0.5
NM(X | k0,{p1,p2,p3}) = 0.00465585119998784
cov[X_1,X_3] = \frac{10 \times 0.2 \times 0.2}{0.5^2}=1.6
\mu_2=\frac{k_0 p_2}{p_0} = \frac{10\times 0.1}{0.5}=2.0
\mu_3=\frac{k_0 p_3}{p_0} = \frac{10\times 0.2}{0.5}=4.0
corr[X_2,X_3] = \left (\frac{\mu_2 \times \mu_3}{(k_0+\mu_2)(k_0+\mu_3)} \right )^{\frac{1}{2}} and therefore, corr[X_2,X_3] = \left (\frac{2 \times 4}{(10+2)(10+4)} \right )^{\frac{1}{2}} = 0.21821789023599242.
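The quantities in this worked example can be reproduced with a short script. It is a sketch under the stated parameters (k0 = 10, p = (0.2, 0.1, 0.2), x = (5, 1, 5)); the helper name nm_pmf is an illustrative choice, and the pmf is evaluated on the log scale using the formula from the summary box.

```python
from math import exp, lgamma, log, sqrt

def nm_pmf(x, k0, p):
    """Negative multinomial pmf at counts x, with p0 = 1 - sum(p)."""
    p0 = 1.0 - sum(p)
    # Work on the log scale so the Gamma terms cannot overflow.
    log_pmf = lgamma(k0 + sum(x)) - lgamma(k0) + k0 * log(p0)
    for xi, pi in zip(x, p):
        log_pmf += xi * log(pi) - lgamma(xi + 1)
    return exp(log_pmf)

k0, p, x = 10, [0.2, 0.1, 0.2], [5, 1, 5]
p0 = 1.0 - sum(p)

pmf = nm_pmf(x, k0, p)                        # about 0.0046559
mu = [k0 * pi / p0 for pi in p]               # about [4.0, 2.0, 4.0]
cov_13 = k0 * p[0] * p[2] / p0 ** 2           # about 1.6
corr_23 = sqrt(mu[1] * mu[2] / ((k0 + mu[1]) * (k0 + mu[2])))  # about 0.2182
print(pmf, mu, cov_13, corr_23)
```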

Notice that the pairwise NM correlations are always positive, whereas the correlations between multinomial counts are always negative. For fixed means \mu_i, the pairwise correlations tend to zero as the parameter k0 increases. Thus, for large k0, the negative multinomial counts Xi behave as independent Poisson random variables with means \left ( \mu_i= k_0\frac{p_i}{p_0}\right ).
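The decay of the correlation can be checked numerically with the correlation formula given earlier. The sketch below assumes the means are held fixed at the example values (2 and 4) while k0 grows; the function name nm_corr is illustrative.

```python
from math import sqrt

def nm_corr(mu_a, mu_b, k0):
    """Pairwise NM correlation written in terms of the two means and k0."""
    return sqrt(mu_a * mu_b / ((k0 + mu_a) * (k0 + mu_b)))

# Means held fixed (assumption of this sketch); only k0 varies.
for k0 in (10, 100, 1000, 10000):
    print(k0, nm_corr(2.0, 4.0, k0))
```

If the probabilities p_i were held fixed instead, the means would grow in proportion to k0 and the value of this formula would not change with k0.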

The marginal distribution of each of the Xi variables is negative binomial, as the Xi count (considered as success) is measured against all the other outcomes (failure). But jointly, the distribution of X=\{X_1,\cdots,X_m\} is negative multinomial, i.e., X \sim NM(k_0,\{p_1,\cdots,p_m\}) .
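The marginal claim can be checked numerically in a small case. The sketch below assumes m = 2 with hypothetical values k0 = 3 and p = (0.2, 0.3). It sums the joint pmf over X2 (truncated at 400 terms, beyond which the terms are negligible) and compares the result with a negative binomial pmf whose success probability is p1/(p0 + p1).

```python
from math import exp, lgamma, log

def nm_pmf(x, k0, p):
    """Negative multinomial pmf at counts x, with p0 = 1 - sum(p)."""
    p0 = 1.0 - sum(p)
    log_pmf = lgamma(k0 + sum(x)) - lgamma(k0) + k0 * log(p0)
    for xi, pi in zip(x, p):
        log_pmf += xi * log(pi) - lgamma(xi + 1)
    return exp(log_pmf)

def nb_pmf(x, k0, q):
    """Negative binomial pmf: x successes (probability q each) before the k0-th failure."""
    return exp(lgamma(k0 + x) - lgamma(k0) - lgamma(x + 1)
               + k0 * log(1.0 - q) + x * log(q))

k0, p = 3, [0.2, 0.3]                 # hypothetical parameter values
p0 = 1.0 - sum(p)
q1 = p[0] / (p0 + p[0])               # X1 counted against the "failure" outcome only

marginals = []
for x1 in range(5):
    summed = sum(nm_pmf([x1, x2], k0, p) for x2 in range(400))
    marginals.append((summed, nb_pmf(x1, k0, q1)))
    print(x1, summed, nb_pmf(x1, k0, q1))
```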

Parameter estimation

  • Estimation of the mean (expected) frequency counts (μj) of each outcome (Xj) using maximum likelihood is possible. If we have a single observation vector \{x_1, \cdots,x_m\}, then \hat{\mu}_i=x_i. If we have several observation vectors, as in this case, where we have the cancer type frequencies for 3 different sites, then the maximum likelihood estimates of the mean counts are \hat{\mu}_j=\frac{x_{j,.}}{I}, where 0\leq j \leq J is the cancer-type index and the summation is over the number of observed (sampled) vectors (I). For the cancer data above, we have the following maximum likelihood estimates of the expected frequency counts:
Hutchinson's melanomic freckle type of cancer (X0) is \hat{\mu}_0 = 34/3=11.33.
Superficial type of cancer (X1) is \hat{\mu}_1 = 185/3=61.67.
Nodular type of cancer (X2) is \hat{\mu}_2 = 125/3=41.67.
Indeterminant type of cancer (X3) is \hat{\mu}_3 = 56/3=18.67.
  • There is no MLE estimate for the NM k0 parameter.[1][2] However, there are approximate protocols for estimating the k0 parameter using the chi-squared goodness of fit statistic. In the usual chi-squared statistic:
\Chi^2 = \sum_i{\frac{(x_i-\mu_i)^2}{\mu_i}}, we can replace the expected-means (μi) by their estimates, \hat{\mu_i}, and replace denominators by the corresponding negative multinomial variances. Then we get the following test statistic for negative multinomial distributed data:
\Chi^2(k_0) = \sum_{i}{\frac{(x_i-\hat{\mu_i})^2}{\hat{\mu_i} \left (1+ \frac{\hat{\mu_i}}{k_0} \right )}}.
Next, we can estimate the k0 parameter by varying the values of k0 in the expression Χ2(k0) and matching the values of this statistic with the corresponding asymptotic chi-squared distribution. The following protocol summarizes these steps using the cancer data above.
DF: The degrees of freedom for the chi-squared distribution in this case are:
df = (# rows – 1)(# columns – 1) = (4-1)*(3-1) = 6
Median: The median of a chi-squared random variable with 6 df is approximately 5.348; the calculation below uses the target value 5.261948.
Mean Counts Estimates: The mean count estimates (\hat{\mu}_j) used below, for the Superficial, Nodular and Indeterminant cancer types, are:
\hat{\mu}_1 = 185/3=61.67; \hat{\mu}_2 = 125/3=41.67; and \hat{\mu}_3 = 56/3=18.67.
Thus, we can solve the equation above Χ2(k0) = 5.261948 for the single variable of interest -- the unknown parameter k0. In the cancer example, suppose x = {x1 = 5,x2 = 1,x3 = 5}. Then, the solution is an asymptotic chi-squared distribution driven estimate of the parameter k0.
\Chi^2(k_0) = \sum_{i=1}^3{\frac{(x_i-\hat{\mu_i})^2}{\hat{\mu_i} \left (1+ \frac{\hat{\mu_i}}{k_0} \right )}}.
\Chi^2(k_0) = \frac{(5-61.67)^2}{61.67(1+61.67/k_0)}+\frac{(1-41.67)^2}{41.67(1+41.67/k_0)}+\frac{(5-18.67)^2}{18.67(1+18.67/k_0)}=5.261948. Solving this equation for k0 provides the desired estimate for the last parameter.
Cleared of denominators, this equation is a cubic in k0 with 3 distinct roots: {-50.5466, -21.5204, 2.40461}. Since Χ2(k0) is increasing for k0 > 0, only one root is positive, and k0 = 2.40461 is the single admissible solution.
  • Estimates of probabilities: Assume k0 = 2 and \frac{\mu_i}{k_0}p_0=p_i, then:
\frac{61.67}{k_0}p_0 \approx 31p_0 = p_1
\frac{41.67}{k_0}p_0 \approx 20p_0 = p_2
\frac{18.67}{k_0}p_0 \approx 9p_0 = p_3
Hence, 1 − p0 = p1 + p2 + p3 = 60p0, and p_0=\frac{1}{61}, p_1=\frac{31}{61}, p_2=\frac{20}{61} and p_3=\frac{9}{61}.
Therefore, the best model distribution for the observed sample x = {x1 = 5,x2 = 1,x3 = 5} is X \sim NM\left (2, \left \{\frac{31}{61}, \frac{20}{61},\frac{9}{61}\right\} \right ).
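The estimation steps above can be sketched in a few lines. The script uses the rounded means (61.67, 41.67, 18.67), the sample x = (5, 1, 5) and the target value 5.261948 exactly as quoted in the text. Bisection stands in for the symbolic solver, which works because the statistic is increasing in k0 for k0 > 0. All function and variable names are illustrative.

```python
from fractions import Fraction

# Step 1: mean counts per cancer type, averaged over the 3 sites.
type_totals = {"Hutchinson's melanomic freckle": 34, "Superficial": 185,
               "Nodular": 125, "Indeterminant": 56}
n_sites = 3
mu_hat = {t: total / n_sites for t, total in type_totals.items()}

# Step 2: solve chi2_stat(k0) = target for k0 > 0 (values rounded as in the text).
x = [5, 1, 5]
mu = [61.67, 41.67, 18.67]
target = 5.261948

def chi2_stat(k0):
    return sum((xi - mi) ** 2 / (mi * (1 + mi / k0)) for xi, mi in zip(x, mu))

def solve_k0(lo=1e-9, hi=1e3, tol=1e-10):
    """Bisection on a positive bracket; chi2_stat is increasing for k0 > 0."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_stat(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

k0_hat = solve_k0()
print(k0_hat)                          # close to 2.40461

# Step 3: probabilities from the ratios r_i = mu_i / k0, using p_i = r_i * p0
# and p0 + sum(p_i) = 1. The ratios 31, 20, 9 are the rounded values from the text.
def probabilities(ratios):
    p0 = Fraction(1, 1 + sum(ratios))
    return p0, [r * p0 for r in ratios]

p0, p = probabilities([31, 20, 9])
print(p0, p)
```

Exact fractions are used in the last step only so that the result can be compared directly with 1/61, 31/61, 20/61 and 9/61.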

References

  1. Le Gall, F. The modes of a negative multinomial distribution, Statistics & Probability Letters, Volume 76, Issue 6, 15 March 2006, Pages 619-624, ISSN 0167-7152, doi:10.1016/j.spl.2005.09.009.
  2. Zelterman, Daniel (2002). Advanced log-linear models using SAS. SAS Publishing. p. 196. ISBN 9781590470800.

Further reading

Johnson, Norman L.; Kotz, Samuel; Balakrishnan, N. (1997). "Chapter 36: Negative Multinomial and Other Multinomial-Related Distributions". Discrete Multivariate Distributions. Wiley. ISBN 0-471-12844-9. 


Wikimedia Foundation. 2010.
