Qualitative variation


An index of qualitative variation (IQV) is a measure of statistical dispersion in nominal distributions. There are a variety of these, but they have been relatively little-studied in the statistics literature. The simplest is the variation ratio, while the most sophisticated is the information entropy.

Properties

There are various indices of qualitative variation; a number are summarized and devised by Wilcox (1967, 1973), who requires the following standardization properties to be satisfied:
* Variation varies between 0 and 1.
* Variation is 0 if and only if all cases belong to a single category.
* Variation is 1 if and only if cases are evenly divided across all categories. (This can only happen if the number of cases is a multiple of the number of categories.)

In particular, the range of values of these standardized indices does not depend on the number of categories or the number of samples.

For any index, the closer the distribution is to uniform, the larger the variation; the larger the differences in frequencies across categories, the smaller the variation.

Indices of qualitative variation are in this sense analogous to information entropy, which is minimized when all cases belong to a single category and maximized in a uniform distribution. Indeed, information entropy can be used as an index of qualitative variation.
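To illustrate entropy used as an index of qualitative variation, the following minimal sketch divides the Shannon entropy of the category proportions by its maximum, log K, so that the result runs from 0 to 1. The function name `normalized_entropy` and the choice of passing a list of category frequencies (including zero counts) are assumptions made for this example, not something the sources prescribe.

```python
import math


def normalized_entropy(freqs):
    """Shannon entropy of the category proportions divided by log(K).

    freqs: list of non-negative counts, one per category (zeros allowed).
    """
    k = len(freqs)
    n = sum(freqs)
    if k < 2 or n == 0:
        raise ValueError("need at least two categories and one case")
    # Empty categories contribute nothing (0 * log 0 is taken as 0).
    h = -sum((f / n) * math.log(f / n) for f in freqs if f > 0)
    return h / math.log(k)


if __name__ == "__main__":
    print(normalized_entropy([10, 0, 0]))  # all cases in one category
    print(normalized_entropy([5, 5, 5]))   # cases evenly divided
```

With all cases in one category the value is 0, and with cases evenly divided it is 1 up to floating-point rounding.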

One characterization of a particular index of qualitative variation (IQV) is as a ratio of observed differences to maximum differences.
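The text does not say which index is meant here. One common reading, assumed for this sketch, counts the pairs of cases that fall in different categories and divides by the number of such pairs when the cases are spread evenly over the K categories. The name `iqv_pairwise` is hypothetical.

```python
from itertools import combinations


def iqv_pairwise(freqs):
    """Ratio of observed to maximum between-category pair counts.

    freqs: list of non-negative counts, one per category.
    """
    k = len(freqs)
    n = sum(freqs)
    if k < 2 or n == 0:
        raise ValueError("need at least two categories and one case")
    # Observed differences: pairs of cases taken from two different categories.
    observed = sum(fi * fj for fi, fj in combinations(freqs, 2))
    # Maximum differences: the same count when every category holds n / k cases.
    maximum = (k * (k - 1) / 2) * (n / k) ** 2
    return observed / maximum


if __name__ == "__main__":
    print(iqv_pairwise([6, 3, 1]))
```

Under this reading the ratio is algebraically equal to K/(K-1) times one minus the sum of squared proportions.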

Formulas

Wilcox (1973) gives a number of formulas for various indices of qualitative variation. The first, which he designates DM for "Deviation from the Mode", is a standardized form of the variation ratio and is analogous to variance as deviation from the mean.
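As a rough sketch of what standardizing the variation ratio can look like, the code below computes the variation ratio (the proportion of cases outside the modal category) and rescales it by its largest attainable value, (K-1)/K. This rescaling is an assumption made for illustration; Wilcox's own expression for DM may be written differently. The function names are hypothetical.

```python
def variation_ratio(freqs):
    """Proportion of cases that do not fall in the modal category."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("need at least one case")
    return 1 - max(freqs) / n


def deviation_from_mode(freqs):
    """Variation ratio rescaled to run from 0 to 1 (assumed standardization)."""
    k = len(freqs)
    if k < 2:
        raise ValueError("need at least two categories")
    # The variation ratio is at most (k - 1) / k, reached when the
    # cases are evenly divided, so multiplying by k / (k - 1) maps it to [0, 1].
    return variation_ratio(freqs) * k / (k - 1)


if __name__ == "__main__":
    print(variation_ratio([6, 3, 1]))
    print(deviation_from_mode([6, 3, 1]))
```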

One formula for IQV (see IQV at xycoon, http://www.xycoon.com/qualitative_variation.htm), given as M2 in Gibbs & Poston (1975, p. 472), is:

$$\text{IQV} := \frac{K}{K-1}\left(1-\sum_{i=1}^K p_i^2\right)$$

where $K$ is the number of categories, and $p_i = f_i/N$ is the proportion of observations that fall in a given category $i$. The factor of $\frac{K}{K-1}$ is for standardization.
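The formula translates directly into code. The sketch below takes a list of category frequencies f_i, converts them to proportions, and applies the K/(K-1) standardization; the function name `iqv` and the example counts are illustrative only.

```python
def iqv(freqs):
    """Standardized index M2: K/(K-1) * (1 - sum of squared proportions).

    freqs: list of non-negative counts f_i, one per category.
    """
    k = len(freqs)
    n = sum(freqs)
    if k < 2 or n == 0:
        raise ValueError("need at least two categories and one case")
    sum_sq = sum((f / n) ** 2 for f in freqs)  # sum of p_i squared
    return k / (k - 1) * (1 - sum_sq)


if __name__ == "__main__":
    print(iqv([10, 0, 0]))  # single category
    print(iqv([6, 3, 1]))   # uneven split
    print(iqv([5, 5, 5]))   # even split
```

The three calls cover the standardization properties: a single occupied category gives 0, an even split gives 1 up to rounding, and an uneven split falls in between.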

The unstandardized index, $1-\sum_{i=1}^K p_i^2$, denoted as M1 (Gibbs & Poston 1975, p. 471), can be interpreted as the likelihood that a random pair of samples will belong to different categories (Lieberson 1969, p. 851), so this formula for IQV is a standardized likelihood of a random pair falling in different categories. M1 and M2 can be interpreted in terms of the variance of a multinomial distribution (Swanson 1976, there called an "expanded binomial model").
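The pair interpretation can be checked with a short derivation. It assumes the two cases are drawn independently (with replacement), each falling in category i with probability p_i; the numbers in the second part are a hypothetical example with K = 3.

```latex
% Two cases drawn independently; p_i is the proportion in category i.
\begin{align*}
P(\text{same category}) &= \sum_{i=1}^{K} p_i^2, \\
P(\text{different categories}) &= 1 - \sum_{i=1}^{K} p_i^2 = \sum_{i \neq j} p_i p_j .
\end{align*}
% Hypothetical example with K = 3 and p = (0.6, 0.3, 0.1):
\begin{align*}
\sum_{i} p_i^2 &= 0.36 + 0.09 + 0.01 = 0.46, \\
M_1 &= 1 - 0.46 = 0.54, \\
M_2 &= \tfrac{3}{2} \times 0.54 = 0.81 .
\end{align*}
```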

Evaluation of indices

Different indices give different values of variation and may be used for different purposes; several are used and critiqued, especially in the sociology literature.

If one wishes to simply make ordinal comparisons between samples (is one sample more or less varied than another), the choice of IQV is relatively less important, as they will often give the same ordering.
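A small sketch of such an ordinal comparison follows. It ranks three hypothetical samples under three indices: a standardized variation ratio (rescaled by K/(K-1), an assumed standardization), the sum-of-squares index, and entropy divided by log K. The sample names, counts, and function names are all invented for illustration, and agreement on these samples does not imply agreement in general.

```python
import math


def std_variation_ratio(freqs):
    """Variation ratio rescaled by K/(K-1) (assumed standardization)."""
    k, n = len(freqs), sum(freqs)
    return (1 - max(freqs) / n) * k / (k - 1)


def iqv(freqs):
    """K/(K-1) * (1 - sum of squared proportions)."""
    k, n = len(freqs), sum(freqs)
    return k / (k - 1) * (1 - sum((f / n) ** 2 for f in freqs))


def normalized_entropy(freqs):
    """Shannon entropy of the proportions divided by log(K)."""
    k, n = len(freqs), sum(freqs)
    h = -sum((f / n) * math.log(f / n) for f in freqs if f > 0)
    return h / math.log(k)


# Hypothetical samples: category counts over the same three categories.
SAMPLES = {"A": [9, 1, 0], "B": [6, 3, 1], "C": [4, 3, 3]}


def ranking(index):
    """Sample names ordered from least to most varied under the given index."""
    return sorted(SAMPLES, key=lambda name: index(SAMPLES[name]))


if __name__ == "__main__":
    for index in (std_variation_ratio, iqv, normalized_entropy):
        print(index.__name__, ranking(index))
```

For these three samples all indices order A, B, C from least to most varied, even though the numerical values differ from index to index.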

In some cases it is useful not to standardize an index to run from 0 to 1 regardless of the number of categories or samples (Wilcox 1973, p. 338), but one generally does standardize it.

References

* Gibbs, Jack P.; Poston, Jr., Dudley L. (March 1975), "The Division of Labor: Conceptualization and Related Measures", Social Forces, 53 (3): 468–476, doi:10.2307/2576589, JSTOR stable URL 0037-7732(197503)53%3A3%3C468%3ATDOLCA%3E2.0.CO%3B2-T

* Lieberson, Stanley (December 1969), "Measuring Population Diversity", American Sociological Review, 34 (6): 850–862, doi:10.2307/2095977, JSTOR stable URL 0003-1224(196912)34%3A6%3C850%3AMPD%3E2.0.CO%3B2-O

* Swanson, David A. (September 1976), "A Sampling Distribution and Significance Test for Differences in Qualitative Variation", Social Forces, 55 (1): 182–184, doi:10.2307/2577102, JSTOR stable URL 0037-7732%28197609%2955%3A1%3C182%3AASDAST%3E2.0.CO%3B2-U

* Wilcox, Allen R. (1967), Indices of qualitative variation, http://www.ornl.gov/info/reports/1967/3445605133753.pdf

* Wilcox, Allen R. (June 1973), "Indices of Qualitative Variation and Political Measurement", The Western Political Quarterly, 26 (2): 325–343, doi:10.2307/446831, JSTOR stable URL 0043-4078(197306)26%3A2%3C325%3AIOQVAP%3E2.0.CO%3B2-Z

See also

*statistical dispersion

Other measures of dispersion for nominal distributions

*Information entropy
*Variation ratio


Wikimedia Foundation. 2010.
