Siegel-Tukey test



In statistics, the Siegel-Tukey test is a non-parametric test that applies to data measured on at least an ordinal scale. It tests for differences in scale (dispersion) between two groups. It is named after Sidney Siegel and John Tukey.

It is used to determine whether one of two groups tends to have more extreme values, at both the bottom and the top of the scale, that is, in the tails of the distribution. In other words, the test determines whether one of the two groups tends to move away from moderate positions, sometimes to the right and sometimes to the left, but always away from the center of the ordinal scale.

The test was published in 1960 by Sidney Siegel and John Wilder Tukey in the "Journal of the American Statistical Association", in the article "A nonparametric sum of ranks procedure for relative spread in unpaired samples."

Principle

The principle is based on the following idea:

Suppose there are two groups A and B, with "n" observations in the first group and "m" observations in the second (so there are N = n + m observations in total). If all N observations are arranged in ascending order and there is no difference between the two groups (null hypothesis H0), the values of the two groups can be expected to be mixed randomly. In particular, the ranks given to the extreme (high and low) values would then be shared in similar proportions by Group A and Group B.

If Group A is more inclined to extreme values (alternative hypothesis H1), a high proportion of the observations from A will lie towards the low or high values, and a reduced proportion at the center of the pooled distribution.

* Hypothesis-0: H0 : σ²A = σ²B and MeA = MeB (where σ² and Me denote the variance and the median)
* Hypothesis-1: H1 : σ²A > σ²B

Method

We have two groups A and B, with the following observations (already sorted in ascending order):

A: 33 62 84 85 88 93 97
B: 4 16 48 51 66 98

By combining the groups, a group of 13 entries is obtained:

Group : B  B  A  B  B  A  B  A  A  A  A  A  B   (source of value)
Value : 4  16 33 48 51 62 66 84 85 88 93 97 98  (sorted)
Rank  : 1  4  5  8  9  12 13 11 10 7  6  3  2   (alternate extremes)

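Here the ranks are assigned by alternating between the extremes: rank 1 goes to the lowest value, ranks 2 and 3 to the two highest values, ranks 4 and 5 to the next two lowest values, ranks 6 and 7 to the next two highest values, and so on, until the middle of the ordered sample receives the highest rank.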

The sum W of the ranks within each group:

WA = 5 + 12 + 11 + 10 + 7 + 6 + 3 = 54
WB = 1 + 4 + 8 + 9 + 13 + 2 = 37
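The ranking and the rank sums can be reproduced with a short Python sketch. The function name `siegel_tukey_ranks` and the variable names are illustrative and not part of the original article, and the sketch assumes there are no tied values (as in this example).

```python
def siegel_tukey_ranks(n_total):
    """Siegel-Tukey ranks for positions 0..n_total-1 of the sorted pooled sample.

    Rank 1 goes to the lowest value, ranks 2-3 to the two highest,
    ranks 4-5 to the next two lowest, and so on. Ties are not handled.
    """
    ranks = [0] * n_total
    lo, hi = 0, n_total - 1
    rank = 1
    take_low = True  # the first rank goes to the low end
    count = 1        # the first step assigns one rank, later steps assign two
    while lo <= hi:
        for _ in range(count):
            if lo > hi:
                break
            if take_low:
                ranks[lo] = rank
                lo += 1
            else:
                ranks[hi] = rank
                hi -= 1
            rank += 1
        take_low = not take_low
        count = 2
    return ranks


# Example data from this section
a = [33, 62, 84, 85, 88, 93, 97]
b = [4, 16, 48, 51, 66, 98]

# Pool the observations, remembering which group each one came from
pooled = sorted([(v, "A") for v in a] + [(v, "B") for v in b])
ranks = siegel_tukey_ranks(len(pooled))

# Sum the ranks within each group
w = {"A": 0, "B": 0}
for (value, group), r in zip(pooled, ranks):
    w[group] += r

print(ranks)  # [1, 4, 5, 8, 9, 12, 13, 11, 10, 7, 6, 3, 2]
print(w)      # {'A': 54, 'B': 37}
```

As a quick check, the two rank sums add up to N(N + 1)/2 = 13 · 14 / 2 = 91.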

If the null hypothesis is true, the rank sums of the two groups are expected to be roughly the same once the sizes of the two groups are taken into account.
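To make "taking into account the size of the two groups" concrete: the average of the ranks 1..N is (N + 1)/2, so under the null hypothesis a group with k observations is expected to collect a rank sum of k(N + 1)/2. The sketch below applies this to the example; the helper name `expected_rank_sum` is an illustrative assumption.

```python
def expected_rank_sum(group_size, n_total):
    # The mean rank is (N + 1) / 2, so a group of size k expects k * (N + 1) / 2.
    return group_size * (n_total + 1) / 2


n_a, n_b = 7, 6      # group sizes from the example
n_total = n_a + n_b  # N = 13

exp_a = expected_rank_sum(n_a, n_total)
exp_b = expected_rank_sum(n_b, n_total)

print(exp_a, exp_b)  # 49.0 42.0 (observed rank sums: 54 and 37)
```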

If one of the two groups is more dispersed, its rank sum should be lower, because it receives more of the low ranks reserved for the extreme tails, while the other group receives more of the high ranks assigned to the center (compare the analogous Wilcoxon-Mann-Whitney test).

After summing the ranks, you can compute the U statistics using these formulas:

$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - W_1$$
$$U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - W_2$$

where n_1 and n_2 are the sizes of the two groups and W_1 and W_2 are the corresponding rank sums.
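The two formulas can be sketched as a small helper. The function name `u_statistics` is an illustrative assumption; the inputs are the rank sums (54 and 37) and group sizes (7 and 6) of the example in this section, inserted literally into the formulas.

```python
def u_statistics(w1, w2, n1, n2):
    """U statistics from the rank sums w1, w2 of groups with sizes n1, n2."""
    u1 = n1 * n2 + n1 * (n1 + 1) // 2 - w1
    u2 = n1 * n2 + n2 * (n2 + 1) // 2 - w2
    return u1, u2


u1, u2 = u_statistics(w1=54, w2=37, n1=7, n2=6)

# Consistency check: the two statistics must add up to n1 * n2
assert u1 + u2 == 7 * 6

print(u1, u2)  # 16 26
```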

Using the example's data, we have:

$$U_1 = 7 \cdot 6 + \frac{7(7+1)}{2} - 54 = 16$$
$$U_2 = 7 \cdot 6 + \frac{6(6+1)}{2} - 37 = 26$$

Note that:

$$U_1 + U_2 = n_1 n_2$$
$$16 + 26 = 6 \cdot 7$$

You can use this identity to check that you did not make a mistake.

Then take the smaller of U1 and U2 and calculate

$$\Pr\left[X \le \min(U_1, U_2)\right] = p\text{-value}$$

where

$$X \sim \mathrm{Wilcox}(m, n)$$

Using the example's data again:

$$\Pr\left[X \le 16\right] = \frac{458}{1716} \approx 0.2669$$

This is a one-sided probability; doubling it gives a two-sided p-value of about 0.534.
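The probability Pr[X ≤ q] for X ~ Wilcox(m, n) can be evaluated exactly for small samples by counting arrangements. The sketch below uses the standard counting recurrence for the Mann-Whitney U statistic and assumes no ties; the function names `count_u` and `wilcox_cdf` are illustrative. It is evaluated at q = 16, the smaller of the two U values obtained by inserting the example's rank sums (54 and 37) into the U formulas, with m = 7 and n = 6.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_u(m, n, u):
    """Number of orderings of m and n observations with Mann-Whitney U == u."""
    if u < 0:
        return 0
    if m == 0 or n == 0:
        return 1 if u == 0 else 0
    # Either the largest observation belongs to the first group (U drops by n),
    # or it belongs to the second group (U is unchanged).
    return count_u(m - 1, n, u - n) + count_u(m, n - 1, u)


def wilcox_cdf(q, m, n):
    """Pr[X <= q] for the null distribution of U with group sizes m and n."""
    total = comb(m + n, m)  # all equally likely orderings under H0
    return sum(count_u(m, n, u) for u in range(q + 1)) / total


p_one_sided = wilcox_cdf(16, 7, 6)
print(round(p_one_sided, 4))      # 0.2669
print(round(2 * p_one_sided, 4))  # 0.5338 (two-sided)
```

For larger samples, or when ties are present, a normal approximation or a library routine would usually be preferred over this exact enumeration.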

To decide whether or not to reject the null hypothesis, see P-value.

Remarks

The Siegel-Tukey test has relatively low power. For example, for normally distributed (Gaussian) data, its asymptotic efficiency relative to the F-test is about 61%.

Moreover, if the assumption of equal medians is not met, the test may return a "significant" result for that reason alone; in that case, the rank-like test of Moses should be used instead, if possible.

See also

* Sidney Siegel, John Wilder Tukey
* Non-parametric statistics
* Statistical hypothesis testing


Wikimedia Foundation. 2010.
