Decision stump


An example of a decision stump that discriminates between two of the three classes in the Iris flower data set: Iris versicolor and Iris virginica. This particular stump achieves 94% accuracy on the Iris data set for these two classes.

A decision stump is a machine learning model consisting of a one-level decision tree.[1] That is, it is a decision tree with one internal node (the root) that is directly connected to the terminal nodes (its leaves). A decision stump makes a prediction based on the value of just a single input feature. Decision stumps are sometimes also called 1-rules.[2]

Depending on the type of the input feature, several variations are possible. For nominal features, one may build a stump that contains a leaf for each possible feature value[3][4] or a stump with two leaves, one of which corresponds to some chosen category and the other to all remaining categories.[5] For binary features these two schemes are identical. A missing value may be treated as yet another category.[5]
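
The two schemes for nominal features can be sketched in a few lines of Python. This is a minimal illustration, not the Weka implementation: the function names (`fit_multiway_stump`, `fit_one_vs_rest_stump`, and so on) and the toy data are hypothetical, each leaf is assumed to predict the majority label of the training examples that reach it, and a missing value is represented as `None` and handled as one more category.

```python
from collections import Counter, defaultdict

def majority(labels, fallback=None):
    """Most frequent label, or `fallback` when `labels` is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else fallback

def fit_multiway_stump(values, labels):
    """Scheme 1: one leaf per observed category of a single nominal feature."""
    by_value = defaultdict(list)
    for v, y in zip(values, labels):
        by_value[v].append(y)          # None (missing) is just another key
    leaves = {v: majority(ys) for v, ys in by_value.items()}
    default = majority(labels)         # assumption: used for unseen categories
    return leaves, default

def predict_multiway(stump, value):
    leaves, default = stump
    return leaves.get(value, default)

def fit_one_vs_rest_stump(values, labels, category):
    """Scheme 2: one leaf for `category`, one leaf for all other values."""
    overall = majority(labels)
    inside = [y for v, y in zip(values, labels) if v == category]
    outside = [y for v, y in zip(values, labels) if v != category]
    return category, majority(inside, overall), majority(outside, overall)

def predict_one_vs_rest(stump, value):
    category, inside_label, outside_label = stump
    return inside_label if value == category else outside_label

# Toy data (invented for illustration); None marks a missing value.
colors = ["red", "red", "blue", "blue", None, "green", "red"]
labels = ["yes", "yes", "no", "no", "no", "yes", "no"]

multiway = fit_multiway_stump(colors, labels)
one_vs_rest = fit_one_vs_rest_stump(colors, labels, "red")
print(predict_multiway(multiway, None), predict_one_vs_rest(one_vs_rest, "green"))
```

For a binary feature both functions produce the same partition of the data, which is why the two schemes coincide in that case.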

For continuous features, a threshold value is usually selected, and the stump contains two leaves: one for values below the threshold and one for values above it. Less commonly, multiple thresholds are chosen, and the stump then contains three or more leaves.
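
As a sketch of the single-threshold case, the following Python code searches for the threshold that minimizes the number of training errors on one continuous feature. The selection criterion (training error), the choice of midpoints between consecutive distinct values as candidate thresholds, and the function names are assumptions made for illustration; other criteria, such as information gain, are equally possible.

```python
from collections import Counter

def majority(labels):
    """Return the most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_threshold_stump(values, labels):
    """Choose a threshold on one continuous feature by minimizing training errors.

    Returns (threshold, left_label, right_label, errors). Values <= threshold
    fall into the left leaf; larger values fall into the right leaf.
    """
    distinct = sorted(set(values))
    if len(distinct) < 2:
        raise ValueError("need at least two distinct feature values")
    best = None
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        left_label, right_label = majority(left), majority(right)
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[3]:
            best = (t, left_label, right_label, errors)
    return best

def predict_threshold_stump(stump, value):
    threshold, left_label, right_label, _ = stump
    return left_label if value <= threshold else right_label

# Toy data (invented for illustration).
stump = fit_threshold_stump([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"])
print(stump, predict_threshold_stump(stump, 3.7))
```

A stump with several thresholds would extend the same idea by searching over tuples of cut points instead of a single one.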

Decision stumps are often[6] used as components (called "weak learners" or "base learners") in machine learning ensemble techniques such as bagging and boosting. For example, the Viola–Jones face detection algorithm employs AdaBoost with decision stumps as weak learners.[7]
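
To illustrate how stumps serve as weak learners, here is a minimal sketch of discrete AdaBoost over threshold stumps on a single feature with labels +1 and -1. It is a simplified textbook-style version, not the Viola–Jones detector; the function names, the clipping constant `1e-10`, and the toy data are assumptions made for this example.

```python
import math

def fit_weighted_stump(xs, ys, weights):
    """Find the threshold and polarity with the lowest weighted error.

    The stump predicts `polarity` if x > threshold, otherwise -polarity.
    Returns (threshold, polarity, weighted_error).
    """
    distinct = sorted(set(xs))
    # Midpoints between distinct values, plus one threshold below all data
    # so that a constant prediction is also a candidate.
    thresholds = [distinct[0] - 1.0]
    thresholds += [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = None
    for t in thresholds:
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if (polarity if x > t else -polarity) != y)
            if best is None or err < best[2]:
                best = (t, polarity, err)
    return best

def stump_predict(stump, x):
    t, polarity, _ = stump
    return polarity if x > t else -polarity

def adaboost(xs, ys, rounds):
    """Train `rounds` stumps, reweighting the examples after each one."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_weighted_stump(xs, ys, weights)
        err = min(max(stump[2], 1e-10), 1 - 1e-10)   # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)      # vote of this stump
        ensemble.append((alpha, stump))
        # Misclassified examples gain weight; correct ones lose weight.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def ensemble_predict(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data (invented): the -1 examples sit between two groups of +1 examples,
# so no single threshold separates the classes.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost(xs, ys, rounds=3)
print([ensemble_predict(model, x) for x in xs])
```

On this toy data a single stump necessarily misclassifies some examples, while the weighted vote of a few stumps can fit all six, which is the sense in which boosting combines weak learners into a stronger one.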

The term "decision stump" was coined in a 1992 ICML paper by Wayne Iba and Pat Langley.[1][8]

References

  1. Wayne Iba and Pat Langley (1992). "Induction of One-Level Decision Trees". Proceedings of the Ninth International Conference on Machine Learning.
  2. Robert C. Holte (1993). "Very Simple Classification Rules Perform Well on Most Commonly Used Datasets". http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.67.2711&rep=rep1&type=pdf.
  3. Edward L. Loper, Steven Bird, and Ewan Klein (2009). Natural Language Processing with Python. Sebastopol, CA: O'Reilly. ISBN 0-596-51649-5. http://nltk.googlecode.com/svn/trunk/doc/book/ch06.html.
  4. This classifier is implemented in Weka under the name OneR (for "1-rule").
  5. This scheme is implemented in Weka's DecisionStump classifier.
  6. Lev Reyzin and Robert E. Schapire (2006). "How Boosting the Margin Can Also Boost Classifier Complexity". ICML 2006. Page 7.
  7. Paul Viola and Michael J. Jones (2004). "Robust Real-Time Face Detection". International Journal of Computer Vision.
  8. Jonathan Oliver and David Hand (1994). "Averaging Over Decision Stumps". ECML 1994. doi:10.1007/3-540-57868-4_61
    Quote: "These simple rules are in effect severely pruned decision trees and have been termed decision stumps [cites Iba and Langley]".

Wikimedia Foundation. 2010.
