Treatment learning

Treatment learning

Treatment learning is a process by which an ordered classified data set can be evaluated as part of a data mining session to produce a representative data model. The data model should describe some key property of the data set.

The output of a treatment learning session is a treatment, a conjunction of attribute-value pairs. The size of the treatment is the number of pairs that compose the treatment.

From [1] : Three concepts can be used to define treatment learning: lift, minimum best support, and treatment effect size.

A decision's lift is the change that some chosen decision makes to a subset once that decision has been imposed upon the subset.

Lift is measured by the weighted sum of all outcomes with a particular treatment over the weighted sum of all total outcomes:

:{mathrm{weighted sum(outcomes of data treatment)} over mathrm{weighted sum(all outcomes).

To understand the concept of weighted sums, one must understand that all outcomes are given a weight. More weight is given to the outcomes that we would prefer to happen while less weight is given to outcomes that we do not want.

For example, say in a particular data set there are three different types of outcomes. We might score the outcomes as follows:

:best outcome = 3:average outcome = 2:worst outcome = 1

Also, say our data set includes 6 outcomes, two of each ranking. We then can come up with a weighted sum of all outcomes:

:1left(frac{2}{6} ight) + 2 left(frac{2}{6} ight) + 3 left(frac{2}{6} ight) = 2

Now, say that 3 out of 6 of the outcomes share a common attribute. Of the 6 outcomes, 2 best outcomes and 1 average outcome share a common attribute. Then this attribute will be our treatment and the weighted sum of outcomes with the treatment is this:

:1 left(frac{0}{3} ight) + 2 left(frac{1}{3} ight) + 3 left(frac{2}{3} ight) = 2.67

Then our computed lift would be:

:frac{2.67}{2} = 1.335

A treatment learner may apply some weight to the data representing how often golf was played during the weekend - in our example, we state that a higher weight will be achieved when more golf is played.

The treatment learner may notice that under certain weather conditions, it may be possible to increase the baseline score (for example: it might be noticed that the individual always plays lots of golf on overcast weekends). If a decision is applied such that the baseline score increases, this is said to have the property of "higher" or "increased" lift.

----

While one can usually find better lift values by adding more attributes to a data treatment, eventually the example set becomes too specific to the example data pool given.

All treatment learners keep a minimum best support value in order to constrain the number of attributes used in a data treatment. This is used to avoid overfitting a model, which means to make too specific of an interpretation on a given data pool.

The best support value is obtained by taking the number of best outcomes in the data pool after the data treatment and dividing them by the total number of best outcomes in the data pool without the treatment:

:mathrm{best support} = frac{mathrm{number of best outcomes after treatment{mathrm{number of best outcomes before treatment.

Using our example from before, we had 2 best outcomes before the data treatment. After the data treatment we still had 2 of the best outcomes:

:mathrm{best support} = frac{2}{2} = 1.

This is the maximum number possible for best support. Treatment learners will keep a fixed number for its minimum best support. If the best support for a data treatment falls below this number it is thrown out. This prevents the data treatment from becoming too specific and overfitting the model.

----

A side effect of minimum best support is the small treatment effect. Treatment effect describes the number of attributes that compose a particular treatment. Because of the minimum best support test, the number of attributes used in a data treatment usually remain small. Adding attributes to a data treatment will often cause the data treatment to fall below the minimum best support threshold.

ee also

*Treatment learner
*Data mining

External sources

* [1] T. Menzies, Y. Hu, [http://menzies.us/pdf/03tar2.pdf Data Mining for Very Busy People] . IEEE Computer, October 2003, pgs. 18-25.


Wikimedia Foundation. 2010.

Игры ⚽ Нужен реферат?

Look at other dictionaries:

  • Treatment learner — In data mining, a treatment learner is a program used to find rules that change the expected class distribution (compared to some baseline). A classifier is a treatment learned used for recognition tasks, such as identifying defective items in an …   Wikipedia

  • learning disability — learning difficulty delayed or incomplete intellectual development combined with some form of social malfunction, such as educational or occupational failure or inability of individuals to look after themselves. Good education alters the course… …   The new mediacal dictionary

  • Learning disability — In the United States and Canada, the term learning disability (LD) refers to a group of disorders that affect a broad range of academic and functional skills including the ability to speak, listen, read, write, spell, reason and organize… …   Wikipedia

  • Learning disabilities in special education — Some students with learning disabilities are eligible for special education services in the United States. Evaluation for Learning Disabilities Purposes of evaluation: # Identify children who have learning problems and may need special education… …   Wikipedia

  • Learning problems in childhood cancer — According to the National Cancer Institute, the incidence of childhood cancers has increased over the past 20 years. New medical treatments and technologies are improving the survival rate of childhood cancers however, late effects of cancer and… …   Wikipedia

  • learning theory — ▪ psychology Introduction       any of the proposals put forth to explain changes in behaviour produced by practice, as opposed to other factors, e.g., physiological development.       A common goal in defining any psychological (psychology)… …   Universalium

  • Treatment of Tourette syndrome — Tourette syndrome (also Tourette s syndrome or TS) is an inherited neurological disorder with onset in childhood, characterized by the presence of motor and phonic tics. Treatment of Tourette syndrome has the goal of managing symptoms to achieve… …   Wikipedia

  • Overfitting (machine learning) — For the statistical concept see OverfittingThe concept of overfitting is important in machine learning. Usually a learning algorithm is trained using some set of training examples, i.e. exemplary situations for which the desired output is known.… …   Wikipedia

  • Dynamic treatment regime — In medical research, a dynamic treatment regime (DTR) or adaptive treatment strategy is a set of rules for choosing effective treatments for individual patients. The treatment choices made for a particular patient are based on that individual s… …   Wikipedia

  • Alternative therapies for developmental and learning disabilities — include a range of practices used in the treatment of dyslexia, ADHD, Asperger syndrome, autism, Down syndrome and other developmental and learning disabilities. Treatments include changes in diet, dietary supplements, biofeedback, chelation… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”