Rough set


Rough set

A rough set originated by prof. Zdzisław I. Pawlak is a formal approximation of a crisp set (i.e., "conventional set") in terms of a pair of sets which give the "lower" and the "upper" approximation of the original set. The lower and upper approximation sets themselves are crisp sets in the standard version of rough set theory (Pawlak 1991), but in other variations, the approximating sets may be fuzzy sets as well.

Definitions

This section contains an explanation of the basic framework of rough set theory (proposed originally by Zdzisław I. Pawlak), along with some of the key definitions. A review of this basic material can be found in sources such as Pawlak (1991), Ziarko (1998), Ziarko & Shan (1995), and many others.

Information system framework

Let I = (mathbb{U},mathbb{A}) be an information system (attribute-value system), where mathbb{U} is a non-empty set of finite objects (the universe) and mathbb{A} is a non-empty finite set of attributes such that a:mathbb{U} ightarrow V_a for every a in mathbb{A}. V_a is the set of values that attribute a may take. In words, the information table simply assigns a value in V_a to each attribute a of each object in universe mathbb{U}.

With any P subseteq mathbb{A} there is an associated equivalence relation mathrm{IND}(P):

: mathrm{IND}(P) = left{(x,y) in mathbb{U}^2 mid forall a in P, a(x)=a(y) ight}

The partition of mathbb{U} generated by mathrm{IND}(P) isdenoted mathbb{U}/mathrm{IND}(P) (or mathbb{U}/P) and can becalculated as follows:

: mathbb{U}/mathrm{IND}(P) = otimes left{mathbb{U}/mathrm{IND}({a}) mid ain P ight},

where

: A otimes B = left{Xcap Y mid forall X in A, forall Y in B, X cap Y eq emptyset ight}

If (x,y)in mathrm{IND}(P), then x and y are indiscernible by attributes from P. In words, for any selected subset of attributes P, there will be sets of objects that are indiscernible based on those attributes. These indistinguishable sets of objects therefore define an equivalence or indiscernibility relation, referred to as the P-indiscernibility relation.

Example: equivalence class structure

For example, consider the following information table:

:

To read this decision matrix, look, for example, at the intersection of row O_{3} and column O_{6}, showing P_1^2,P_3^0 in the cell. This means that "with regard to" decision value P_{4}=1, object O_{3} differs from object O_{6} on attributes P_1 and P_3, and the particular values on these attributes for the positive object O_{3} are P_1=2 and P_3=0. This tells us that the correct classification of O_{3} as belonging to decision class P_{4}=1 rests on attributes P_1 and P_3; although one or the other might be dispensable, we know that at least one of these attributes is indispensable.

Writing the rules

Next, from each decision matrix we form a set of Boolean expressions, one expression for each row of the matrix. The items within each cell are aggregated disjunctively, and the individuals cells are then aggregated conjunctively. Thus, for the above table we have the following five Boolean expressions:

:egin{cases}(P_1^1 or P_2^2 or P_3^0) and (P_1^1 or P_2^2) and (P_1^1 or P_2^2 or P_3^0) and (P_1^1 or P_2^2 or P_3^0) and (P_1^1 or P_2^2) \ (P_1^1 or P_2^2 or P_3^0) and (P_1^1 or P_2^2) and (P_1^1 or P_2^2 or P_3^0) and (P_1^1 or P_2^2 or P_3^0) and (P_1^1 or P_2^2) \(P_1^2 or P_3^0) and (P_2^0) and (P_1^2 or P_3^0) and (P_1^2 or P_2^0 or P_3^0) and (P_2^0) \(P_1^2 or P_3^0) and (P_2^0) and (P_1^2 or P_3^0) and (P_1^2 or P_2^0 or P_3^0) and (P_2^0) \(P_1^2 or P_3^0) and (P_2^0) and (P_1^2 or P_3^0) and (P_1^2 or P_2^0 or P_3^0) and (P_2^0)end{cases}

Each statement here is essentially a highly specific (probably "too" specific) rule governing the membership in class P_{4}=1 of the corresponding object. For example, the last statement, corresponding to object O_{10}, states that all the following must be satisfied:
# Either P_1 must have value 2, or P_3 must have value 0, or both.
# P_2 must have value 0.
# Either P_1 must have value 2, or P_3 must have value 0, or both.
# Either P_1 must have value 2, or P_2 must have value 0, or P_3 must have value 0, or any combination thereof.
# P_2 must have value 0.

It is clear that there is a large amount of redundancy here, and the next step is to simplify using traditional Boolean algebra. The statement (P_1^1 or P_2^2 or P_3^0) and (P_1^1 or P_2^2) and (P_1^1 or P_2^2 or P_3^0) and (P_1^1 or P_2^2 or P_3^0) and (P_1^1 or P_2^2) corresponding to objects {O_{1},O_{2}} simplifies to P_1^1 or P_2^2, which yields the implication

:(P_1=1) or (P_2=2) o (P_{4}=1)

Likewise, the statement (P_1^2 or P_3^0) and (P_2^0) and (P_1^2 or P_3^0) and (P_1^2 or P_2^0 or P_3^0) and (P_2^0) corresponding to objects {O_{3},O_{7},O_{10}} simplifies to P_1^2 P_2^0 or P_3^0 P_2^0. This gives us the implication

:(P_1=2 and P_2=0) or (P_3=0 and P_2=0) o (P_{4}=1)

The above implications can also be written as the following rule set:

:egin{cases}(P_1=1) o (P_{4}=1) \(P_2=2) o (P_{4}=1) \(P_1=2) and (P_2=0) o (P_{4}=1) \(P_3=0) and (P_2=0) o (P_{4}=1) end{cases}

It can be noted that each of the first two rules has a "support" of 2 (i.e., the antecedent matches two objects), while each of the last two rules has a support of 3. To finish writing the rule set for this knowledge system, the same procedure as above (starting with writing a new decision matrix) should be followed for the case of P_{4}=2, thus yielding a new set of implications for that decision value (i.e., a set of implications with P_{4}=2 as the consequent). In general, the procedure will be repeated for each possible value of the decision variable. Other rule-learning methods can be found in Pawlak (1991), Stefanowski (1998), Bazan et al. (2004), etc.

Applications

Rough sets can be used as a theoretical basis for some problems in machine learning. They have found to be particularly useful for rule induction and feature selection (semantics-preserving dimensionality reduction). They have been used for missing data estimation as well as understanding HIV. They have also inspired some logical research.Fact|date=February 2007

Extensions

Dominance-based Rough Set Approach (DRSA) is an extension of rough set theory for Multi Criteria Decision Analysis (MCDA), introduced by Greco, Matarazzo and Słowiński (2001). The main change comparing to the classical rough sets is the substitution of the indiscernibility relation by a dominance relation, which permits to deal with inconsistencies typical to consideration of "criteria" and "preference-ordered decision classes".

History

The idea of rough set was proposed by Pawlak (1991) as a new mathematical tool to deal with vague concepts. Comer, Grzymala-Busse, Iwinski, Nieminen, Novotny, Pawlak, Obtulowicz, and Pomykala have studied algebraic properties of rough sets. Different algebraic semantics have been developed by P. Pagliani, I. Duntsch, M. K. Chakraborty, M. Banerjee and A. Mani. These have been extended to more generalized rough sets by D. Cattaneo and A. Mani in particular. Rough sets can be used to represent ambiguity, vagueness and general uncertainty. Fuzzy-rough sets further extend the rough set concept through the use of fuzzy equivalence classes.

ee also

* Algebraic semantics
* Alternative set theory
* Description logic
* Fuzzy logic
* Fuzzy set theory
* Generalized rough set theory
* Granular computing
* Rough-fuzzy hybridization
* Semantics of rough set theory
* Soft computing
* Variable precision rough set
* Version space
* Dominance-based Rough Set Approach

References

*cite journal
last = Bazan
first = Jan
coauthors = Nguyen, Hung Son and Szczuka, Marcin
title = A view on rough set concept approximations
journal = Fundamenta Informaticae
volume = 59
pages = 107–118
date = 2004

*cite journal
last = Chanas
first = Stefan
coauthors = Kuchta, Dorota
title = Further remarks on the relation between rough and fuzzy sets
journal = Fuzzy Sets and Systems
volume = 47
issue = 3
pages = 391–394
date = 1992
doi = 10.1016/0165-0114(92)90305-N

*cite journal
last = Comer
first = Stephen D.
title = An algebraic approach to the approximation of information
journal = Fundamenta Informaticae
volume = 14
issue = 4
pages = 495–502
date = 1991

*cite journal
last = Dubois
first = D.
coauthor = Prade, H.
title = Rough fuzzy sets and fuzzy rough sets
journal = International Journal of General Systems
volume = 17
pages = 191–209
date = 1990
doi = 10.1080/03081079008935107

*cite journal
last = Greco
first = Salvatore
coauthors = Matarazzo, Benedetto and Słowiński, Roman
title = Rough sets theory for multicriteria decision analysis
journal = European Journal of Operational Research
volume = 129
issue = 1
date = 2001
pages = 1–47
doi = 10.1016/S0377-2217(00)00167-3

*cite journal
last = Jensen
first = Richard
coauthors = Shen, Qiang
title = Semantics-Preserving Dimensionality Reduction: Rough and Fuzzy-Rough Based Approaches
journal = IEEE Transactions on Knowledge and Data Engineering
volume = 16
issue = 12
pages = 1457–1471
date = 2004
doi = 10.1109/TKDE.2004.96

*cite journal
last = Nelwamondo
first = Vincent
coauthors = Marwala, Tshilidzi
title = Rough sets: Rough Set Theory for the Treatment of Incomplete Data
journal = Proceedings of the IEEE Conference on Fuzzy Systems.
pages = 338–343
date = 2007

*cite journal
last = Tettey
first = Thando
coauthors = Nelwamondo, F.V. and Marwala, Tshilidzi
title = Rough sets: HIV Data Analysis via Rule Extraction using Rough Sets
journal = Proceedings of IEEE International Conference on Intelligent Engineering Systems
pages = 105–110
date = 2007

*cite journal
last = Pawlak
first = Zdzisław
coauthors = Wong, S. K. M. and Ziarko, Wojciech
title = Rough sets: Probabilistic versus deterministic approach
journal = International Journal of Man-Machine Studies
volume = 29
pages = 81–95
date = 1988

*cite book
last = Pawlak
first = Zdzisław
title = Rough Sets: Theoretical Aspects of Reasoning About Data
publisher = Kluwer Academic Publishing
date = 1991
location = Dordrecht
id = ISBN 0-7923-1472-7

*cite conference
first = Jerzy
last = Stefanowski
title = On rough set based approaches to induction of decision rules
booktitle = Rough Sets in Knowledge Discovery 1: Methodology and Applications
pages = 500–529
publisher = Physica-Verlag
editor = Polkowski, Lech and Skowron, Andrzej
date = 1998
location = Heidelberg

*cite journal
last = Wong
first = S. K. M.
coauthors = Ziarko, Wojciech and Ye, R. Li
title = Comparison of rough-set and statistical methods in inductive learning
journal = International Journal of Man-Machine Studies
volume = 24
pages = 53–72
date = 1986

*cite conference
first = J. T.
last = Yao
coauthors = Yao, Y. Y.
title = Induction of classification rules by granular computing
booktitle = Proceedings of the Third International Conference on Rough Sets and Current Trends in Computing (TSCTC'02)
pages = 331–338
publisher = Springer-Verlag
date = 2002
location = London, UK

*cite conference
first = Wojciech
last = Ziarko
title = Rough sets as a methodology for data mining
booktitle = Rough Sets in Knowledge Discovery 1: Methodology and Applications
pages = 554–576
publisher = Physica-Verlag
date = 1998
location = Heidelberg

*cite conference
first = Wojciech
last = Ziarko
coauthors = Shan, Ning
title = Discovering attribute relationships, dependencies and rules by using rough sets
booktitle = Proceedings of the 28th Annual Hawaii International Conference on System Sciences (HICSS'95)
pages = 293–299
date = 1995
location = Hawaii

External links

* [http://www.roughsets.org The International Rough Set Society]
* [http://www.uit.edu.vn/forum/index.php?act=Attach&type=post&id=19757 Rough set tutorial]
* [http://www.ghastlyfop.com/blog/2006/01/rough-sets-quick-tutorial.html Example-based simple tutorial]


Wikimedia Foundation. 2010.

Look at other dictionaries:

  • Dominance-based rough set approach — (DRSA) is an extension of rough set theory for multi criteria decision analysis (MCDA), introduced by Greco, Matarazzo and Słowiński. [1][2][3] The main change comparing to the classical rough sets is the substitution of the indiscernibility… …   Wikipedia

  • Dominance-based Rough Set Approach — (DRSA) is an extension of rough set theory for Multi Criteria Decision Analysis (MCDA), introduced by Greco, Matarazzo and Słowiński Greco, S., Matarazzo, B., Słowiński, R.: Rough sets theory for multicriteria decision analysis. European Journal… …   Wikipedia

  • Rough fuzzy hybridization — is a method of hybrid intelligent system or soft computing, where Fuzzy set theory is used for linguistic representation of patterns, leading to a fuzzy granulation of the feature space. Rough set theory is used to obtain dependency rules which… …   Wikipedia

  • Set (mathematics) — This article gives an introduction to what mathematicians call intuitive or naive set theory; for a more detailed account see Naive set theory. For a rigorous modern axiomatic treatment of sets, see Set theory. The intersection of two sets is… …   Wikipedia

  • Set theory — This article is about the branch of mathematics. For musical set theory, see Set theory (music). A Venn diagram illustrating the intersection of two sets. Set theory is the branch of mathematics that studies sets, which are collections of objects …   Wikipedia

  • Rough Riders (disambiguation) — Rough Riders refers to the 1st U.S. Volunteer Cavalry regiment during the Spanish American WarRough Riders may also refer to:;Other military units: * The City of London Yeomanry (Rough Riders), a regiment of the British Territorial Army * Rough… …   Wikipedia

  • Rough Science — is a UK factual television series made by the BBC in collaboration with the Open University and has, as of 2005, had six series. It is broadcast in prime time on BBC Two and is considered something of a break out hit for the Open University.… …   Wikipedia

  • Rough Guides — Ltd is a guidebook and reference publisher, owned by Pearson PLC. Their travel titles cover more than 200 destinations, and are distributed worldwide through the Penguin Group. The series began with the 1982 Rough Guide to Greece , a book… …   Wikipedia

  • Rough Trade Records — Saltar a navegación, búsqueda Rough Trade Fundada 1978 Fundador(es) Geoff Travis Jeanette Lee (Socio)[1] Género(s) Post punk, Rock alternativo …   Wikipedia Español

  • Rough for Radio II — is a radio play by Samuel Beckett. It was written in French in 1961 as Pochade radiophonique and published in Minuit 16, November 1975. Beckett translated the work into English shortly before its broadcast on BBC Radio 3 on 13th April 1976.… …   Wikipedia