Sequence mining


Sequence mining is concerned with finding statistically relevant patterns between data examples where the values are delivered in a sequence. The values are usually presumed to be discrete; time series mining is therefore closely related, but is usually considered a different activity. Sequence mining is a special case of structured data mining.

There are two kinds of sequence mining: "string mining" and "itemset mining". String mining is widely used in biology to examine gene and protein sequences, and is primarily concerned with sequences that have a single member at each position. A variety of prominent algorithms align a query sequence with those existing in databases. The alignment may match a query with one subject (e.g. BLAST) or match multiple query sets with each other (e.g. ClustalW). Itemset mining is used more often in marketing and CRM applications, and is concerned with multiple symbols at each position. Itemset mining is also a popular approach to text mining.
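The difference between the two kinds of data can be illustrated with a minimal Python sketch. The sample data and the helper name `contains_pattern` are hypothetical and chosen only for illustration; the helper checks whether an ordered pattern of itemsets occurs in a sequence, which is one common way to define a match, not the only one.

```python
from typing import FrozenSet, Sequence

# String-style sequence: exactly one symbol at each position.
dna = "GATTACA"

# Itemset-style sequence: a set of symbols at each position
# (for example, one shopping basket per customer visit).
visits = [
    frozenset({"bread", "milk"}),
    frozenset({"beer"}),
    frozenset({"bread", "diapers", "milk"}),
]


def contains_pattern(
    sequence: Sequence[FrozenSet[str]],
    pattern: Sequence[FrozenSet[str]],
) -> bool:
    """Return True if every pattern itemset is a subset of a distinct
    sequence element, with the elements appearing in the same order."""
    pos = 0
    for wanted in pattern:
        # Skip ahead to the next element that contains the wanted itemset.
        while pos < len(sequence) and not wanted <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern itemset must match a later element
    return True


# A string is the special case where every itemset has one member.
dna_as_itemsets = [frozenset(symbol) for symbol in dna]
```

Under this definition, `contains_pattern(visits, [frozenset({"milk"}), frozenset({"diapers"})])` holds because milk is bought in an earlier visit than diapers.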

Key problems within this field include building efficient databases and indexes for sequence information, extracting the frequently occurring patterns, comparing sequences for similarity, and recovering missing sequence members.
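As one illustration of comparing sequences for similarity, the sketch below computes the Levenshtein edit distance between two string-style sequences. The article does not name a particular similarity measure, so the choice of edit distance and the function name `edit_distance` are assumptions made for this example.

```python
from typing import Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Minimum number of single-symbol insertions, deletions and
    substitutions needed to turn sequence a into sequence b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty sequence takes i deletions
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete x
                    curr[j - 1] + 1,     # insert y
                    prev[j - 1] + cost,  # substitute x with y, or keep it
                )
            )
        prev = curr
    return prev[-1]
```

A smaller distance means the two sequences are more similar; identical sequences have distance 0.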

Two common techniques applied to sequence databases for frequent itemset mining are the influential Apriori algorithm and the more recent FP-growth technique. However, nothing in these techniques restricts them to sequences per se.
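The level-wise idea behind Apriori can be sketched as follows. This is a simplified illustration, not a reference implementation: the function name `apriori`, the absolute `min_support` count, and the sample baskets are assumptions made for the example. Note that the transactions are treated as unordered sets, which reflects the point that the technique is not specific to sequences.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List


def apriori(
    transactions: Iterable[Iterable[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Return every itemset contained in at least min_support
    transactions, mapped to its support count."""
    baskets: List[FrozenSet[str]] = [frozenset(t) for t in transactions]

    def support(candidate: FrozenSet[str]) -> int:
        return sum(1 for basket in baskets if candidate <= basket)

    # Level 1: frequent single items.
    items = {item for basket in baskets for item in basket}
    level = {frozenset([item]) for item in items}
    level = {c for c in level if support(c) >= min_support}

    frequent: Dict[FrozenSet[str], int] = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c
            for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        level = {c for c in candidates if support(c) >= min_support}
    return frequent


baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
result = apriori(baskets, min_support=3)
```

The prune step relies on the property that every subset of a frequent itemset is itself frequent, so a candidate with an infrequent subset can be discarded without counting it.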

See also

* GSP Algorithm


Wikimedia Foundation. 2010.
