Data mining in meteorology


Meteorology is the interdisciplinary scientific study of the atmosphere. It observes changes in temperature, air pressure, moisture and wind direction. Temperature, pressure, wind and humidity are usually measured with a thermometer, barometer, anemometer and hygrometer, respectively. There are many other methods of collecting data; radar, lidar and satellites are some of them.

Weather forecasts are made by collecting quantitative data about the current state of the atmosphere. The main issue that arises in this prediction is that the data are high-dimensional. To overcome this issue, the data must first be analyzed and simplified before any further analysis is carried out. Some data mining techniques are appropriate in this context.

What is Data mining?

Data mining is the extraction of hidden predictive information from large databases, and it has great potential for analyzing the information held in data warehouses. It consists of more than collecting and managing data; it also includes analysis and prediction. The tools used for this analysis include statistical models, mathematical algorithms and machine learning methods. Machine learning methods are algorithms that improve their performance automatically through experience, such as neural networks or decision trees.[1]

The network architectures and signal processes used to model nervous systems can roughly be divided into three categories, each based on a different philosophy.

  1. Feedforward network: the network transforms a set of input signals into a set of output signals.[2]
  2. Feedback network: the input information defines the initial activity state of a feedback system, and after state transitions, the asymptotic final state is identified as the outcome of the computation.[3]
  3. Competitive network: neighboring cells in a neural network compete in their activities by means of mutual lateral interactions, and develop adaptively into specific detectors of different signal patterns. In this category, learning is called competitive, unsupervised or self-organizing.[4]
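
The difference between the first two categories can be illustrated with a toy sketch. It is not taken from the cited sources: the function names, the sign threshold and the Hebbian weight rule are assumptions chosen only to keep the example small. The feedforward function maps an input to an output in a single pass, while the feedback function re-applies the same map to its own output until the state stops changing.

```python
def feedforward(weights, x):
    """Single pass: each output is the sign of a weighted sum of the inputs."""
    return [1 if sum(w * xi for w, xi in zip(row, x)) >= 0 else -1
            for row in weights]


def feedback(weights, state, max_steps=20):
    """Feed the output back in as the next input until the state is stable."""
    for _ in range(max_steps):
        new_state = feedforward(weights, state)
        if new_state == state:  # stable final state reached
            break
        state = new_state
    return state


def hebbian_weights(pattern):
    """Hypothetical helper: store one +1/-1 pattern (assumed Hebbian rule)."""
    n = len(pattern)
    return [[0 if i == j else pattern[i] * pattern[j] for j in range(n)]
            for i in range(n)]


stored = [1, -1, 1, -1]
weights = hebbian_weights(stored)
noisy = [1, -1, 1, 1]  # last element flipped
print(feedback(weights, noisy))
```

In this sketch the input only sets the initial state of the feedback system; the result of the computation is the state at which the iteration settles.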

Self-organizing Maps

The Self-Organizing Map (SOM) is one of the most popular neural network models and is especially suitable for visualizing, clustering and modeling high-dimensional data. It uses unsupervised learning to create a set of prototype vectors that represent the data. The SOM was introduced to the meteorological and climatic sciences in the late 1990s as a clustering and pattern recognition method.[5] Self-organizing maps have since been applied to several meteorological problems, such as classifying climate modes, cloud classification,[6] classification of TEMP data,[7] and the analysis of extreme weather and rainfall patterns.
The Self-Organizing Map projects high-dimensional input data onto a low-dimensional (usually two-dimensional) space.[8] Because it preserves the neighborhood relations of the input data, the SOM is a topology-preserving technique. Several topologies are used in SOMs, including grid, hexagonal and random.[9] The output neurons are arranged according to the chosen topology, and the distances between neurons are calculated using a distance function.[10] Distance functions that can be used include Euclidean distance, box distance, link distance and Manhattan distance.
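
As a minimal sketch of how neuron positions and distances can be computed, the following lays out the output neurons on a rectangular grid and measures three of the distances named above. The function names are hypothetical, and "box distance" is assumed here to mean the largest coordinate difference between two positions. Link distance is omitted because it depends on which neurons the chosen topology connects.

```python
import math


def grid_positions(rows, cols):
    """Coordinates of the output neurons on a rectangular grid topology."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def euclidean(a, b):
    """Straight-line distance between two neuron positions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of the absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def box(a, b):
    """Largest absolute coordinate difference (assumed meaning of box distance)."""
    return max(abs(x - y) for x, y in zip(a, b))


positions = grid_positions(3, 3)
first, last = positions[0], positions[-1]  # opposite corners of the grid
for name, fn in (("euclidean", euclidean), ("manhattan", manhattan), ("box", box)):
    print(name, fn(first, last))
```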
For each input vector, the system first chooses the output neuron (the winning neuron) whose weight vector most closely matches the input. It then determines a neighborhood of excited neurons around the winner and, finally, updates the weights of all excited neurons. A neighborhood function must be selected to determine which nodes are “nearest” to the winner.[11] Some neighborhood functions are the Gaussian, the bubble and the EP.[12] After training, the weight vectors of the SOM nodes are reshaped back into the form of the input data, giving characteristic data patterns. This learning procedure leads to a topologically ordered mapping of the input data: similar patterns are mapped onto neighboring regions of the map, while dissimilar patterns are located further apart.
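
The learning procedure can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the method of any cited paper: the function names, the rectangular grid, the Gaussian neighborhood, the linearly decaying learning rate and neighborhood width, and all default parameter values are choices made for the example.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to input x."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def gaussian_neighborhood(pos_a, pos_b, sigma):
    """Excitation of a neuron at pos_b when the winner sits at pos_a."""
    d2 = sum((p - q) ** 2 for p, q in zip(pos_a, pos_b))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM; the schedules and defaults are assumptions."""
    rng = random.Random(seed)
    dim = len(data[0])
    positions = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in positions]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                 # learning rate shrinks over time
        sigma = sigma0 * (1.0 - frac) + 0.1     # neighborhood narrows over time
        for x in rng.sample(data, len(data)):   # present inputs in random order
            win = best_matching_unit(weights, x)
            for i, pos in enumerate(positions):
                h = gaussian_neighborhood(positions[win], pos, sigma)
                # Move each excited neuron toward the input, scaled by h.
                weights[i] = [w + lr * h * (xi - w)
                              for w, xi in zip(weights[i], x)]
    return positions, weights


# Made-up two-variable observations, scaled to the range [0, 1].
samples = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
positions, weights = train_som(samples, rows=2, cols=2)
for x in samples:
    print(x, "->", positions[best_matching_unit(weights, x)])
```

Each trained weight vector has the same length as an input vector, so it can be read directly as a prototype pattern for the region of the map it occupies.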

References

  1. Seifert, W. (2004). "Data Mining: An Overview". CRS.
  2. Kohonen, T. (2002). "The Self-Organizing Map". IEEE: pp. 1464–1480.
  3. Kohonen, T. (2002). "The Self-Organizing Map". IEEE: pp. 1464–1480.
  4. Liu, Y., & Weisberg, R. H. (2011). "A Review of Self-Organizing Map Applications in Meteorology and Oceanography". Self Organizing Maps - Applications and Novel Algorithm Design.
  5. Cofiño, A., Gutiérrez, J., Jakubiak, B., & Melonek, M. (2003). "Implementation of Data Mining Techniques for Meteorological Applications". World Scientific: pp. 215–240.
  6. Hong, Y., Hsu, K., Sorooshian, S., & Gao, X. (2004). "Precipitation Estimation from Remotely Sensed Imagery Using an Artificial Neural.". Journal of Applied Meteorology 43: pp. 1834–1852.
  7. Lahoz, D., & Miguel, M. S. (2004). "Classification TEMP Data with Self-Organizing Maps". Monografías del Seminario Matemático García de Galdeano: pp. 389–397.
  8. Liu, Y., & Weisberg, R. H. (2011). "A Review of Self-Organizing Map Applications in Meteorology and Oceanography". Self Organizing Maps - Applications and Novel Algorithm Design.
  9. Lahoz, D., & Miguel, M. S. (2004). "Classification TEMP Data with Self-Organizing Maps". Monografías del Seminario Matemático García de Galdeano: pp. 389–397.
  10. Lahoz, D., & Miguel, M. S. (2004). "Classification TEMP Data with Self-Organizing Maps". Monografías del Seminario Matemático García de Galdeano: pp. 389–397.
  11. Lahoz, D., & Miguel, M. S. (2004). "Classification TEMP Data with Self-Organizing Maps". Monografías del Seminario Matemático García de Galdeano: pp. 389–397.
  12. Lahoz, D., & Miguel, M. S. (2004). "Classification TEMP Data with Self-Organizing Maps". Monografías del Seminario Matemático García de Galdeano: pp. 389–397.

Wikimedia Foundation. 2010.
