- Text simplification
Text simplification is an operation used in natural language processing to modify, enhance, classify or otherwise process an existing corpus of human-readable text in such a way that the grammar and structure of the prose are greatly simplified, while the underlying meaning and information remain the same. Text simplification is an important area of research because natural human languages ordinarily contain complex compound constructions that are not easily processed through automation.
The following example illustrates text simplification. The first sentence contains two relative clauses and one conjoined verb phrase. A text simplification system aims to rewrite it as the second passage, which expresses the same content in four shorter sentences.
* "Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents, which precedes the full purchasing agents report that is due out today and gives an indication of what the full report might hold."
* "Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents. The Chicago report precedes the full purchasing agents report. The Chicago report gives an indication of what the full report might hold. The full report is due out today."
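The kind of rewrite shown in the example can be sketched as a single hand-written rule. This is a toy illustration only, not the method of any particular system: the function name `split_relative_clause` and its `referent` parameter are hypothetical, and the rule relies on plain string matching where a real system would use a syntactic parse. It splits off a non-restrictive ", which ..." clause and separates verb phrases joined by "and"; it does not handle the restrictive clause "that is due out today", so it yields three sentences rather than the four shown above.

```python
def split_relative_clause(sentence: str, referent: str) -> list[str]:
    """Split 'MAIN, which VP1 and VP2.' into shorter sentences.

    `referent` is the noun phrase that replaces 'which'. A real system
    would find it by parsing; here the caller supplies it (assumption).
    """
    main, sep, clause = sentence.rstrip(".").partition(", which ")
    if not sep:
        # The rule does not apply; return the sentence unchanged.
        return [sentence]
    # Naive assumption: every " and " in the clause joins two verb phrases.
    verb_phrases = clause.split(" and ")
    return [main + "."] + [f"{referent} {vp}." for vp in verb_phrases]


original = (
    "Also contributing to the firmness in copper, the analyst noted, "
    "was a report by Chicago purchasing agents, which precedes the full "
    "purchasing agents report that is due out today and gives an "
    "indication of what the full report might hold."
)

for simplified in split_relative_clause(original, "The Chicago report"):
    print(simplified)
```

Taking the referent as an argument keeps the sketch honest about its limits: choosing the right noun phrase, and deciding which occurrences of "and" actually join verb phrases, are the hard parts that string matching cannot solve.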
See also: Controlled natural language
* [http://www1.cs.columbia.edu/~as372/LEC02.pdf An Architecture for a Text Simplification System]
* [http://repository.upenn.edu/cgi/viewcontent.cgi?article=1110&context=ircs_reports Automatic Induction of Rules for Text Simplification]
* [http://www.isi.edu/~marcu/papers/factoids04.pdf Text Simplification for Information-Seeking Applications]
Wikimedia Foundation. 2010.