Denormalization

In computing, denormalisation is the process of attempting to optimise the read performance of a database by adding redundant data or by grouping data.[1][2] In some cases, denormalisation helps cover up the inefficiencies inherent in relational database software. Even when it is well tuned for high performance, a normalised relational database can impose a heavy access load on the physical storage of data.

A normalised design will often store different but related pieces of information in separate logical tables (called relations). If these relations are stored physically as separate disk files, completing a database query that draws information from several relations (a join operation) can be slow. If many relations are joined, it may be prohibitively slow. There are two strategies for dealing with this. The preferred method is to keep the logical design normalised, but allow the database management system (DBMS) to store additional redundant information on disk to optimise query response. In this case it is the DBMS software's responsibility to ensure that any redundant copies are kept consistent. This method is often implemented in SQL as indexed views (Microsoft SQL Server) or materialised views (Oracle). A view represents information in a format convenient for querying, and the index ensures that queries against the view are optimised.
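As a rough illustration of the join described above, the following sketch builds two normalised relations and answers a query that needs both. It uses Python's standard sqlite3 module with an in-memory database; the table names, column names and sample rows (customers, orders, total_cents) are hypothetical and chosen only for this example.

```python
import sqlite3

# Hypothetical normalised schema: each fact is stored exactly once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total_cents INTEGER NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders (id, customer_id, total_cents) VALUES
        (10, 1, 2500), (11, 1, 4000), (12, 2, 1500);
""")


def orders_with_names(connection):
    """Return (order id, customer name, total) rows; needs a join."""
    return connection.execute("""
        SELECT o.id, c.name, o.total_cents
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id
        ORDER BY o.id
    """).fetchall()


rows = orders_with_names(conn)
for row in rows:
    print(row)
```

Every read of an order together with its customer name pays for the join. With two small tables the cost is negligible, but it is this per-query work that the two strategies described here try to reduce.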

The more usual approach is to denormalise the logical data design. With care this can achieve a similar improvement in query response, but at a cost: it is now the database designer's responsibility to ensure that the denormalised database does not become inconsistent. This is done by creating rules in the database, called constraints, that specify how the redundant copies of information must be kept synchronised. The increased logical complexity of the database design and the added complexity of the additional constraints are what make this approach hazardous. Moreover, the approach introduces a trade-off: the redundant data speeds up reads (SELECT in SQL), while maintaining the constraints slows down writes (INSERT, UPDATE, and DELETE). This means a denormalised database under heavy write load may actually offer worse performance than its functionally equivalent normalised counterpart.
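The synchronisation burden can be sketched as follows. This is a minimal example under stated assumptions, not a recommended design: the schema is hypothetical, orders.customer_name is the redundant copy, and a trigger stands in for the synchronisation rule (the text speaks of constraints in general; a trigger is just one mechanism that SQLite offers).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- customer_name duplicates customers.name so reads can skip the join.
    CREATE TABLE orders (
        id            INTEGER PRIMARY KEY,
        customer_id   INTEGER NOT NULL REFERENCES customers(id),
        customer_name TEXT NOT NULL,
        total_cents   INTEGER NOT NULL
    );
    -- Synchronisation rule: propagate a rename to every redundant copy.
    CREATE TRIGGER sync_customer_name
    AFTER UPDATE OF name ON customers
    BEGIN
        UPDATE orders SET customer_name = NEW.name
        WHERE customer_id = NEW.id;
    END;
    INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders (id, customer_id, customer_name, total_cents) VALUES
        (10, 1, 'Ada', 2500), (11, 1, 'Ada', 4000), (12, 2, 'Grace', 1500);
""")

# One logical write to customers now also rewrites two rows in orders.
conn.execute("UPDATE customers SET name = 'Ada Lovelace' WHERE id = 1")

# The read no longer needs a join.
names = conn.execute(
    "SELECT id, customer_name FROM orders ORDER BY id"
).fetchall()
print(names)
```

The read becomes a single-table query, but each rename fans out to every order of that customer, which is the write-side cost described above. The sketch also leaves gaps a real design would have to close, for example checking that an inserted customer_name matches the referenced customer.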

A denormalised data model is not the same as a data model that has not been normalised. Denormalisation should take place only after a satisfactory level of normalisation has been reached and any required constraints and/or rules have been created to deal with the inherent anomalies in the design. For example, all the relations should be in third normal form, and any relations with join and multi-valued dependencies should be handled appropriately.

Examples of denormalisation techniques include:

  • Materialised views, which may implement the following:
    • Storing the count of the "many" objects in a one-to-many relationship as an attribute of the "one" relation
    • Adding attributes to a relation from another relation with which it will be joined
  • Star schemas, which are also known as fact-dimension models and have been extended to snowflake schemas
  • Prebuilt summarisation or OLAP cubes
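The first technique in the list, storing the count of the "many" objects on the "one" relation, might look like the sketch below. The authors and posts tables and the post_count column are hypothetical, and the triggers are only one possible way to maintain the stored count in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        post_count INTEGER NOT NULL DEFAULT 0  -- redundant, derived value
    );
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title     TEXT NOT NULL
    );
    CREATE TRIGGER posts_after_insert AFTER INSERT ON posts
    BEGIN
        UPDATE authors SET post_count = post_count + 1
        WHERE id = NEW.author_id;
    END;
    CREATE TRIGGER posts_after_delete AFTER DELETE ON posts
    BEGIN
        UPDATE authors SET post_count = post_count - 1
        WHERE id = OLD.author_id;
    END;
""")

conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO posts (author_id, title) VALUES (?, ?)",
    [(1, "first"), (1, "second"), (1, "third")],
)
conn.execute("DELETE FROM posts WHERE title = 'second'")

# Stored count: a single-row lookup on authors.
stored = conn.execute(
    "SELECT post_count FROM authors WHERE id = 1"
).fetchone()[0]
# Derived count: what a normalised design would compute on every read.
derived = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE author_id = 1"
).fetchone()[0]
print(stored, derived)
```

The two values agree only as long as every write path is covered. This sketch handles inserts and deletes but not an UPDATE that moves a post to another author, which would need a further trigger.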

Denormalisation techniques are often used to improve the scalability of Web applications.[3]

References

  1. G. L. Sanders and S. K. Shin. Denormalization effects on performance of RDBMS. In Proceedings of the HICSS Conference, January 2001.
  2. S. K. Shin and G. L. Sanders. Denormalization strategies for data retrieval from data warehouses. Decision Support Systems, 42(1):267-282, October 2006.
  3. Z. Wei, J. Dejun, G. Pierre, C.-H. Chi and M. van Steen. Service-Oriented Data Denormalization for Scalable Web Applications. In Proceedings of the International World-Wide Web conference, April 2008.

Wikimedia Foundation. 2010.
