Data architecture
Data architecture, within enterprise architecture, is the design of data for use in defining the target state and the subsequent planning needed to achieve it. It is usually one of several architecture domains that form the pillars of an enterprise architecture or solution architecture.
A data architecture describes the data structures used by a business and/or its applications. It includes descriptions of data in storage and data in motion; descriptions of data stores, data groups, and data items; and mappings of those data artifacts to data qualities, applications, locations, and so on.
Essential to realizing the target state, Data Architecture describes how data is processed, stored, and utilized in a given system. It provides criteria for data processing operations that make it possible to design data flows and also control the flow of data in the system.
The Data Architect is responsible for defining the target state, maintaining alignment during development, and then following up as needed to ensure that enhancements are made in the spirit of the original blueprint.
During the definition of the target state, the Data Architecture breaks a subject down to the atomic level and then builds it back up to the desired form. The Data Architect breaks the subject down by going through three traditional architectural processes:
- Conceptual - represents all business entities.
- Logical - represents the logic of how entities are related.
- Physical - the realization of the data mechanisms for a specific type of functionality.
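As a rough illustration of the three levels above, the sketch below models one small subject at each level. The `Customer` and `Order` entities, their attributes, and the choice of SQLite for the physical level are all assumptions made for this example; they do not come from any particular framework.

```python
import sqlite3
from dataclasses import dataclass

# Conceptual level: only the business entities (hypothetical examples).
CONCEPTUAL_ENTITIES = ["Customer", "Order"]


# Logical level: attributes and relationships, independent of any database product.
@dataclass(frozen=True)
class Relationship:
    parent: str
    child: str
    cardinality: str


LOGICAL_MODEL = {
    "attributes": {
        "Customer": ["customer_id", "name"],
        "Order": ["order_id", "customer_id", "placed_on"],
    },
    "relationships": [Relationship("Customer", "Order", "one-to-many")],
}

# Physical level: DDL for one specific engine (SQLite, chosen only for illustration).
PHYSICAL_DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    placed_on   TEXT NOT NULL
);
"""


def build_physical(conn):
    """Create the physical tables and return their names in sorted order."""
    conn.executescript(PHYSICAL_DDL)
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(build_physical(sqlite3.connect(":memory:")))
```

Each level adds detail to the one before it: the logical model covers exactly the conceptual entities, and the physical tables realize the logical one-to-many relationship as a foreign key.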
The "data" column of the Zachman Framework for enterprise architecture:

| Layer | View | Data (What) | Stakeholder |
|---|---|---|---|
| 1 | Scope/Contextual | List of things important to the business (subject areas) | Planner |
| 2 | Business Model/Conceptual | Semantic model or Conceptual/Enterprise Data Model | Owner |
| 3 | System Model/Logical | Enterprise/Logical Data Model | Designer |
| 4 | Technology Model/Physical | Physical Data Model | Builder |
| 5 | Detailed Representations/out-of-context | Data Definition | Subcontractor |
In this broader sense, data architecture includes a complete analysis of the relationships between an organization's functions, available technologies, and data types.
Data architecture should be defined in the planning phase of the design of a new data processing and storage system. The major types and sources of data necessary to support an enterprise should be identified in a manner that is complete, consistent, and understandable. The primary requirement at this stage is to define all of the relevant data entities, not to specify computer hardware items. A data entity is any real or abstracted thing about which an organization or individual wishes to store data.
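One way to make "complete, consistent, and understandable" checkable is to keep the entity inventory as structured data and validate it. The following is a minimal sketch under that assumption; the `DataEntity` fields and the specific checks are hypothetical choices for illustration, not a standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataEntity:
    name: str        # the real or abstracted thing the organization stores data about
    definition: str  # plain-language meaning, so the inventory stays understandable
    source: str      # where the data originates


def inventory_issues(entities):
    """Return a list of problems found in an entity inventory."""
    issues = []
    seen = set()
    for entity in entities:
        key = entity.name.strip().lower()
        if key in seen:
            # The same entity listed twice is a consistency problem.
            issues.append(f"duplicate entity: {entity.name}")
        seen.add(key)
        if not entity.definition.strip():
            issues.append(f"missing definition: {entity.name}")
        if not entity.source.strip():
            issues.append(f"missing source: {entity.name}")
    return issues


if __name__ == "__main__":
    sample = [
        DataEntity("Customer", "A party that buys products", "CRM"),
        DataEntity("Policy", "", "Underwriting"),
        DataEntity("customer", "A buyer", ""),
    ]
    for issue in inventory_issues(sample):
        print(issue)
```

Note that nothing in the inventory mentions hardware or a database product, in keeping with the planning-phase focus on entities.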
Data Architecture Topics
Physical data architecture
Physical data architecture of an information system is part of a technology plan. As its name implies, the technology plan is focused on the actual tangible elements to be used in the implementation of the data architecture design. Physical data architecture encompasses database architecture. Database architecture is a schema of the actual database technology that will support the designed data architecture.
Elements of data architecture
There are certain elements that must be defined as the data architecture schema of an organization is designed. For example, the administrative structure that will be established in order to manage the data resources must be described. Also, the methodologies that will be employed to store the data must be defined. In addition, a description of the database technology to be employed must be generated, as well as a description of the processes that will manipulate the data. It is also important to design the interfaces through which other systems access the data, as well as the infrastructure that will support common data operations (e.g. emergency procedures, data imports, data backups, external transfers of data).
Without the guidance of a properly implemented data architecture design, common data operations might be implemented in different ways, making it difficult to understand and control the flow of data within such systems. This sort of fragmentation is highly undesirable because of the potential for increased cost and the data disconnects involved. These difficulties may be encountered in rapidly growing enterprises and in enterprises that service different lines of business (e.g. insurance products).
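A common way to keep such operations uniform is to define them once as a shared interface that every data store implements. The sketch below assumes this approach; the `DataOperations` interface and the `SqliteStore` class are hypothetical names, and SQLite is used only because it ships with Python.

```python
import sqlite3
from abc import ABC, abstractmethod


class DataOperations(ABC):
    """Common data operations that every store is expected to implement the same way."""

    @abstractmethod
    def import_rows(self, rows):
        """Load rows into the store and return the resulting row count."""

    @abstractmethod
    def backup(self):
        """Return a connection to an independent copy of the store."""


class SqliteStore(DataOperations):
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE item (id INTEGER PRIMARY KEY, label TEXT NOT NULL)"
        )

    def import_rows(self, rows):
        # Using the connection as a context manager commits on success
        # and rolls back if an exception is raised.
        with self.conn:
            self.conn.executemany("INSERT INTO item (id, label) VALUES (?, ?)", rows)
        return self.conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]

    def backup(self):
        target = sqlite3.connect(":memory:")
        self.conn.backup(target)  # copies the whole database into target
        return target


if __name__ == "__main__":
    store = SqliteStore()
    print(store.import_rows([(1, "a"), (2, "b")]))
    copy = store.backup()
    print(copy.execute("SELECT label FROM item ORDER BY id").fetchall())
```

A second store built on a different technology would implement the same two methods, so callers would not need to know which technology sits behind the interface.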
Properly executed, the data architecture phase of information system planning forces an organization to specify and delineate both internal and external information flows. These are patterns that the organization may not have previously taken the time to conceptualize. It is therefore possible at this stage to identify costly information shortfalls, disconnects between departments, and disconnects between organizational systems that may not have been evident before the data architecture analysis.
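Once information flows are written down explicitly, shortfalls can be found mechanically. The following is a minimal sketch, assuming flows are recorded as (source, target, entity) triples and that each system declares the entities it needs; the system and entity names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    source: str  # system or department that sends the data
    target: str  # system or department that receives it
    entity: str  # the data entity being moved


def find_gaps(flows, required):
    """Map each system to the sorted list of required entities it never receives.

    `required` maps a system name to the set of entities it needs.
    Systems with no shortfall are left out of the result.
    """
    received = {}
    for flow in flows:
        received.setdefault(flow.target, set()).add(flow.entity)

    gaps = {}
    for system, needs in required.items():
        missing = needs - received.get(system, set())
        if missing:
            gaps[system] = sorted(missing)
    return gaps


if __name__ == "__main__":
    flows = [
        Flow("CRM", "Billing", "Customer"),
        Flow("Orders", "Billing", "Order"),
    ]
    required = {
        "Billing": {"Customer", "Order"},
        "Shipping": {"Order", "Address"},
    }
    print(find_gaps(flows, required))
```

In this example, Shipping receives nothing at all, which is the kind of disconnect between organizational systems that might not be evident before the flows are specified.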
Constraints and influences
Various constraints and influences will have an effect on data architecture design. These include enterprise requirements, technology drivers, economics, business policies and data processing needs.
- Enterprise requirements
- These will generally include such elements as economical and effective system expansion, acceptable performance levels (especially system access speed), transaction reliability, and transparent management of data. In addition, the conversion of raw data such as transaction records and image files into more useful information forms through such features as data warehouses is also a common organizational requirement, since this enables managerial decision making and other organizational processes. One of the architecture techniques is the split between managing transaction data and (master) reference data. Another one is splitting data capture systems from data retrieval systems (as done in a data warehouse).
- Technology drivers
- These are usually suggested by the completed data architecture and database architecture designs. In addition, some technology drivers will derive from existing organizational integration frameworks and standards, organizational economics, and existing site resources (e.g. previously purchased software licensing).
- Economics
- These are also important factors that must be considered during the data architecture phase. It is possible that some solutions, while optimal in principle, may not be potential candidates due to their cost. External factors such as the business cycle, interest rates, market conditions, and legal considerations could all have an effect on decisions relevant to data architecture.
- Business policies
- Business policies that also drive data architecture design include internal organizational policies, rules of regulatory bodies, professional standards, and applicable governmental laws, which can vary by agency. These policies and rules help describe the manner in which an enterprise wishes to process its data.
- Data processing needs
- These include accurate and reproducible transactions performed in high volumes, data warehousing for the support of management information systems (and potential data mining), repetitive periodic reporting, ad hoc reporting, and support of various organizational initiatives as required (e.g. annual budgets, new product development).
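The two techniques mentioned under enterprise requirements, separating transaction data from master reference data and separating data capture from data retrieval, can be sketched together. The example below is only an illustration under assumed names: the `product` and `sale` tables, the `sales_by_product` summary, and the use of two in-memory SQLite databases are all hypothetical.

```python
import sqlite3


def build_capture_store():
    """Capture side: master reference data (product) kept apart from transactions (sale)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE sale (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES product(product_id),
            quantity   INTEGER NOT NULL
        );
        """
    )
    conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, "widget"), (2, "gadget")])
    conn.executemany(
        "INSERT INTO sale (product_id, quantity) VALUES (?, ?)",
        [(1, 3), (2, 1), (1, 2)],
    )
    conn.commit()
    return conn


def load_reporting_store(capture):
    """Retrieval side: a separate store holding data already shaped for reporting."""
    report = sqlite3.connect(":memory:")
    report.execute(
        "CREATE TABLE sales_by_product ("
        "name TEXT PRIMARY KEY, total_quantity INTEGER NOT NULL)"
    )
    rows = capture.execute(
        """
        SELECT p.name, SUM(s.quantity)
        FROM sale AS s
        JOIN product AS p ON p.product_id = s.product_id
        GROUP BY p.name
        ORDER BY p.name
        """
    ).fetchall()
    report.executemany("INSERT INTO sales_by_product VALUES (?, ?)", rows)
    report.commit()
    return report


if __name__ == "__main__":
    reporting = load_reporting_store(build_capture_store())
    print(reporting.execute("SELECT * FROM sales_by_product ORDER BY name").fetchall())
```

Reports query only the reporting store, so retrieval workloads do not touch the tables that record transactions.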
- Enterprise Information Security Architecture - (EISA) positions data security in the enterprise information framework.
- FDIC Enterprise Architecture Framework
- Controlled vocabulary
Wikimedia Foundation. 2010.