IBM General Parallel File System

name = IBM GPFS

developer = IBM
latest_release_version = 3.2.1-6
latest_release_date = September 2008
operating_system = AIX / Linux / Microsoft Windows Server 2003 R2
genre = filesystem
license = Proprietary

GPFS (General Parallel File System) is a high-performance shared-disk clustered file system developed by IBM. It is used by many of the supercomputers on the Top 500 List, which ranks the 500 most powerful supercomputers in the world [Schmuck, Frank; Haskin, Roger. "GPFS: A Shared-Disk File System for Large Computing Clusters". Proceedings of the FAST'02 Conference on File and Storage Technologies, pp. 231-244. USENIX, Monterey, California, USA, January 2002. ISBN 1-880446-03-0]. For example, GPFS is the filesystem of the ASC Purple supercomputer [IBM. "Storage Systems - Projects - GPFS". Accessed 2008-06-18], which is composed of more than 12,000 processors and has 2 petabytes of total disk storage spanning more than 11,000 disks.

Like some other cluster filesystems, GPFS provides concurrent high-speed file access to applications executing on multiple nodes of a cluster. It can be used with AIX 5L clusters, Linux clusters, Microsoft Windows Server 2003 R2, or a heterogeneous cluster of AIX, Linux and Windows nodes. In addition to providing filesystem storage capabilities, GPFS provides tools for managing and administering the GPFS cluster and allows shared access to file systems from remote GPFS clusters.

GPFS has been available on AIX since 1998, on Linux since 2001 and on Microsoft Windows Server 2003 R2 (64-bit) since 2008. It is offered as part of the IBM System Cluster 1350.


GPFS began as the Tiger Shark file system, a research project at IBM's Almaden Research Center as early as 1993. Tiger Shark was initially designed to support high-throughput multimedia applications. This design turned out to be well suited to scientific computing [May, John M. Parallel I/O for High Performance Computing. Morgan Kaufmann, 2000, p. 92. ISBN 1558606645].

Another ancestor of GPFS is IBM's Vesta filesystem, developed as a research project at IBM's Thomas J. Watson Research Center between 1992 and 1995 [Corbett, Peter F.; Feitelson, Dror G.; Prost, J.-P.; Baylor, S. J. "Parallel access to files in the Vesta file system". Supercomputing, pp. 472-481. ACM/IEEE, Portland, Oregon, United States, 1993. doi:10.1145/169627.169786]. Vesta introduced the concept of file partitioning to accommodate the needs of parallel applications that run on high-performance multicomputers with parallel I/O subsystems. With partitioning, a file is not a single sequence of bytes, but rather multiple disjoint sequences that may be accessed in parallel. The partitioning abstracts away the number and type of I/O nodes hosting the filesystem, and it allows a variety of logical partitioned views of files, regardless of the physical distribution of data within the I/O nodes. The disjoint sequences are arranged to correspond to individual processes of a parallel application, allowing for improved scalability [Corbett, Peter F.; Feitelson, Dror G. "The Vesta parallel file system". Transactions on Computer Systems 14 (3), pp. 225-264. ACM, August 1996. doi:10.1145/233557.233558].
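The idea of logical partitioned views can be illustrated with a small sketch. This is not Vesta's actual interface, which the text above does not describe; the function names and the two view shapes (interleaved and blocked) are assumptions chosen only to show how one file can be presented as disjoint per-process sequences independently of where the data physically lives.

```python
def interleaved_view(records, num_procs, rank):
    """Hypothetical view: process `rank` sees every num_procs-th record."""
    if not 0 <= rank < num_procs:
        raise ValueError("rank must be in range(num_procs)")
    return records[rank::num_procs]


def block_view(records, num_procs, rank):
    """Hypothetical view: process `rank` sees one contiguous chunk."""
    if not 0 <= rank < num_procs:
        raise ValueError("rank must be in range(num_procs)")
    chunk = -(-len(records) // num_procs)  # ceiling division
    return records[rank * chunk:(rank + 1) * chunk]


if __name__ == "__main__":
    records = list(range(10))  # stand-in for the records of one file
    for rank in range(3):
        print(rank, interleaved_view(records, 3, rank), block_view(records, 3, rank))
```

In both views the per-process sequences are disjoint and together cover the whole file, so each process can read or write its own sequence without coordinating with the others.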

Vesta was commercialized as the PIOFS filesystem around 1994 [Corbett, P. F.; Feitelson, D. G.; Prost, J.-P.; Almasi, G. S.; Baylor, S. J.; Bolmarcich, A. S.; Hsu, Y.; Satran, J.; Snir, M.; Colao, R.; Herr, B. D.; Kavaky, J.; Morgan, T. R.; Zlotek, A. "Parallel file systems for the IBM SP computers". IBM Systems Journal 34 (2), pp. 222-248, 1995], and was succeeded by GPFS around 1998 [Barrios, Marcelo; Jones, Terry; Kinnane, Scott; Landzettel, Mathis; Al-Safran, Safran; Stevens, Jerry; Stone, Christopher; Thomas, Chris; Troppens, Ulf. Sizing and Tuning GPFS. IBM Redbooks, International Technical Support Organization, September 1999; see page 1: "GPFS is the successor to the PIOFS file system"] [Snir, Marc. "Scalable parallel systems: Contributions 1990-2000". HPC seminar, Computer Architecture Department, Universitat Politècnica de Catalunya, June 2001]. The main difference between the older and newer filesystems was that GPFS replaced the specialized interface offered by Vesta/PIOFS with the standard Unix API: all the features that support high-performance parallel I/O were hidden from users and implemented under the hood. Today, GPFS is used by many of the supercomputers listed on the Top 500 Supercomputing Sites web site. It has also been deployed for commercial applications, including digital media, grid analytics and scalable file service.

* GPFS 3.2, September 2007
** GPFS 3.2.1-2, April 2008
** GPFS 3.2.1-4, July 2008
** GPFS 3.2.1-6, September 2008
* GPFS 3.1
* GPFS 2.3.0-29


GPFS provides high performance by allowing data to be accessed from multiple computers at once. Most existing file systems are designed for a single-server environment, and adding more file servers does not improve performance. GPFS provides higher input/output performance by "striping" blocks of data from individual files over multiple disks, and reading and writing these blocks in parallel. Other features provided by GPFS include high availability, support for heterogeneous clusters, disaster recovery, security, the Data Management API (DMAPI), Hierarchical Storage Management (HSM) and Information Lifecycle Management (ILM).
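Striping can be sketched in a few lines. The example below assumes simple round-robin placement of fixed-size blocks, with an illustrative block size and disk count; GPFS's real allocation strategy is not described here, so this is an illustration of the general technique rather than of its implementation.

```python
from collections import defaultdict

BLOCK_SIZE = 256 * 1024  # assumed block size in bytes (illustrative)
NUM_DISKS = 4            # assumed number of disks (illustrative)


def locate(offset, block_size=BLOCK_SIZE, num_disks=NUM_DISKS):
    """Map a byte offset in a file to (disk index, block index on that disk)."""
    file_block = offset // block_size
    return file_block % num_disks, file_block // num_disks


def plan_read(offset, length, block_size=BLOCK_SIZE, num_disks=NUM_DISKS):
    """Group the file blocks touched by a read by the disk that holds them."""
    if length <= 0:
        return {}
    first = offset // block_size
    last = (offset + length - 1) // block_size
    per_disk = defaultdict(list)
    for file_block in range(first, last + 1):
        per_disk[file_block % num_disks].append(file_block)
    return dict(per_disk)


if __name__ == "__main__":
    # A 2 MiB read from offset 0 touches 8 blocks, 2 on each of the 4 disks.
    for disk, blocks in sorted(plan_read(0, 2 * 1024 * 1024).items()):
        print(f"disk {disk}: file blocks {blocks}")
```

Because consecutive blocks land on different disks, a large sequential read turns into several smaller requests that the disks can serve at the same time.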

Information Lifecycle Management (ILM) Tools

GPFS is designed to make data lifecycle management more efficient through policy-driven automation and tiered storage management. Storage pools, filesets and user-defined policies make it possible to match the cost of storage resources to the value of the data.

Storage pools allow for the grouping of disks within a file system. Tiers of storage can be created by grouping disks based on performance, locality or reliability characteristics. For example, one pool could be high performance fibre channel disks and another more economical SATA storage.

A fileset is a sub-tree of the file system namespace and provides a way to partition the namespace into smaller, more manageable units. Filesets provide an administrative boundary that can be used to set quotas and be specified in a policy to control initial data placement or data migration. Data in a single fileset can reside in one or more storage pools. Where the file data resides and how it is migrated is based on a set of rules in a user defined policy.

There are two types of user-defined policies in GPFS: file placement and file management. File placement policies direct file data to the appropriate storage pool as files are created. Placement rules select a pool based on attributes such as the file name, the user name or the fileset. File management policies allow a file's data to be moved or replicated, or the file to be deleted. They can be used to move data from one pool to another without changing the file's location in the directory structure. Management rules select files based on attributes such as last access time, path name or file size.
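The two policy types can be modelled with a small sketch. It does not use GPFS's own policy language; the `FileInfo` fields, the rule list, the first-match-wins ordering and the pool names `fc` and `sata` (echoing the fibre channel and SATA tiers mentioned above) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass
class FileInfo:
    path: str
    owner: str
    fileset: str
    size: int               # bytes
    days_since_access: int
    pool: str = ""          # set by the placement policy at creation


# Hypothetical placement rules, checked in order; the first match wins.
PLACEMENT_RULES = [
    (lambda f: f.fileset == "scratch", "sata"),
    (lambda f: fnmatch(f.path, "*.mpg"), "sata"),
    (lambda f: True, "fc"),  # default rule
]


def place(f):
    """Placement policy: choose the initial pool when a file is created."""
    for predicate, pool in PLACEMENT_RULES:
        if predicate(f):
            f.pool = pool
            return pool
    raise LookupError("no placement rule matched")


def migrate_cold(files, from_pool="fc", to_pool="sata", max_idle_days=30):
    """Management policy: move idle files to another pool; paths do not change."""
    moved = []
    for f in files:
        if f.pool == from_pool and f.days_since_access > max_idle_days:
            f.pool = to_pool
            moved.append(f.path)
    return moved


if __name__ == "__main__":
    report = FileInfo("/gpfs/proj/run.dat", "alice", "proj", 4096, 45)
    place(report)
    print(report.pool, migrate_cold([report]), report.pool, report.path)
```

Note that migration changes only the `pool` attribute: the file keeps its path, which mirrors the point that data can move between pools without changing its place in the directory structure.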

The GPFS policy processing engine is scalable and can be run on many nodes at once. This allows management policies to be applied to a single file system with billions of files and complete in a few hours.
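A rough sketch of why such a scan parallelizes well: the file list can be split into shards, each shard evaluated independently, and the results merged. Here threads stand in for cluster nodes, and the `(path, idle_days)` tuples, the shard layout and the idle-time rule are assumptions for illustration, not a description of how the GPFS engine distributes its work.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, parts):
    """Split items into `parts` interleaved shards of near-equal size."""
    return [items[i::parts] for i in range(parts)]


def scan_shard(shard, max_idle_days):
    """Evaluate one management rule over a shard of (path, idle_days) pairs."""
    return [path for path, idle in shard if idle > max_idle_days]


def parallel_scan(files, workers=4, max_idle_days=30):
    """Run the rule over all shards concurrently and merge the matches."""
    shards = partition(files, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda s: scan_shard(s, max_idle_days), shards))
    return sorted(path for shard_result in results for path in shard_result)


if __name__ == "__main__":
    files = [(f"/gpfs/f{i}", i) for i in range(100)]
    print(parallel_scan(files, workers=4, max_idle_days=97))
```

Each shard is evaluated without reference to the others, so adding workers shortens the scan as long as the metadata can be read in parallel.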

Related articles

* Scale-out File Services, IBM's NAS-grid solution using GPFS

See also

* List of file systems
* Shared disk file system


External links

* GPFS official homepage
* GPFS at Almaden
* GPFS at SourceForge
* Tiger Shark File System
* GPFS Mailing List
* SNMP-based monitoring for GPFS clusters, IBM developerWorks, 2007
* Introduction to GPFS Version 3.2, IBM, September 2007

Wikimedia Foundation. 2010.
