Moose File System

Developer(s): Gemius SA
Stable release: 1.6.20 (January 17, 2011)
Operating system: Linux, FreeBSD, Solaris, OpenSolaris, Mac OS X
Type: Distributed file system
License: GNU General Public License v3
Website: www.moosefs.org

Moose File System (MooseFS) is a distributed file system developed by Gemius SA. The lead developer is Jakub Kruszona-Zawadzki. MooseFS aims to be a fault-tolerant, scalable, POSIX-compliant, general-purpose file system for data centers. The code was initially proprietary; it was open-sourced and released to the public on May 5, 2008.

Design

MooseFS mostly follows the same design principles as Google File System, Lustre or Ceph. The file system comprises the following components:

  • Metadata server (MDS) — manages the location (layout) of files, file access and the namespace (hierarchy). The current version of MooseFS supports neither multiple metadata servers nor failover, so the MDS is a single point of failure. Clients talk to the MDS only to retrieve or update a file's layout and attributes; the data itself is transferred directly between clients and chunk servers. The metadata server is a user-space daemon. The metadata is kept in memory and lazily written to local disk.
  • Metalogger server — periodically pulls the metadata from the MDS and stores it as a backup. This optional component was introduced in version 1.6.5. Eventually it should be possible to turn the metalogger server into a failover MDS by using CARP (Common Address Redundancy Protocol).
  • Chunk servers (CSS) — store the data and optionally replicate it among themselves. There can be many of them, though the scalability limit has not been published. The largest cluster reported so far consists of 75 servers. [1] The chunk server is also a user-space daemon; it relies on the underlying local file system to manage the actual storage.
  • Clients — talk to both the MDS and the CSS. MooseFS clients mount the file system in user space via FUSE.
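
The division of labour described above (layout from the MDS, data directly from chunk servers) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class names, the `read_file` helper and the layout format are hypothetical and do not reflect the actual MooseFS protocol or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ChunkServer:
    """Toy chunk server (hypothetical): chunk data in a dict keyed by chunk id."""
    name: str
    chunks: dict[int, bytes] = field(default_factory=dict)

    def read(self, chunk_id: int) -> bytes:
        return self.chunks[chunk_id]


@dataclass
class MetadataServer:
    """Toy MDS (hypothetical): stores file layouts only, never file data."""
    # path -> ordered list of (chunk id, names of servers holding a replica)
    layouts: dict[str, list[tuple[int, list[str]]]] = field(default_factory=dict)

    def lookup(self, path: str) -> list[tuple[int, list[str]]]:
        return self.layouts[path]


def read_file(path: str, mds: MetadataServer,
              servers: dict[str, ChunkServer]) -> bytes:
    """Ask the MDS for the layout, then fetch every chunk from a chunk server."""
    data = bytearray()
    for chunk_id, replicas in mds.lookup(path):
        for name in replicas:
            server = servers.get(name)  # None models an unreachable server
            if server is not None and chunk_id in server.chunks:
                data += server.read(chunk_id)
                break
        else:
            # No listed replica could serve this chunk.
            raise OSError(f"chunk {chunk_id} of {path} is unavailable")
    return bytes(data)


cs1 = ChunkServer("cs1", {1: b"hello ", 2: b"world"})
cs2 = ChunkServer("cs2", {1: b"hello "})
cs3 = ChunkServer("cs3", {2: b"world"})
mds = MetadataServer({"/demo.txt": [(1, ["cs1", "cs2"]), (2, ["cs1", "cs3"])]})
cluster = {s.name: s for s in (cs1, cs2, cs3)}

print(read_file("/demo.txt", mds, cluster))  # b'hello world'

# With cs1 gone, every chunk still has one surviving replica.
degraded = {name: s for name, s in cluster.items() if name != "cs1"}
print(read_file("/demo.txt", mds, degraded))  # b'hello world'
```

In this model the MDS never handles file contents, which mirrors the statement that data moves directly between clients and chunk servers. It also shows why the MDS is a single point of failure: without `mds.lookup` a client cannot locate any chunk.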

Features

To achieve high reliability and performance, MooseFS offers the following features:

  • Fault tolerance — MooseFS uses replication: data can be replicated across chunk servers, and the replication ratio (N) is set per file or directory. If N-1 replicas fail, the data remains available. At present MooseFS offers no other fault-tolerance technique, such as redundancy via network RAID. Fault tolerance for very large files therefore requires a vast amount of space: N*filesize instead of filesize+(N*stripesize), as would be the case for RAID 4, RAID 5 or RAID 6.
  • Striping — large files are divided into chunks (up to 64 megabytes) that may be stored on different chunk servers in order to achieve higher aggregate bandwidth.
  • Load balancing — MooseFS attempts to use storage resources equally. The current algorithm appears to take only consumed space into account.
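
The arithmetic behind these features can be sketched in a few lines. The following is an illustration only: the function names are hypothetical, "64 megabytes" is assumed here to mean 64 MiB, and the placement rule (pick the servers with the least consumed space) is an assumed reading of the load-balancing description, not the actual MooseFS algorithm.

```python
from __future__ import annotations

MIB = 1024 * 1024
CHUNK_SIZE = 64 * MIB  # assumption: "64 megabytes" taken as 64 MiB


def chunk_count(file_size: int) -> int:
    """Number of chunks needed for a file of file_size bytes."""
    return -(-file_size // CHUNK_SIZE)  # integer ceiling division


def replicated_space(file_size: int, n: int) -> int:
    """Raw space consumed when every byte is stored n times (N*filesize)."""
    return n * file_size


def place_chunk(used_bytes: dict[str, int], n: int) -> list[str]:
    """Pick the n servers with the least consumed space (assumed rule)."""
    if n > len(used_bytes):
        raise ValueError("replication ratio exceeds number of chunk servers")
    # Ties are broken by server name so the result is deterministic.
    return sorted(used_bytes, key=lambda name: (used_bytes[name], name))[:n]


size = 200 * MIB
print(chunk_count(size))                # 4  (64 + 64 + 64 + 8 MiB)
print(replicated_space(size, 3) // MIB)  # 600

used = {"cs1": 500 * MIB, "cs2": 100 * MIB, "cs3": 300 * MIB}
print(place_chunk(used, 2))             # ['cs2', 'cs3']
```

With N = 3, a 200 MiB file occupies 600 MiB of raw storage, which is the N*filesize cost named in the fault-tolerance item.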

Hardware, software and networking

Like other cluster-based file systems, MooseFS servers require nothing more than commodity hardware running a POSIX-compliant operating system. TCP/IP is used as the interconnect.

References
