A quick glance at filesystems…

Cluster vs. Distributed Filesystems

Cluster filesystems and distributed filesystems are two different things, but sometimes they’re mixed together, so it’s time to shed some light on that…

Normal Filesystem (1-to-1)

ext2, ext3, FAT[16,32] and NTFS are all normal filesystems, where one OS kernel interfaces with one filesystem on top of one (or more) logical volumes (let’s call it a block device)…

FS - normal FS

How it works

  • One system is connected to the block device
  • One kernel handles physical I/O
  • One kernel accesses the filesystem at any point in time
  • Filesystem cache is local to the OS kernel
  • Filesystem cache is maintained by the OS kernel

Shared Filesystems

These are network-based: they allow one or more hosts to remotely access files that exist on another host, and to mount them on local folders …

NFS and CIFS (Samba) are two big examples of this…

  • One host OS exports local files/folders to other nodes on the network
  • One or more client OS(s) access the exported files and mount them locally


How it works

  • One system is connected to the block device
  • One kernel handles physical I/O (host OS)
  • One kernel accesses the filesystem (on disk) at any point in time
  • Multiple kernels access the exported files via network
  • Filesystem cache is local to the host OS kernel
  • Filesystem cache is maintained by the host OS kernel
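The shared model above can be sketched as a toy simulation. This is only an illustration, not how NFS or CIFS are implemented: the `FileServer` and `Client` classes, the `/export/motd` path and the `online` flag are all made up for this example. It shows that every client request passes through the single host OS that owns the disk.

```python
class FileServer:
    """Hypothetical host OS: the only system attached to the block device."""

    def __init__(self, exports):
        self.files = dict(exports)  # path -> contents, stored on the host's own disk
        self.online = True
        self.requests_served = 0

    def handle_read(self, path):
        # Every client request ends up here, on the one host kernel.
        if not self.online:
            raise ConnectionError("file server is unreachable")
        self.requests_served += 1
        return self.files[path]


class Client:
    """Hypothetical client OS: no disk access, only network requests."""

    def __init__(self, name, server):
        self.name = name
        self.server = server

    def read(self, path):
        return self.server.handle_read(path)


server = FileServer({"/export/motd": "hello"})
clients = [Client(f"client{i}", server) for i in range(3)]

# All three clients are served by the same host.
contents = [c.read("/export/motd") for c in clients]

# If the host goes away, no client can reach the files any more.
server.online = False
try:
    clients[0].read("/export/motd")
    outage_seen = False
except ConnectionError:
    outage_seen = True
```

After the run, `server.requests_served` is 3 and `outage_seen` is `True`: one host carries all the load, and losing it cuts off every client.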

Oops, Problems!

The shared filesystem model introduces several issues:

  • I/O bottlenecks
  • network latency
  • the host OS is a single point of failure

The obvious fix is to let different OS kernels access the same physical device, but this leads to one of the following situations:

  • If the I/O cache is used, then each kernel will have its own view of the filesystem, and the kernels will eventually corrupt the filesystem on disk
  • If the I/O cache is disabled, then the filesystem will remain intact, BUT this will be a huge performance hit
  • If only one kernel is allowed to access the filesystem at a time, then this is only useful in HA (high availability) clusters, and there is still no load balancing
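The first situation can be illustrated with a small simulation. This is a sketch under simplifying assumptions: `BlockDevice` and `CachingKernel` are hypothetical names, the "disk" is a list of integers, and block 0 stands in for a piece of filesystem metadata (say, a counter of allocated blocks). Each kernel keeps a private write-back cache and knows nothing about the other.

```python
class BlockDevice:
    """Hypothetical shared disk: a list of blocks, each holding an integer."""

    def __init__(self, nblocks):
        self.blocks = [0] * nblocks


class CachingKernel:
    """Hypothetical kernel with a private write-back cache."""

    def __init__(self, device):
        self.device = device
        self.cache = {}     # block number -> cached value
        self.dirty = set()  # blocks modified in cache but not yet on disk

    def read(self, n):
        # Served from the local cache when possible; the disk is not re-checked.
        if n not in self.cache:
            self.cache[n] = self.device.blocks[n]
        return self.cache[n]

    def write(self, n, value):
        # Write-back: only the local cache is updated for now.
        self.cache[n] = value
        self.dirty.add(n)

    def flush(self):
        for n in sorted(self.dirty):
            self.device.blocks[n] = self.cache[n]
        self.dirty.clear()


disk = BlockDevice(4)
kernel_a = CachingKernel(disk)
kernel_b = CachingKernel(disk)

# Both kernels increment the counter in block 0 through their own caches.
kernel_a.write(0, kernel_a.read(0) + 1)
kernel_b.write(0, kernel_b.read(0) + 1)
kernel_a.flush()
kernel_b.flush()

print(disk.blocks[0])  # prints 1, although two increments were made
```

Each kernel saw a counter of 0, so one of the two updates is silently lost. On a real filesystem the same effect on allocation maps or directory entries means on-disk corruption.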

Direct Attached Storage – HA Cluster

Here both systems are directly attached to the block device, and both systems can perform I/O on it…

FS - normal - HA

A normal filesystem is used here, but to keep it intact only one OS is allowed to mount the filesystem at any given time…
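The "one mounter at a time" rule can be sketched as follows. This is a minimal model, not the behaviour of any particular cluster manager: `SharedVolume`, `fail_over` and the node names are hypothetical, and real HA setups need extra machinery to make sure a failed node has really stopped writing.

```python
class SharedVolume:
    """Hypothetical block device attached to both nodes, mountable by one."""

    def __init__(self):
        self.mounted_by = None

    def mount(self, node):
        # Refuse a second mounter so two caches never see the same filesystem.
        if self.mounted_by is not None and self.mounted_by != node:
            raise RuntimeError(f"already mounted by {self.mounted_by}")
        self.mounted_by = node

    def unmount(self, node):
        if self.mounted_by == node:
            self.mounted_by = None


def fail_over(volume, failed_node, standby_node):
    """Move the mount from the failed node to the standby node."""
    volume.unmount(failed_node)
    volume.mount(standby_node)


volume = SharedVolume()
volume.mount("node1")

# The standby node is attached to the disk but may not mount it yet.
try:
    volume.mount("node2")
    second_mount_refused = False
except RuntimeError:
    second_mount_refused = True

# When node1 fails, node2 takes over the filesystem.
fail_over(volume, "node1", "node2")
```

Only one node ever does work at a time, which is why this gives availability but no load balancing.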

Cluster Filesystems

These overcome the cache inconsistency issue across different kernels by having cache synchronization/flushing mechanisms in place (but this isn’t the place to discuss how they work internally) …

To do so, cluster filesystems have unique designs and implementations that allow this kind of concurrency …

All nodes have direct access to the physical volume, all nodes can mount the filesystem at once, and all of them can perform I/O on the filesystem with cache enabled…

Examples of cluster filesystems: GFS (Red Hat – Open Source), OCFS (Oracle – Open Source), Veritas (Veritas – Commercial license) …

FS - Cluster FS

How it works

  • All systems have access to the same block device
  • Multiple kernels handle physical I/O
  • Multiple kernels access the filesystem (on disk) at any point in time
  • Filesystem cache is local to each OS kernel
  • Filesystem cache is synchronized across all systems accessing the FS
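One very simple way to picture "synchronized caches" is write-through plus invalidation: a node that writes a block pushes it to disk and tells every other node to drop its cached copy. The sketch below assumes exactly that scheme. It is not how GFS, OCFS or Veritas work internally, `BlockDevice` and `ClusterNode` are hypothetical names, and it ignores the distributed locking a real cluster filesystem needs for truly concurrent writers.

```python
class BlockDevice:
    """Hypothetical shared disk: a list of blocks, each holding an integer."""

    def __init__(self, nblocks):
        self.blocks = [0] * nblocks


class ClusterNode:
    """Hypothetical node: local cache, kept coherent by invalidation."""

    def __init__(self, device, members):
        self.device = device
        self.cache = {}         # block number -> cached value
        self.members = members  # shared list of all nodes in the cluster
        members.append(self)

    def read(self, n):
        if n not in self.cache:
            self.cache[n] = self.device.blocks[n]
        return self.cache[n]

    def write(self, n, value):
        # Write through to the shared disk...
        self.device.blocks[n] = value
        self.cache[n] = value
        # ...and make every other node forget its stale copy.
        for peer in self.members:
            if peer is not self:
                peer.cache.pop(n, None)


disk = BlockDevice(4)
members = []
node_a = ClusterNode(disk, members)
node_b = ClusterNode(disk, members)

node_b.read(0)                       # node_b caches the old value
node_a.write(0, node_a.read(0) + 1)  # invalidates node_b's copy
node_b.write(0, node_b.read(0) + 1)  # node_b re-reads from disk first

print(disk.blocks[0])  # prints 2
print(node_a.read(0))  # prints 2
```

Both updates survive, and both nodes still serve repeated reads from their own cache until another node changes the block.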

The main problem with cluster filesystems is their requirement that all nodes connect to the same block device. This can be done either with a SAN (which can be very expensive), or with iSCSI (which requires some effort and network device support to get the best performance out of an iSCSI network).

Distributed Filesystems

This is when a filesystem is spread over several network nodes and accessed by multiple client nodes; from the client’s point of view, it’s one huge filesystem …

The beauty of distributed filesystems is that they can achieve very large capacities and very high I/O throughput, along with a very high level of redundancy, using cheap commodity hardware, but this comes at the expense of system complexity and maintainability …

FS - Distributed FS

Examples of distributed filesystems: GFS (Google File System), Lustre (Sun – Open Source), PVFS2 (Open Source Project) …

How it works

  • Multiple host nodes, each running its own kernel, and each with access only to its own block device(s)
  • One or more master nodes, responsible for request dispatching, system status, and synchronization
  • Multiple clients access the distributed filesystem
  • A client mounts the logical filesystem locally
  • Access requests are served by different nodes based on the topology defined by the master node(s)
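The roles above can be sketched with a toy model. Everything here is an assumption made for illustration (the `StorageNode`, `MasterNode` and `Client` classes, the 4-byte chunk size, the round-robin placement); it is not the design of Google File System, Lustre or PVFS2, and it leaves out replication entirely. The master only knows where chunks live; the data itself sits on the storage nodes.

```python
class StorageNode:
    """Hypothetical host node: stores chunks on its own block device only."""

    def __init__(self, name):
        self.name = name
        self.chunks = {}  # (path, chunk index) -> bytes


class MasterNode:
    """Hypothetical master: keeps the layout, never the file data."""

    def __init__(self, nodes):
        self.nodes = nodes  # node name -> StorageNode
        self.layout = {}    # path -> list of node names, one per chunk

    def place(self, path, nchunks):
        # Round-robin placement across all storage nodes.
        names = sorted(self.nodes)
        self.layout[path] = [names[i % len(names)] for i in range(nchunks)]
        return self.layout[path]

    def locate(self, path):
        return self.layout[path]


class Client:
    """Hypothetical client: sees one filesystem, talks to many nodes."""

    CHUNK_SIZE = 4  # bytes; tiny on purpose

    def __init__(self, master):
        self.master = master

    def write(self, path, data):
        size = self.CHUNK_SIZE
        chunks = [data[i:i + size] for i in range(0, len(data), size)]
        placement = self.master.place(path, len(chunks))
        for index, (name, chunk) in enumerate(zip(placement, chunks)):
            self.master.nodes[name].chunks[(path, index)] = chunk

    def read(self, path):
        placement = self.master.locate(path)
        return b"".join(
            self.master.nodes[name].chunks[(path, index)]
            for index, name in enumerate(placement)
        )


nodes = {name: StorageNode(name) for name in ("node0", "node1", "node2")}
master = MasterNode(nodes)
client = Client(master)

client.write("/data/file", b"hello world!")
round_trip = client.read("/data/file")
```

The 12-byte file ends up as three chunks on three different nodes, yet the client reads it back as a single file, which is where the capacity and throughput of this model come from.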

Note: there are two GFSs mentioned above. Even though the names are alike, they’re very different: GFS (Global File System – Red Hat – Open Source with commercial support) and GFS (Google File System – Closed Source – Internal to Google operations). The latter isn’t available to outside users; it’s developed and used purely by Google…

