
File Systems

PCQ Bureau

A file system is the means of locating information stored on a computer system. Files, as we know, are stored on media of various kinds (disks, tapes, cards and solid-state memory), and each type of media involves its own paradigm of access: sectors, tracks or grid points. Put simply, the file system tells software where a particular piece of information can be found, a sort of GPS for that media if you like. File systems have been around since the first storage-capable computers and have been evolving continuously. Along the way, a number of different types, and variations on the same methods of storage, have arisen. In everyday use, we come across and document only a tiny fraction of all the file systems that exist (see the box for a bigger list).



While the modern versions of file systems we already know, like NTFS or Ext3, do not differ significantly from their older editions in the number of new features, they do bring a number of value additions. For instance, Ext3 added journaling, which improves data-recovery prospects in the event of a crash. In this article, we'll look at some of these enhancements, as well as some new kids on the block.

Direct Hit!

Applies to: Everyone

USP: Insights into the inner workings of the latest file systems 

Primary Link:

http://en.wikipedia.org/wiki/File_system   

Google keywords:

modern file systems 

Ext3



For users of Red Hat, Fedora, PCQ Linux and Debian GNU/Linux, this file system holds little surprise as it is the default on their systems. Ext3 is a 'journaling file system' (JFS). A JFS logs changes to the data separately before actually committing them to the file or the structure. This means such changes can be tracked or rolled back, just as in a database. Such logging can be either metadata-only (where only the actions are logged) or comprehensive (where even the data is logged). As you can guess, metadata-only journals are faster and have lighter overheads, and this is what Ext3 uses by default.
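To make the log-first idea concrete, here is a minimal sketch of a metadata-only journal. It is a toy model and not Ext3's actual on-disk format: the class name, the record layout and the in-memory dictionary standing in for the file system structure are all assumptions made for illustration.

```python
import json


class JournaledStore:
    """Toy metadata-only journal: log the intended change, then apply it."""

    def __init__(self):
        self.journal = []    # pending log records (would live on disk)
        self.metadata = {}   # file name -> size, standing in for the real structure

    def log(self, op, name, size=None):
        # Step 1: record the intent before touching the real structure.
        self.journal.append(json.dumps({"op": op, "name": name, "size": size}))

    def commit(self):
        # Step 2: apply every logged change, then clear the journal.
        for record in self.journal:
            self._apply(json.loads(record))
        self.journal.clear()

    def recover(self):
        # After a crash, replaying the journal completes interrupted changes.
        self.commit()

    def _apply(self, entry):
        if entry["op"] == "create":
            self.metadata[entry["name"]] = entry["size"]
        elif entry["op"] == "delete":
            self.metadata.pop(entry["name"], None)


store = JournaledStore()
store.log("create", "report.txt", 1024)
# Imagine a crash here: the change is logged but not yet applied.
pending_before = len(store.journal)
store.recover()
```

Only the action and the file's attributes are logged here, never the file contents, which is what keeps a metadata-only journal light.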



Also new in Ext3 is its rather large addressability, which has swelled to 8 TB. By quick comparison, NTFS in Windows Server 2003 has a 16 TB maximum. Ext3 now allows 'block reservation', where portions of storage are reserved in advance; this improves data organization on the disk and results in faster access. Previous editions of Ext3 did not support SATA disks, a problem solved in the kernel 2.6 version of the file system.
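Ceilings like these usually fall out of simple arithmetic: the number of blocks the file system can address multiplied by the size of each block. The sketch below assumes 4 KB blocks and block numbers of 31 and 32 usable bits respectively; those figures are our assumption for illustration and are not stated by either vendor here.

```python
def max_volume_bytes(block_size, address_bits):
    """Largest volume addressable with the given block size and block-number width."""
    return block_size * (2 ** address_bits)


TIB = 2 ** 40  # bytes in one binary terabyte

# Assumed figures: 4 KB blocks, 31 usable address bits vs. 32 usable bits.
ext3_limit = max_volume_bytes(4096, 31) // TIB
ntfs_limit = max_volume_bytes(4096, 32) // TIB
print(ext3_limit, ntfs_limit)
```

Under these assumptions, one extra address bit is all that separates the two limits.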

GmailFS



Gmail is an e-mail service from Google, and we have said lots about it in our past issues. But wait a minute, we suffixed that with “FS”? Yes, for some time now, programmers have been dying to tap into the luxurious 2 GB-plus space Gmail offers, for uses other than purely e-mail. One such brilliant solution is GmailFS from Richard Jones (http://richard.jones.name/). Richard has not only managed to find out what's under the hood of the Gmail system, but also how to use it in novel ways, including installing a copy of Linux on it. He says it is just a mountable Linux file system. He has combined the FUSE (Filesystem in Userspace, http://fuse.sourceforge.net) project with some Python code and used a custom library, 'libgmail', to communicate with Gmail itself. GmailFS lets you perform normal file system operations on your Gmail account, minus your regular e-mail, which remains invisible in this interface. In Windows, you can also mount this as a special drive and synchronize files: a very good alternative to carrying around those pesky, low-capacity USB drives, provided you have the bandwidth.
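The general idea of storing files as mail can be sketched without touching Gmail at all. The toy below is not GmailFS code and makes no FUSE or libgmail calls: the class, the subject-line prefix and the in-memory list standing in for the mailbox are all hypothetical, chosen only to show how file operations might map onto messages while regular mail stays hidden.

```python
class MailBackedStore:
    """Toy model: each file is one 'message' whose subject carries the path."""

    PREFIX = "FSFILE:"  # hypothetical marker separating file messages from real mail

    def __init__(self):
        self.mailbox = []  # stand-in for a remote mail account

    def write(self, path, data):
        # Replace any older message for this path, then store the new one.
        subject = self.PREFIX + path
        self.mailbox = [m for m in self.mailbox if m["subject"] != subject]
        self.mailbox.append({"subject": subject, "body": data})

    def read(self, path):
        for message in self.mailbox:
            if message["subject"] == self.PREFIX + path:
                return message["body"]
        raise FileNotFoundError(path)

    def listdir(self):
        # Ordinary e-mail lacks the prefix, so it never shows up as a file.
        return sorted(
            m["subject"][len(self.PREFIX):]
            for m in self.mailbox
            if m["subject"].startswith(self.PREFIX)
        )


fs = MailBackedStore()
fs.mailbox.append({"subject": "Lunch on Friday?", "body": "regular mail"})
fs.write("/notes.txt", b"hello")
fs.write("/notes.txt", b"hello again")
```

A real implementation would also have to cope with message size limits and network latency, which this sketch ignores.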



Global FS (GFS)

Uncommon file systems

ODS-5, or Files-11, is used by HP's OpenVMS.

CIFS, or Common Internet File System, is the NetBIOS-based system that enables file-sharing software (like Samba) to communicate.

Lustre is an open source clustering file system for installations that need to address petabytes of storage.

BeFS is the file system for BeOS. It is fully 64-bit capable and journaling, ideal for use on hard disks or CD-ROMs, but has a high overhead of 2 MB that renders it useless on small disks like floppies.

HFS+, the Mac OS X successor to HFS, is 64-bit and uses a 32-bit mapping table and B-trees, making way for faster searches and higher storage per file.

Another Red Hat contribution, this file system is meant for clustering servers in an enterprise. It lets you combine storage space across networked storage such as SAN and NAS boxes and use it as one. An administrative console lets the administrator configure and manage this space. Red Hat says that with GFS you can add or remove entire servers from the cluster simply by mounting and unmounting them, as with regular disk partitions. However, this is not a native feature of Linux, and you need to acquire the 'Red Hat Cluster Suite' to use it. GFS is supported on various 64-bit platforms as well and supports up to 256 nodes. Where would you use such a thing? GFS can be used in data centers, high-end Web servers and high-performance computing clusters.

GNOME storage



Close on the heels of the WinFS system in Longhorn, people at GNOME are working on a similar project called 'GNOME Storage' (http://www.gnome.org/~seth/storage). This is supposed to work in a human-friendly fashion and help us locate our files even faster. According to a document on that website, "We think of objects similarly to 'My roommate's desk' not 'earth.us.stanford.dorm5.room109.desk2', and it makes sense to enable access in this manner rather than the conventional filesystem hierarchy." This echoes the earliest drafts from Microsoft about WinFS. Another eerie echo is the statement that files will be accessible from anywhere using SQL99 commands.
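The 'roommate's desk' idea amounts to finding objects by their attributes with a query rather than walking a path. Here is a minimal sketch using Python's built-in sqlite3 module; the table and column names are our own invention for illustration and do not reflect the actual GNOME Storage schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (name TEXT, owner TEXT, kind TEXT, place TEXT)")
conn.executemany(
    "INSERT INTO objects VALUES (?, ?, ?, ?)",
    [
        ("desk2", "roommate", "desk", "room109"),
        ("desk1", "me", "desk", "room109"),
        ("thesis.txt", "me", "document", "desk1"),
    ],
)

# "My roommate's desk" becomes a query on attributes, not a path.
rows = conn.execute(
    "SELECT name FROM objects WHERE owner = ? AND kind = ?",
    ("roommate", "desk"),
).fetchall()
print(rows)
```

The caller never needs to know where in a hierarchy the desk lives, only what describes it.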


Of course, Seth says, this system will be backward compatible with existing applications (through 'GnomeVFS', a virtualization layer that pretends things are still Ext3 or whatever). The new schema will be exposed to new applications in an object-oriented get/put fashion. Planned graphical innovations include simple sliders in file explorers that let you drag a point to see all historical versions of a document. Well, what do we say? It all appears very interesting, including the delicious-looking noodles on one of the FAQs Seth has written about it.
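A get/put store that remembers every version is easy to picture in code. The sketch below is purely illustrative: the class and method names are hypothetical and are not taken from GNOME Storage; it only shows how a history slider could map onto a list of saved versions.

```python
class VersionedStore:
    """Toy object store that keeps every version written with put()."""

    def __init__(self):
        self._history = {}  # object name -> list of versions, oldest first

    def put(self, name, content):
        self._history.setdefault(name, []).append(content)

    def get(self, name, version=-1):
        # version=-1 is the newest; 0 is the oldest, like dragging a slider left.
        return self._history[name][version]

    def versions(self, name):
        return len(self._history[name])


docs = VersionedStore()
docs.put("letter", "Dear Sir")
docs.put("letter", "Dear Madam")
docs.put("letter", "Dear Editor")
```

Calling docs.get("letter") returns the latest text, while docs.get("letter", 0) steps back to the first draft.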

WinFS



This was supposed to be out by now! We covered it in an earlier article (Getting ready for Longhorn, January 2004). But it now appears that a lot of the features planned at that time were, to say the least, very difficult to implement in reality. Or maybe things worked so well they felt there wouldn't be enough vulnerabilities to supply patches for later.

From what scant documentation exists, we gather that WinFS will have NTFS as its base file system. On top of that would sit an indexing system managed by SQL Server 2005, which would help locate data using various attributes. Why SQL Server? Because you can use SQL commands to access the information it holds. Besides, a rudimentary edition of SQL Server's desktop cousin, MSDE, has been running our NTFS storage since Windows 2000. Microsoft plans to add a huge quantity of metadata to describe a 'file' under WinFS. Third-party developers would be free to come up with their own schemas and extend the metadata. WinFS is supposed to automatically extend its indexing to all the new information as well. More on WinFS at http://msdn.microsoft.com/data/winf


Embedded file systems



Actually, that is not the name of a file system but just the category. Two are of interest here: Solid FS and Reliance FS. The second has nothing to do with the Indian multi-industry enterprise of the same name. Solid FS comes from a vendor called 'Eldos' and features encryption and compression. The entire Solid FS code fits into 100 KB, which shows that this FS engine was written with low-capacity embedded platforms in mind. Solid FS is also a royalty-free engine and can readily be used with almost any flavor of Windows (NT/2000/XP/2003), UNIX and Palm OS.

Encryption is offered on both a per-stream (flow of data) and per-file basis, using any one of AES, SHA or custom 256-bit systems. Journaling and self-repair capability are also part of the system. This is also perhaps the only FS to offer on-the-fly resizing (a requirement on embedded systems) and defragmentation. Strangely for its market, Solid FS claims to scale up to 128 TB per file and 256 TB on the whole. We would imagine an embedded system storing 256 TB to be too far off in the future to even contemplate. More at http://www.eldos.com/solfs.

Reliance FS is a 32-bit only file system that uses a transactional system (see the Ext3 discussion above). According to its developer (datalight.com), Reliance FS is as powerful as the FAT file system and uses a proprietary data storage format. This, however, means that it does not have a journaling system like Solid FS. Reliance also works with SCSI and ATA devices besides solid-state storage, and requires about 100 KB of RAM to run. Each disk can be from 64 KB to 2 TB in size.

Sujay V. Sarma
