A file system is the means of locating information stored on a computer system. Files are stored on media of various kinds (disks, tapes, cards and solid-state memory), and depending on the type of media, access follows different paradigms: sectors, tracks or grid points. Put simply, the file system tells software where a particular piece of information can be found, a sort of GPS for that media, if you like. File systems have been around since the first storage-capable computers and have been evolving continuously. Over that time, many different types, and many variations on the same method of storage, have arisen. In everyday use, we come across and document only a tiny fraction of the file systems that exist (see the box for a bigger list).
While the modern versions of file systems we already know, such as NTFS or Ext3, do not differ dramatically in features from their older editions, they do bring a number of value additions. For instance, Ext3 added journaling, which improves data-recovery prospects in the event of a crash. In this article, we look at some such enhancements, as well as at some new kids on the block.
Ext3
For users of Red Hat, Fedora, PCQ Linux and Debian GNU/Linux, this file system holds few surprises, as it is the default on their systems. Ext3 is a 'journaling file system'. Such a file system logs changes made to the data separately before actually committing the change to the file or the structure. This means such changes can be tracked or rolled back, just like in a database. Such logging can be either metadata-only (where only the changes to file system structures are logged) or comprehensive (where the data itself is logged too). As you can guess, metadata-only journals are faster and have lighter overheads, and this is what Ext3 uses by default.
Also new in Ext3 is its rather large addressability, which has swelled to 8 TB. By quick comparison, NTFS on Windows 2003 has a 16 TB maximum. Ext3 now allows 'block reservation', where portions of storage can be reserved in advance; this improves data organization on disk and results in faster access. Previous editions of Ext3 did not support SATA disks, a problem solved in the kernel 2.6 version of the file system.
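To make the idea of a metadata-only journal concrete, here is a minimal Python sketch. It is a toy in-memory model, not Ext3's actual on-disk format or algorithm; the class `JournaledStore` and all of its method names are hypothetical and exist only for illustration.

```python
class JournaledStore:
    """Toy in-memory 'disk' with a metadata-only journal (hypothetical model)."""

    def __init__(self):
        self.data_blocks = {}  # block number -> bytes (never journaled)
        self.metadata = {}     # file name -> list of block numbers
        self.journal = []      # metadata change records, oldest first

    def write_file(self, name, blocks, crash_before_commit=False):
        # 1. File data goes straight to the "disk"; only metadata is logged.
        for number, payload in blocks.items():
            self.data_blocks[number] = payload
        # 2. Log the intended metadata change before applying it.
        record = {"name": name, "blocks": sorted(blocks), "committed": False}
        self.journal.append(record)
        if crash_before_commit:
            return  # simulate a power cut before the commit mark is written
        # 3. Mark the record committed, then apply it to the real metadata.
        record["committed"] = True
        self._apply(record)

    def _apply(self, record):
        self.metadata[record["name"]] = record["blocks"]

    def recover(self):
        """Replay committed records, drop uncommitted ones, empty the journal."""
        discarded = []
        for record in self.journal:
            if record["committed"]:
                self._apply(record)  # replaying is harmless if already applied
            else:
                discarded.append(record["name"])
        self.journal.clear()
        return discarded


store = JournaledStore()
store.write_file("a.txt", {0: b"hello"})
store.write_file("b.txt", {1: b"world"}, crash_before_commit=True)
print(store.recover())   # names whose metadata change was rolled back
print(store.metadata)    # only fully committed files are visible
```

After recovery, the half-written file simply does not exist as far as the metadata is concerned, although its data block is still sitting on the toy disk. That is the trade-off of logging metadata only: the structure stays consistent, but the data itself is not protected by the journal.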
GmailFS
Gmail is an e-mail service from Google, and we have said a lot about it in past issues. But wait a minute, we suffixed that with “FS”? Yes, for some time now, programmers have been dying to tap into the luxurious 2 GB-plus space Gmail offers, for uses other than e-mail. One such brilliant solution is GmailFS from Richard Jones (http://richard.jones.name/). Richard has not only worked out what is under the hood of the Gmail system, but also how to use it in novel ways, including installing a copy of Linux on it. He describes it as simply a mountable Linux file system. He has combined the FUSE (Filesystem in Userspace, http://fuse.sourceforge.net) project with some Python code, and used a custom library, 'libgmail', to communicate with Gmail itself. GmailFS lets you perform normal file system operations on your Gmail account; your regular e-mail remains invisible in this interface. In Windows, you can also mount this as a special drive and synchronize files, a very good alternative to carrying around those pesky, low-capacity USB drives, provided you have the bandwidth.
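The general trick of layering a file system on a mailbox can be sketched in a few lines of Python. This is not GmailFS's actual code or message layout, and it does not use FUSE or libgmail at all: `FakeMailbox`, `MailFS`, the `fs:` subject prefix and the tiny `CHUNK_SIZE` are all assumptions made up for this illustration.

```python
import base64

CHUNK_SIZE = 8  # deliberately tiny so the chunking is visible (assumed value)


class FakeMailbox:
    """In-memory stand-in for a mail account; not a real Gmail API."""

    def __init__(self):
        self.messages = []  # list of (subject, body) tuples

    def send(self, subject, body):
        self.messages.append((subject, body))

    def search(self, prefix):
        return [(s, b) for s, b in self.messages if s.startswith(prefix)]


class MailFS:
    """Stores each file as a run of messages, one message per chunk."""

    def __init__(self, mailbox):
        self.mailbox = mailbox

    def write(self, path, data):
        # Limitation of this sketch: paths must not contain ':' and
        # rewriting an existing path is not handled.
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            subject = f"fs:{path}:{offset // CHUNK_SIZE:06d}"
            self.mailbox.send(subject, base64.b64encode(chunk).decode("ascii"))

    def read(self, path):
        # Zero-padded chunk numbers make a plain sort restore the order.
        parts = sorted(self.mailbox.search(f"fs:{path}:"))
        return b"".join(base64.b64decode(body) for _, body in parts)

    def listdir(self):
        return sorted({s.split(":")[1] for s, _ in self.mailbox.search("fs:")})


mailbox = FakeMailbox()
mailbox.send("Lunch?", "See you at one.")  # ordinary mail, ignored by MailFS
fs = MailFS(mailbox)
fs.write("notes.txt", b"hello gmail file system")
print(fs.listdir())
print(fs.read("notes.txt"))
```

Because the file system only ever looks at messages carrying its own subject prefix, ordinary mail stays out of the listing, which mirrors the way regular e-mail remains invisible in the GmailFS interface.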
Global FS (GFS)
Another Red Hat contribution, this file system is meant for clustering servers in an enterprise. It lets you combine storage space across networked storage such as SAN and NAS boxes and use it as one. An administrative console lets the administrator configure and manage this space. Red Hat claims that with GFS you can add or remove entire servers from the cluster simply by mounting and unmounting them, as with regular disk partitions. However, this is not a native feature of Linux, and you need to acquire the 'Red Hat Cluster Suite' to use it. GFS is also supported on various 64-bit platforms and supports up to 256 nodes. Where would you use such a thing? In data centers, high-end Web servers and high-performance computing clusters.
GNOME storage
Close on the heels of the WinFS system in Longhorn, people at GNOME are working on a similar project called 'GNOME Storage' (http://www.gnome.org/~seth/storage). It is supposed to work in a human-friendly fashion and help us locate our files faster. According to a document on that website, 'We think of objects similarly to 'My roommate's desk' not 'earth.us.stanford.dorm5.room109.desk2', and it makes sense to enable access in this manner rather than the conventional filesystem hierarchy.' This echoes the earliest drafts from Microsoft about WinFS. Another eerie echo is the statement that files will be accessible from anywhere using SQL99 commands.
Of course, Seth says, the system will be backward compatible with existing applications (through 'GnomeVFS', a virtualization layer that pretends things are still Ext3 or whatever). The new schema will be exposed to new applications in an object-oriented get/put fashion. Planned graphical innovations include simple sliders in file explorers that let you drag a point to see all historical versions of a document. Well, what do we say? It all appears very interesting, including the delicious-looking noodles on one of the FAQs Seth has written about it.
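As a rough illustration of what a get/put interface with document history might look like, consider the Python sketch below. It is not GNOME Storage's API; `VersionedStore` and its methods are hypothetical names, and everything is held in memory purely for the sake of the example.

```python
class VersionedStore:
    """Toy object store that keeps every version ever put (hypothetical API)."""

    def __init__(self):
        self._history = {}  # object name -> list of values, oldest first

    def put(self, name, value):
        """Store a new version and return its version number (0-based)."""
        self._history.setdefault(name, []).append(value)
        return len(self._history[name]) - 1

    def get(self, name, version=-1):
        """Return the requested version; the default -1 means the latest."""
        return self._history[name][version]

    def versions(self, name):
        """Number of stored versions, i.e. the length of the 'slider'."""
        return len(self._history.get(name, []))


store = VersionedStore()
store.put("My roommate's desk", "empty")
store.put("My roommate's desk", "covered in noodles")
print(store.get("My roommate's desk"))             # latest version
print(store.get("My roommate's desk", version=0))  # drag the slider back
```

A history slider in a file explorer would then amount to calling `get` with whichever version number the slider position maps to.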
WinFS
This was supposed to be out by now! We covered it in an earlier article (Getting ready for Longhorn, January 2004). But it now appears that a lot of the features planned at that time were, to say the least, very difficult to implement in reality. Or maybe things worked so well they felt there wouldn't be enough vulnerabilities to supply patches for later.
From what scant documentation exists, we gather that WinFS will have NTFS as its base file system. On top of that would sit an indexing system managed by SQL Server 2005, which would help locate data using various attributes. Why SQL Server? Because you can use SQL commands to access the information it holds. Besides, a rudimentary edition of SQL Server's desktop cousin, MSDE, has been running our NTFS storage since Windows 2000. Microsoft plans to add a huge quantity of metadata to describe a 'file' under WinFS. Third-party developers would be free to come up with their own schemas and extend the metadata. WinFS is supposed to automatically extend its indexing to all the new information as well. More on WinFS at http://msdn.microsoft.com/data/winf
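The appeal of putting a SQL index over file metadata is easy to show with a small Python sketch. It uses Python's built-in sqlite3 module as a stand-in, not SQL Server, and it is in no way WinFS's real schema: the `items` and `attributes` tables and the helper functions `add_item` and `find` are assumptions invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, path TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE attributes ("
    " item_id INTEGER REFERENCES items(id),"
    " name TEXT NOT NULL,"
    " value TEXT NOT NULL)"
)


def add_item(path, **attrs):
    """Register a file and attach arbitrary name/value metadata to it."""
    cur = conn.execute("INSERT INTO items (path) VALUES (?)", (path,))
    conn.executemany(
        "INSERT INTO attributes (item_id, name, value) VALUES (?, ?, ?)",
        [(cur.lastrowid, key, str(val)) for key, val in attrs.items()],
    )
    return cur.lastrowid


def find(name, value):
    """Return the paths of all files whose attribute `name` equals `value`."""
    rows = conn.execute(
        "SELECT i.path FROM items i"
        " JOIN attributes a ON a.item_id = i.id"
        " WHERE a.name = ? AND a.value = ?"
        " ORDER BY i.path",
        (name, value),
    )
    return [row[0] for row in rows]


add_item("C:/docs/budget.xls", author="Asha", kind="spreadsheet")
add_item("C:/docs/report.doc", author="Asha", kind="document")
add_item("C:/music/song.mp3", artist="Unknown", kind="audio")
print(find("author", "Asha"))
```

Storing attributes as free-form name/value rows is one simple way to let third parties add their own metadata without changing the table layout; a real system would more likely use typed schemas.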
Embedded file systems
Actually, this is not the name of a file system but a category. Two entries here are of interest: Solid FS and Reliance FS. (The second has nothing to do with the Indian multi-industry enterprise of the same name.) Solid FS comes from a vendor called 'Eldos' and features encryption and compression. The entire Solid FS code fits into 100 KB, which shows that this FS engine was written with low-capacity embedded platforms in mind. Solid FS is also a royalty-free engine and can readily be used with almost any flavor of Windows (NT/2000/XP/2003), UNIX and Palm OS.
Encryption is offered on both a per-stream (flow of data) and a per-file basis, in any one of AES, SHA or custom 256-bit systems. Journaling and self-repair capabilities are also part of the system. This is also perhaps the only FS to offer on-the-fly resizability (a requirement on embedded systems) and defragmentation. Strangely for its market, Solid FS claims to scale up to 128 TB per file and 256 TB on the whole. We would imagine an embedded system storing 256 TB to be too far off in the future to even contemplate. More at http://www.eldos.com/solfs.
Reliance FS is a 32-bit-only file system that uses a transactional system (see the Ext3 discussion above). According to its developer (datalight.com), Reliance FS is as powerful as the FAT file system and uses a proprietary data storage format. This, however, means that it does not have a journaling system like Solid FS. Reliance also works with SCSI and ATA devices besides solid-state storage, and requires about 100 KB of RAM to run. Each disk can be between 64 KB and 2 TB in size.
Sujay V. Sarma