From this month, we’re starting a series on technologies introduced or enhanced in Win 2K/XP. Also on offer are tips ‘n’ tricks to help you use your system to its fullest potential. Let’s start with NTFS, the native file system of Win 2K/XP.
A brief history
During the days of DOS (Disk Operating System), there was only one file system: FAT, short for File Allocation Table, the table that records where each file’s data is stored on the disk.
FAT carried on into the world of Win 3.1 and even Win 95. However, with Win NT, a new and more robust file system called NTFS (New Technology File System) was introduced. Later, Win 95 OSR2 and Win 98 introduced FAT32, a more efficient FAT file system for large hard disks. Surprisingly, Win NT 4 couldn’t read FAT32 volumes, and Win 98 couldn’t read NTFS.
Win 2K released NTFS 5.0 with new enhancements to it. And Win XP goes a step further to give us NTFS 5.1.
What’s new?
Compared to Win NT 4.0, Win 2K made a lot of enhancements to NTFS. The main ones were encryption, quotas, mounting, sparse files, compression and hard links. NTFS is also more robust than FAT because its transaction log makes the file system recoverable after a failure.
Master File Table
The MFT (Master File Table) is the equivalent of the File Allocation Table in the FAT file system. This is the part of the file system that contains information about the files and directories stored on the disk. The MFT is nothing but a file itself and is actually a relational database containing files as records and their attributes as the columns in the database.
When a partition is formatted as NTFS, space is allocated for the MFT (called the MFT Zone) and some ‘metadata files’ (which we’ll take a quick look at shortly). The MFT Zone is by default allocated 12.5% of the disk space on the volume. This is to prevent fragmentation of the MFT, which can slow down file retrievals and cause other errors if not rectified. The MFT Zone is never used to store data unless the rest of the disk becomes full. The zone can get used up in two scenarios: one, when there are a large number of small files, in which case the MFT itself fills the zone with file entries; or two, when very large files fill up the unreserved space quickly and spill over into the zone.
The MFT does not work alone. It relies on a set of ‘metadata files’ that describe the file system itself. Since these and the MFT itself are also files, information about them is also stored in the MFT! The first 16 records of the MFT store data about the MFT file and the metadata files.
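To get a feel for how much space the default reservation works out to, here is a small Python sketch of the arithmetic. The function name and the sample volume sizes are made up for illustration; only the 12.5% figure comes from the text above.

```python
GB = 1024 ** 3


def mft_zone_bytes(volume_bytes, reserved_fraction=0.125):
    """Return the bytes set aside for the MFT Zone on a volume.

    reserved_fraction defaults to the 12.5% mentioned in the article.
    """
    return int(volume_bytes * reserved_fraction)


# Hypothetical volume sizes, chosen only to show the scale involved.
for size_gb in (10, 40, 80):
    zone = mft_zone_bytes(size_gb * GB)
    print(f"{size_gb} GB volume: {zone / GB:.2f} GB reserved for the MFT Zone")
```

On an 80 GB volume, for example, this reserves 10 GB, which is why the zone matters once a disk starts filling up.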
Name | File | Record | Description |
Master File Table | $Mft | 0 | The actual MFT file |
MFT Mirror | $MftMirr | 1 | A mirror of the MFT in case of MFT failure |
Log | $Logfile | 2 | The transaction log that NTFS uses for recoverability |
Volume | $Volume | 3 | Volume label and version info |
Attributes | $AttrDef | 4 | List of attributes and descriptions |
Root File | . | 5 | Root folder |
Cluster Bitmap | $Bitmap | 6 | Map of used and free clusters on the volume |
Boot Sector | $Boot | 7 | The boot sector pointer as well as any extra |
Bad Clusters | $BadClus | 8 | List of bad clusters |
Security | $Secure | 9 | Security descriptors for every file |
UpCase chars | $Upcase | 10 | Lowercase to uppercase conversion table (Unicode) |
NTFS Extensions | $Extend | 11 | Optional extensions like quotas, reparse points, etc, stored here |
- | - | 12-15 | Reserved but unused |
The locations of the MFT file and its backup, the MFT Mirror file, are stored in the boot sector. If NTFS is unable to read (or make sense of) the MFT, it reads the MFT Mirror and recreates the MFT from this backup. In Win XP, the transaction logfile and the volume bitmap are placed at a different location than in Win 2K, which is meant to yield a performance benefit of about 8%. So if you’re upgrading from Win 2K to Win XP, it makes sense to recreate and reformat your partition with the new NTFS structure to get the advantage.
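For the curious, here is a rough Python sketch of how those two locations can be read out of a boot sector. It assumes the commonly documented NTFS boot sector layout (bytes per sector at offset 0x0B, sectors per cluster at 0x0D, and the starting clusters of $Mft and $MftMirr at 0x30 and 0x38). The function name and the sample values are invented for illustration, and the sketch works on a synthetic sector built in memory rather than on a real disk.

```python
import struct


def parse_ntfs_boot_sector(sector):
    """Extract cluster size and MFT locations from a 512-byte boot sector."""
    if len(sector) < 512:
        raise ValueError("need at least 512 bytes")
    if sector[3:11] != b"NTFS    ":
        raise ValueError("not an NTFS boot sector")
    # Assumed layout: 2-byte bytes-per-sector, then 1-byte sectors-per-cluster.
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", sector, 0x0B)
    # Assumed layout: two 8-byte cluster numbers for $Mft and $MftMirr.
    mft_cluster, mirror_cluster = struct.unpack_from("<QQ", sector, 0x30)
    cluster_size = bytes_per_sector * sectors_per_cluster
    return {
        "cluster_size": cluster_size,
        "mft_offset": mft_cluster * cluster_size,
        "mft_mirror_offset": mirror_cluster * cluster_size,
    }


# Build a fake boot sector with made-up values (not taken from a real volume).
sample = bytearray(512)
sample[3:11] = b"NTFS    "
struct.pack_into("<HB", sample, 0x0B, 512, 8)    # 512-byte sectors, 8 per cluster
struct.pack_into("<QQ", sample, 0x30, 786432, 2)  # hypothetical start clusters

info = parse_ntfs_boot_sector(bytes(sample))
print(info)
```

The byte offsets returned are measured from the start of the volume. Reading a real volume needs administrator rights and more careful validation than this sketch attempts.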
Hard links
Hard links arrived with the Win 2K enhancements listed earlier, and Win XP makes them easy to create from the command line. They are directory entries for files on your system that do not duplicate the file itself. If you know Unix/Linux, you know exactly what these are. Unlike shortcuts, which are new, separate files that ‘redirect’ to the file they point to, hard links have no physical file of their own, only a directory entry.
You can use a hard link to give a file a second name in the same directory, or to make a file that lives in one folder appear in another. Hard links can be used to present different views of your files or to give you an easy-to-reach way of accessing a particular file. To create a hard link on a Win XP machine, use the following command:
fsutil hardlink create <new filename> <existing filename>
You can now use the new filename as if it were the original itself. There are a few things to remember, however. Both names point to the same data, so a change made through either name shows up through the other. The size and date shown in a folder listing, though, may not be refreshed for the other names until the file is next opened through them. Deleting any one name, whether the hard link or the original, removes only that directory entry; the file itself stays on disk until the last name pointing to it is deleted.
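If you would rather experiment from a script than from fsutil, the sketch below uses Python’s standard os.link call, which creates hard links on file systems that support them (NTFS included). The file names are made up, and everything happens inside a temporary folder, so nothing on your system is touched.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "report.txt")      # hypothetical file name
    link = os.path.join(folder, "report-link.txt")     # hypothetical link name

    with open(original, "w") as f:
        f.write("first draft\n")

    # Create a second directory entry for the same file data.
    os.link(original, link)

    # Append through the link, then read back through the original name.
    with open(link, "a") as f:
        f.write("edited through the link\n")
    with open(original) as f:
        content_via_original = f.read()

    same_file = os.path.samefile(original, link)
    link_count = os.stat(original).st_nlink

    # Remove the first name; the data is still reachable through the second.
    os.remove(original)
    with open(link) as f:
        content_after_delete = f.read()

print("same file:", same_file)
print("names pointing at the data:", link_count)
print("still readable after removing the first name:",
      content_after_delete == content_via_original)
```

The sketch treats both names as equals: the file system keeps a count of names for the data and frees the space only when that count drops to zero.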
Sparse files
Sparse files were introduced in Win 2K. They let you store a file that has a large nominal size while using disk space only for the parts that actually contain data, which other file systems cannot do.
To make things clearer, imagine a download utility like GetRight or Morpheus. Both these programs allow you to download files from the Net. And both allow multi-part downloads, which means that they have threads that connect to different servers, download different parts of the file and store them in the same file on your hard disk.
Now if you are downloading a large installation file (say a new OS or a CD ISO image, which typically takes up hundreds of MB of disk space), the programs allocate a file of the final size beforehand, so that they can write to different sections of it. Take a look at the diagram below to understand what I mean.
The diagram shows a file that takes up 1GB on a hard disk. The space marked as ‘used’ is the only data that has been written by the application. In fact, if you’ve used the ‘Segmented Download’ feature in GetRight, you’d have seen this sort of progress bar many times. Unfortunately, if you’re using an older file system like FAT/FAT32 or NTFS 4.0, the space for the file on the hard disk would be fully allocated. That is, you’d lose 1GB immediately from the drive.
Using sparse files, you can reduce the amount of disk space used to only the parts that actually have data in them. By using the ‘fsutil sparse’ set of commands, you can mark a file as sparse. The modified diagram for a sparse file of 1GB looks like the one below.
Here, although the file size is supposed to be 1GB, only the space that contains data is actually allocated. The rest of that space is free for the file system to use.
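The saving is easy to estimate with a little arithmetic. The Python sketch below is a simplified model, not a description of how NTFS really allocates space: it assumes 4KB clusters, assumes a sparse file occupies only the clusters that hold written data, and uses a made-up download with four segments of 10MB each.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def clusters_touched(written_ranges, cluster_size):
    """Return the set of cluster indexes holding at least one written byte."""
    touched = set()
    for start, length in written_ranges:
        if length <= 0:
            continue
        first = start // cluster_size
        last = (start + length - 1) // cluster_size
        touched.update(range(first, last + 1))
    return touched


def allocated_bytes(file_size, written_ranges, cluster_size=4096, sparse=True):
    """Estimate disk space used by a file under the simplified model."""
    if sparse:
        return len(clusters_touched(written_ranges, cluster_size)) * cluster_size
    # Non-sparse: every cluster up to the file size is allocated up front.
    total_clusters = -(-file_size // cluster_size)  # ceiling division
    return total_clusters * cluster_size


# Hypothetical segmented download: four threads, 10MB written by each so far.
segments = [(0, 10 * MB), (256 * MB, 10 * MB), (512 * MB, 10 * MB), (768 * MB, 10 * MB)]

print("sparse:    ", allocated_bytes(GB, segments) // MB, "MB")
print("non-sparse:", allocated_bytes(GB, segments, sparse=False) // MB, "MB")
```

Under these assumptions the partly downloaded 1GB file occupies about 40MB when sparse, against the full 1GB otherwise.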
Optimizing NTFS
Although NTFS is way more reliable and performance-oriented than FAT ever was, it still has its shortcomings and must be maintained carefully.
Fragmentation is one of the most troublesome things that can happen on your system. It slows down file access, produces more wear on your drive and is generally irritating when opening a file sends the disk thrashing like mad. Win XP’s built-in defrag is good enough for routine use. But for serious defragging, get a utility like Norton Speed Disk or Executive Software’s Diskeeper (Windows defrag is a stripped-down version of this). Win 2K and above also allow boot-time defragmentation, which lets defragmenters access and move files that they can’t touch from within Windows (say the MFT and logfile mentioned above). So ensure that you defrag your NTFS volumes regularly.
Short file names are the 8.3 version of names that are seen by DOS users. Every file created on an NTFS volume has an associated directory entry in the 8.3 format. If you’ve a large number of files on your system, these extra entries not only take up space in the MFT (and hence fragment it), but can also cause delays in file creation if the system has to autogenerate 8.3 filenames for many files having similar names. Win XP now allows you to turn off 8.3 filename generation. To do so, use the command fsutil behavior set disable8dot3 1.
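To see why many similar names slow things down, consider this deliberately simplified Python model of short-name generation. It is not the actual Windows algorithm: it assumes every name gets a ~N suffix, keeps only letters and digits, and simply tries ~1, ~2, ~3 and so on until it finds a free name. The function name and the file names are made up.

```python
def short_name(long_name, taken):
    """Return (short name, attempts needed) under the simplified model."""
    stem, dot, ext = long_name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = long_name, ""

    def clean(text):
        # Keep only letters and digits, in upper case.
        return "".join(c for c in text.upper() if c.isalnum())

    stem, ext = clean(stem), clean(ext)[:3]
    attempt = 1
    while True:
        suffix = f"~{attempt}"
        candidate = stem[:8 - len(suffix)] + suffix
        if ext:
            candidate += "." + ext
        if candidate not in taken:
            taken.add(candidate)
            return candidate, attempt
        attempt += 1


taken = set()
names = [
    "Quarterly Report 2001.doc",
    "Quarterly Report 2002.doc",
    "Quarterly Report 2003.doc",
]
results = [short_name(name, taken) for name in names]
for name, (short, attempts) in zip(names, results):
    print(f"{name} -> {short} after {attempts} attempt(s)")
```

In this model each new look-alike name needs one more attempt than the last, so a folder full of similarly named files makes every creation a little slower. Windows itself uses a more elaborate scheme, but the collision problem is the same in spirit.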
Run ChkDsk or any other file system scanning utility regularly to check for file system errors and correct them. Errors can occur due to unexpected shutdowns, crashed applications, etc. And although they may not show up in the folder structure, they take up space on the disk and can cause more serious problems.
Vinod Unny is a Technology Consultant at iSquare Technologies