Penguin
Diff: FileAllocationTable
EditPageHistoryDiffInfoLikePages

Differences between version 10 and predecessor to the previous major change of FileAllocationTable.

Other diffs: Previous Revision, Previous Author, or view the Annotated Edit History

Newer page: version 10 Last edited on Wednesday, February 11, 2004 1:28:20 pm by AristotlePagaltzis Revert
Older page: version 6 Last edited on Tuesday, August 19, 2003 1:51:11 pm by AristotlePagaltzis Revert
@@ -1,11 +1,71 @@
-A FileAllocationTable ([FAT]) is a map maintained by the FileSystem on a HardDisk to keep track of which [DiskCluster]s a file has been stored in. If it spans several of them, they need not be sequential, but rather may be widely scattered over the disk. 
+A FileAllocationTable ([FAT]) is a map maintained by the FileSystem on a HardDisk to keep track of which [DiskCluster]s a file has been stored in. If it spans several of them, they need not be sequential, but rather may be widely scattered over the disk as the situation requires
  
-The [FAT] is organized as an array as large as there are clusters on the disk, each entry in the array storing the cluster number of the next piece of a file. This way, the FileSystem driver can "hop" from cluster to cluster, collecting the pieces of the file
+It is also a synonym for the FileSystem family that uses it
  
-While simple in concept, a [FAT ] is unreliable . A single bit error may lead to inconsistencies with several files - they may become crosslinked, with one of the chains pointing to cluster in the other chain. Since allocation of disk space happens in the same step as assigning it to a file , a crash can easily lead to unduly or incompletely allocated space . A crash during rewriting or deleting a file can lead to the a part of the existing chain persisting , without any file pointing to it , thus leaving it inaccessible
+This is the only type of FileSystem supported by [MS-DOS ] and the [MS-DOS] based [Windows] versions . It went through several iterations as typical capacity of storage media increased, each named after the bit size of an array entry in the [FAT]: [FAT12] , [FAT16], [FAT32] . It has several design flaws that have a severe negative impact on performance and , worse , reliability
  
-[FAT] is the only FileSystem family supported by [MS-DOS ] and the [MS-DOS ] based [Windows ] versions . The first variant was [FAT12 ], superseeded by [FAT16 ] when [HardDisk ]s become wide spread , finally superseeded by [FAT32] when [HardDisk]s grew to commonly be larger than 2GB
+[FAT] has been deprecated by MicrosoftCorporation. Since it has no provisions to store per-user permissions, [NTFS ] has always been the default for the [Windows ] NT line of [OperatingSystem ]s . Now that the NT descendant [Windows ] 2000 has obsoleted the [MS-DOS ] based [Windows ] flavours , everyone is recommended to use that
  
-[Microsoft] tried to mitigate these problems by storing a backup version of the table on the partition, but only managed to exacerbate the problem by requiring a fragile procedure to succeed multiple times. In addition, there is no indication of which copy is authoritative, when they differ.  
+!!! Structure of the FileAllocationTable  
  
-Better FileSystems manage disk space using a separate bitmap for allocation of clusters and [Inode]s for assigning them. The latest in advances are attempts replace all of these with [BTree]s. 
+The [FAT] is organized as an array as large as there are clusters on the disk, each entry in the array storing the number of a DiskCluster which contains the next piece of a file. This way, the FileSystem driver can "hop" from cluster to cluster, collecting the pieces of the file. In the original 12-bit version, [FAT12], each entry can have one of the following values:  
+  
+|^__Value__|^__Meaning__  
+|000 | The cluster is unused and available  
+|001 - FF6 | The cluster number of the next cluster in the file.  
+|FF7 | The cluster has a bad sector in it and will not be allocated  
+|FF8 - FFF | The last claster of a file  
+  
+See [FAT12], [FAT16], and [FAT32] for (some) details on the changes during the evolution of [FAT].  
+  
+!!! Filenames under [FAT]  
+  
+Until late in its history, a directory entry in [FAT] consisted of 8 characters for the name of the file and 3 characters for its "extension".  
+  
+In the [MS-DOS] 1.0 design of [FAT12], there was no support for subdirectories. You could only store files in the root directory of a drive, which had space for 256 entries. The common media at that time were 180Kb floppies.  
+  
+In order to better support [HardDisk]s, support for a hierarchical directory structure was hacked into [MS-DOS] 2.0 along with the addition of [FAT16]. For a long time afterwards, nothing changed.  
+  
+At long last, with the introduction of [Windows] 95, MicrosoftCorporation added another ugly hack to [FAT16] that allowed multiple directory entries to be considered as one. In this flavour, directory entries that have a previously illegal combination of flags set are considered to belong to the immediately previous valid entry, and contain its LFN, ie "long filename", piecemeal as a series of [Unicode] UCS-2 characters. Since directory entries are 32 bytes long and long filenames may be up to 256 characters long, several of these LFN entries may exist per file. The directory entry they belong to contains its file's so-called short name.  
+  
+Around that time, [HardDisk]s started to reach the limits of the [FAT16] storage capacity. The final version of this FileSystem, [FAT32] was hacked into [Windows] 95B. The [Windows] NT line did not have support for this new [FAT] derivative until [Windows] 2000 came about.  
+  
+!!! Problems  
+  
+!! Insufficient redundancy  
+  
+A [FAT] is unreliable. A single bit error may lead to inconsistencies with several files - they may become crosslinked, with one of the chains pointing to cluster in the other chain. Since allocation of disk space happens in the same step as assigning it to a file, a crash can easily lead to unduly or incompletely allocated space. A crash during rewriting or deleting a file can lead to the a part of the existing chain persisting, without any file pointing to it, thus leaving it inaccessible. [Microsoft] tried to mitigate these problems by storing a backup version of the table on the partition, but only managed to exacerbate the problem by requiring a fragile procedure to succeed multiple times. In addition, there is no indication of which copy is authoritative, when they differ.  
+  
+ Better FileSystems manage disk space using a separate bitmap for allocation of clusters and [Inode]s for assigning them. The latest in advances are attempts to replace all of these with [BTree]s, as in [ReiserFS] .  
+  
+!! [FAT] location  
+  
+The entire FileAllocationTable is located en bloc at the beginning of the disk. This means that any access to file MetaData requires moving the HardDisk heads a potentially far distance away. It would have been a better choice to at least place the table in the center of the disk.  
+  
+Better FileSystems spread their MetaData around the disk, so it is relatively close to the files it belongs to.  
+  
+!! Storage Allocation  
+  
+[FAT] uses a FirstFit method to allocate [DiskCluster]s to files. According to the "MS-DOS Operating System Programmer's Reference Manual" (for DOS 2.), "this permits the most efficient utilization[[sic] of disk space because clusters made available by erasing files can be allocated for new files." Unfortunately, this also causes severe file fragmentation. To combat fragmentation, defragmentation programs were written - first by third party vendors, but later also by MicrosoftCorporation themselves. These move [DiskCluster]s around so that all files end up in one piece , one after the other, at the beginning of the disk. Unfortunately, FirstFit strategy means that any grown file's new clusters will now end up behind the mass of allocated ones, causing fragmentation to affect performance even more gravely.  
+  
+Better FileSystems avoid these problems by spreading files around, so there is space left "behind" a file for when it has to be grown. Some also store small files right inside their MetaData structures. The latest in advances are attempts to keep both MetaData and the files themselves in a [BTree], as in [ReiserFS], thus ensuring that related pieces of data end up close to each other.  
+  
+!! Long filename support  
+  
+This feature's late addition had multiple unpleasant effects:  
+  
+# When iterating over the list of files in a directory, both the short and long name of a file are reported, separately, so care must be taken to avoid processing files twice. %%% %%%  
+# [MS-DOS] 7.0 was merely hacked to keep the LFN entries intact, but does not use them. Only the native 32-bit [FAT] driver used by [Windows] contains code to manipulate them, and provides [API]s to do so. Therefor, [DOS] based programs cannot even be aware of the existence of LFNs unless running in a [DOS] box under [Windows]. %%% %%%  
+# A lot of system maintenance utilities from before the release of [Windows] 95 considered the new LFN entries to be invalid crud as per the previous spec and either refused to work, or, worse, removed them. Since a lot of the system configuration of a [Windows] 95 installation refers to files by their long names, it won't even boot further than [MS-DOS] without them. A lot of money was made from selling tool upgrades around that time.  
+  
+!!! Tips  
+  
+Use a disk [Cache]. Even a tiny [Cache] of a few hundred Kb will often have a dramatic effect when using a [FAT] FileSystem, due to a synergy of effects of its flaws:  
+  
+# The MetaData tightly packed in a single block, which makes it an ideal candidate for caching.  
+# It is accessed very frequently, which keeps it in cache.  
+# It is often far away from the data itself, so serving it from the [Cache] saves ''huge'' amounts of HardDisk head movements.  
+  
+Beware though: running the [Cache] in WriteBack mode will further jeopardize the reliability of the already fragile [FAT]. Your best bet is to stick to WriteThrough.  
+  
+Also, __never__ leave the disk [Cache] active while running system maintenance utilities such as a