
Differences between version 7 and predecessor to the previous major change of DiskCluster.


Newer page: version 7 Last edited on Monday, October 10, 2005 5:40:18 pm by GraemePietersz
Older page: version 4 Last edited on Saturday, November 16, 2002 8:10:42 pm by PerryLorier
@@ -1,4 +1,28 @@
-Managing all the bytes on a disk has a lot of overhead. Most disks internally use 512 byte "sectors" to store data, so you can manage data in 512 byte blocks (ie the BlockSize is 512 bytes). Some FileSystems manage data in "DiskCluster''''s", which are some multiple of the BlockSize. For [Ext2]/[Ext3], for instance, this is by default 4k[1] (although this can be changed). Microsoft took this to new heights by having huge block sizes, up to 64k in size.  
-  
-  
- [1]: 8 disk blocks  
+Since  
+disks internally use 512 byte sectors to store data anyway, most  
+FileSystems use some multiple of this size (typically 2,048, 4,096, or  
+8,192 bytes) as the smallest unit to store a file, the cluster. This means that  
+regardless of its size, a file always occupies the smallest  
+multiple of the cluster size that it fits into. Obviously, file sizes  
+aren't often exact multiples of the cluster size, and the larger the  
+cluster size is, the more space is wasted. At worst, an entire  
+cluster may be allocated to store a single byte.  
+It may seem that this would make it desirable to choose the smallest  
+possible cluster size, but that is not the case. The smaller the clusters are, the  
+more of them there are on a disk. This means you need to store much  
+more MetaData to keep track of their use. With modern [HardDisk]s  
+having hundreds of gigabytes of space, it can easily mean having to  
+keep track of hundreds of megabytes of MetaData to organize their use.  
+This can have a dramatic impact on performance, since reading and  
+writing a lot of data also means reading or writing a lot of MetaData,  
+which is typically stored at least some distance away from the data.  
+Over the years, there have been numerous attempts to increase the  
+proximity of data and MetaData to combat the effects of the drastic  
+increase in MetaData. The latest approach in modern FileSystems is to  
+use [BTree]s, which seems to work very well, at the unfortunate cost  
+of much greater fragility of the MetaData structures.  
+The default cluster size for [Ext2]/[Ext3] is 4096 bytes (ie 8  
+sectors), but may be changed at FileSystem creation time. [Microsoft]  
+tried to overcome inadequacies of their [FAT] FileSystems by using huge  
+cluster sizes of up to 64k.  
+The [dumpe2fs(8)] command will display the block/cluster size of an [Ext2]/[Ext3] FileSystem (along with a lot of other information).
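
The trade-off described in the newer version can be made concrete with a little arithmetic. The following is only a rough sketch, not how any particular FileSystem does its accounting: the function names are made up for illustration, the 200 GB disk is an assumed example size, and the model ignores everything except rounding up to whole clusters (no MetaData, no tail packing, no reserved areas).

```python
SECTOR_SIZE = 512  # bytes per disk sector, as stated in the text


def _check_cluster_size(cluster_size: int) -> None:
    # Assumption of this sketch: clusters are whole multiples of a sector.
    if cluster_size <= 0 or cluster_size % SECTOR_SIZE != 0:
        raise ValueError("cluster size must be a positive multiple of 512")


def clusters_needed(file_size: int, cluster_size: int) -> int:
    """Whole clusters needed to hold file_size bytes (rounded up)."""
    _check_cluster_size(cluster_size)
    if file_size < 0:
        raise ValueError("file size cannot be negative")
    return -(-file_size // cluster_size)  # integer ceiling division


def allocated_bytes(file_size: int, cluster_size: int) -> int:
    """Bytes occupied on disk: the smallest cluster multiple that fits."""
    return clusters_needed(file_size, cluster_size) * cluster_size


def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Bytes allocated to the file but not used by its contents."""
    return allocated_bytes(file_size, cluster_size) - file_size


def clusters_on_disk(disk_size: int, cluster_size: int) -> int:
    """How many clusters the FileSystem would have to keep track of."""
    _check_cluster_size(cluster_size)
    return disk_size // cluster_size


if __name__ == "__main__":
    disk = 200 * 10**9  # hypothetical 200 GB disk (decimal gigabytes)
    for cluster_size in (512, 4096, 65536):
        print(
            cluster_size,
            slack_bytes(1, cluster_size),         # waste for a 1 byte file
            clusters_on_disk(disk, cluster_size)  # clusters to track
        )
```

Under these assumptions, a 1 byte file wastes 65,535 bytes with 64k clusters but only 511 bytes with 512 byte clusters, while the same 200 GB disk holds 390,625,000 clusters of 512 bytes against 48,828,125 clusters of 4096 bytes. That eightfold difference in the number of clusters to track is the MetaData cost the text refers to.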