Blame: dictunzip(1) - Waikato Linux Users Group

Annotated edit history of dictunzip(1) version 1, including all changes.

Rev	Author	#	Line
1	perry	1	`DICTZIP`
		2	`!!!DICTZIP`
		3	`NAME`
		4	`SYNOPSIS`
		5	`DESCRIPTION`
		6	`TRADEOFFS`
		7	`OPTIONS`
		8	`CREDITS`
		9	`SEE ALSO`
		10	`----`
		11	`!!NAME`
		12
		13
		14	`dictzip, dictunzip, dictzcat - compress (or expand) files, allowing random access`
		15	`!!SYNOPSIS`
		16
		17
		18	`__dictzip__ [[''options''] ''name''`
		19	`__dictunzip__ [[''options''] ''name''`
		20	`__dictzcat__ ''name''`
		21
		22	`!!DESCRIPTION`
		23
		24
		25	`__dictzip__ compresses files using the gzip(1)`
		26	`algorithm (LZ77) in a manner which is completely compatible`
		27	`with the __gzip__ file format. An extension to the`
		28	`__gzip__ file format (Extra Field, described in 2.3.1.1`
		29	`of RFC 1952) allows extra data to be stored in the header of`
		30	`a compressed file. Programs like __gzip__ and __zcat__`
		31	`will ignore this extra data. However, dictd(8), the`
		32	`DICT protocol dictionary server, will make use of this data`
		33	`to perform pseudo-random access on the file. Files in the`
		34	`__dictzip__ format should end in ".dz" so that they may be`
		35	`distinguished from common __gzip__ files that`
		36	`do not contain the special header information.`
		37
		38
		39	`From RFC 1952, the extra field is specified as`
		40	`follows:`
		41
		42
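		43	`If the FLG.FEXTRA bit is set, an "extra field" is present in the header, with total length XLEN bytes. It consists of a series of subfields, each of the form:`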
		44
		45
		46	`+---+---+---+---+==================================+`
		47	`|SI1|SI2|  LEN  |... LEN bytes of subfield data ...|`
		48	`+---+---+---+---+==================================+`
		49	`SI1 and SI2 provide a subfield ID, typically two ASCII letters with some mnemonic value. Jean-Loup Gailly is maintaining a registry of subfield IDs.`
		50
		51
		52	`LEN gives the length of the subfield data, excluding the 4`
		53	`initial bytes.`
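As a rough illustration of the subfield layout above, the following Python sketch walks an extra field and yields each subfield. The function name `iter_subfields` and the sample ID `XY` are made up for this example; only the byte layout (SI1, SI2, little-endian 16-bit LEN, then LEN bytes) comes from RFC 1952.

```python
import struct


def iter_subfields(extra: bytes):
    """Yield (subfield_id, data) pairs from a gzip extra field."""
    pos = 0
    while pos + 4 <= len(extra):
        # SI1 and SI2 are one byte each; LEN is a 2-byte little-endian integer.
        si1, si2, length = struct.unpack_from("<BBH", extra, pos)
        pos += 4  # LEN excludes these 4 initial bytes
        data = extra[pos:pos + length]
        if len(data) != length:
            raise ValueError("truncated subfield")
        yield bytes([si1, si2]), data
        pos += length


# Hypothetical extra field holding a single subfield "XY" with 3 data bytes.
sample_extra = b"XY" + struct.pack("<H", 3) + b"\x01\x02\x03"
subfields = list(iter_subfields(sample_extra))
print(subfields)
```

A reader that does not recognise a subfield ID can skip it by its LEN, which is why programs such as gzip can ignore the extra data.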
		54
		55
		56	`The __dictzip__ program uses 'R' for SI1, and 'A' for SI2`
		57	`(i.e., "Random Access"). After the LEN field, the data is`
		58	`arranged as follows:`
		59
		60
		61	`+---+---+---+---+---+---+===============================+`
		62	`|  VER  | CHLEN | CHCNT |  ... CHCNT words of data ...  |`
		63	`+---+---+---+---+---+---+===============================+`
		64	`As per RFC 1952, all data is stored least-significant byte first. For VER 1 of the data, all values are 16 bits long (2 bytes), and are unsigned integers.`
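The layout above could be decoded along the following lines. This is only a sketch: the function name `parse_ra_subfield` is invented, and treating CHLEN as the uncompressed chunk length, CHCNT as the number of chunks, and each following word as one chunk's compressed size is an assumption based on the field names rather than something the text states.

```python
import struct


def parse_ra_subfield(data: bytes):
    """Decode the bytes that follow LEN in an 'RA' subfield (VER 1 only)."""
    # Three unsigned 16-bit little-endian values: VER, CHLEN, CHCNT.
    ver, chlen, chcnt = struct.unpack_from("<HHH", data, 0)
    if ver != 1:
        raise ValueError(f"unsupported VER {ver}")
    # CHCNT further 16-bit words follow; assumed to be per-chunk sizes.
    words = struct.unpack_from(f"<{chcnt}H", data, 6)
    return ver, chlen, list(words)


# Hypothetical payload: VER 1, chunk length 58315, 3 chunks.
payload = struct.pack("<HHH", 1, 58315, 3) + struct.pack("<3H", 20000, 21000, 500)
parsed = parse_ra_subfield(payload)
print(parsed)
```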
		65
		66
		67	`XLEN (which is specified earlier in the header) is a two`
		68	`byte integer, so the extra field can be 0xffff bytes long, 2`
		69	`bytes of which are used for the subfield ID (SI1 and SI2),`
		70	`and 2 bytes of which are used for the subfield length (LEN).`
		71	`This leaves 0xfffb bytes (0x7ffd 2-byte entries or 0x3ffe`
		72	`4-byte entries). Given that the zip output buffer must be`
		73	`10% + 12 bytes larger than the input buffer, we can store`
		74	`58969 bytes per entry, or about 1.8GB if the 2-byte entries`
		75	`are used. If this becomes a limiting factor, another format`
		76	`version can be selected and defined for 4-byte`
		77	`entries.`
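The arithmetic in the paragraph above can be checked with a few lines of Python. The variable names are illustrative, and the 58969 bytes-per-entry figure is taken from the text as given rather than derived here.

```python
XLEN_MAX = 0xFFFF        # largest value of the 2-byte XLEN field
SUBFIELD_HEADER = 4      # SI1 + SI2 + 2-byte LEN

available = XLEN_MAX - SUBFIELD_HEADER   # bytes left for subfield data
entries_2byte = available // 2           # number of 2-byte entries that fit
entries_4byte = available // 4           # number of 4-byte entries that fit

BYTES_PER_ENTRY = 58969                  # figure quoted in the text
total_bytes = entries_2byte * BYTES_PER_ENTRY

print(hex(available), hex(entries_2byte), hex(entries_4byte))
print(total_bytes, "bytes, about", round(total_bytes / 2**30, 1), "GB")
```

Note that this follows the text in not subtracting the 6 bytes taken by VER, CHLEN and CHCNT, so the real number of entries would be slightly lower.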
		78
		79
		80	`For compression, the file is divided up into chunks.`
		81
		82
		83	`To perform random access on the data, the offset and length`
		84	`of the data are provided to library routines. These routines`
		85	`determine the chunk in which the desired data begins, and`
		86	`decompress that chunk. Consecutive chunks are decompressed`
		87	`as necessary.`
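The chunk lookup described above comes down to integer division. The sketch below is not the library's actual routine: `chunks_for_range` is a hypothetical helper, and it assumes that every chunk except the last holds exactly `chlen` uncompressed bytes and that the compressed size of each chunk is known from the header.

```python
from itertools import accumulate


def chunks_for_range(offset, length, chlen, compressed_sizes):
    """Work out which chunks cover `length` bytes starting at `offset`.

    Returns (first, last, skip, spans): the first and last chunk index,
    the number of bytes to skip inside the first decompressed chunk, and
    a list of (start, size) spans measured from the start of the
    compressed data.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length > 0")
    first = offset // chlen
    last = (offset + length - 1) // chlen
    if last >= len(compressed_sizes):
        raise ValueError("requested range runs past the last chunk")
    # Running total of compressed sizes gives each chunk's start position.
    starts = [0] + list(accumulate(compressed_sizes))
    spans = [(starts[i], compressed_sizes[i]) for i in range(first, last + 1)]
    skip = offset - first * chlen
    return first, last, skip, spans


# Hypothetical file: 1000-byte chunks with these compressed sizes.
result = chunks_for_range(1500, 800, 1000, [400, 350, 300, 120])
print(result)
```

In this example the request starts halfway through chunk 1 and ends inside chunk 2, so two consecutive chunks have to be decompressed.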
		88	`!!TRADEOFFS`
		89
		90
		91	`__Speed__`
		92
		93
		94	`True random file access is not realized, since any access,`
		95	`even for a single byte, requires that a 64kB chunk be read`
		96	`and decompressed. This is slower than accessing a flat text`
		97	`file, but is much, much faster than performing serial access`
		98	`on a fully compressed file.`
		99
		100
		101	`__Space__`
		102
		103
		104	`For the textual dictionary databases we are working with,`
		105	`the use of 64kB chunks and maximal LZ77 compression realizes`
		106	`a file which is only about 4% larger than the same file`
		107	`compressed all at once.`
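The space tradeoff can be explored with Python's zlib module. The sketch below assumes that each chunk is ended with a full flush (`Z_FULL_FLUSH`) so that it can be decompressed on its own; the text above does not say how dictzip separates chunks, so treat this as one possible technique. The sample data, the chunk length of 60000 and all function names are made up, and the overhead it prints will differ from the 4% quoted above.

```python
import zlib


def compress_whole(data, level=9):
    """Compress everything as one raw deflate stream."""
    c = zlib.compressobj(level, zlib.DEFLATED, -15)
    return c.compress(data) + c.flush()


def compress_chunked(data, chunk_len, level=9):
    """Compress in chunks, resetting the compressor state after each one."""
    c = zlib.compressobj(level, zlib.DEFLATED, -15)
    pieces = []
    for i in range(0, len(data), chunk_len):
        piece = c.compress(data[i:i + chunk_len]) + c.flush(zlib.Z_FULL_FLUSH)
        pieces.append(piece)
    pieces.append(c.flush())  # final piece only terminates the stream
    return pieces


def read_chunk(pieces, index):
    """Decompress a single chunk without touching the others."""
    return zlib.decompressobj(-15).decompress(pieces[index])


# Made-up dictionary-like text.
data = b"".join(b"entry%05d: definition text for headword %d\n" % (i, i)
                for i in range(20000))
chunk_len = 60000
pieces = compress_chunked(data, chunk_len)
whole_size = len(compress_whole(data))
chunked_size = sum(len(p) for p in pieces)
print("whole:", whole_size, "chunked:", chunked_size)
print("overhead: %.1f%%" % (100.0 * (chunked_size - whole_size) / whole_size))
```

Reading chunk 3 with `read_chunk(pieces, 3)` returns only that chunk's bytes, which is the property that makes pseudo-random access possible.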
		108	`!!OPTIONS`
		109
		110
		111	`__-d__ or __--decompress__`
		112
		113
		114	`Decompress. This is the default if the executable is called`
		115	`__dictunzip__.`
		116
		117
		118	`__-c__ or __--stdout__`
		119
		120
		121	`Write output on standard output; keep original files`
		122	`unchanged. This is only available when decompressing`
		123	`(because parts of the header must be updated after a write`
		124	`when compressing).`
		125
		126
		127	`__-f__ or __--force__`
		128
		129
		130	`Force compression or decompression even if the output file`
		131	`already exists.`
		132
		133
		134	`__-h__ or __--help__`
		135
		136
		137	`Display help.`
		138
		139
		140	`__-k__ or __--keep__`
		141
		142
		143	`Do not delete the original file.`
		144
		145
		146	`__-l__ or __--list__`
		147
		148
		149	`For each compressed file, list the following`
		150	`fields:`
		151
		152
		153	`type: dzip, gzip, or text (includes files in unknown`
		154	`formats); crc: CRC checksum; date and time: from header;`
		155	`chunks: number of chunks in file; size: size of each`
		156	`uncompressed chunk; compr.: compressed size; uncompr.:`
		157	`uncompressed size; ratio: compression ratio (0.0% if unknown);`
		158	`name: name of uncompressed file`
		159
		160
		161	`Unlike __gzip__, the compression method is not`
		162	`detected.`
		163
		164
		165	`__-L__ or __--license__`
		166
		167
		168	`Display the __dictzip__ license and quit.`
		169
		170
		171	`__-t__ or __--test__`
		172
		173
		174	`Check the compressed file integrity. This option is not`
		175	`implemented. Instead, it will list the header`
		176	`information.`
		177
		178
		179	`__-v__ or __--verbose__`
		180
		181
		182	`Verbose. Display extra information during`
		183	`compression.`
		184
		185
		186	`__-V__ or __--version__`
		187
		188
		189	`Version. Display the version number and compilation options`
		190	`then quit.`
		191
		192
		193	`__-s__ ''start'' or __--start__`
		194	`''start''`
		195
		196
		197	`Specify the offset to start decompression, using decimal`
		198	`numbers. The default is at the beginning of the`
		199	`file.`
		200
		201
		202	`__-e__ ''size'' or __--size__`
		203	`''size''`
		204
		205
		206	`Specify the size of the portion of the file to decompress,`
		207	`using decimal numbers. The default is the whole`
		208	`file.`
		209
		210
		211	`__-S__ ''start'' or __--Start__`
		212	`''start''`
		213
		214
		215	`Specify the offset to start decompression, using base64`
		216	`numbers. The default is at the beginning of the`
		217	`file.`
		218
		219
		220	`__-E__ ''size'' or __--Size__`
		221	`''size''`
		222
		223
		224	`Specify the size of the portion of the file to decompress,`
		225	`using base64 numbers. The default is the whole`
		226	`file.`
		227
		228
		229	`__-p__ ''prefilter'' or __--pre__`
		230	`''prefilter''`
		231
		232
		233	`Specify a shell command to execute as a filter before`
		234	`compression or decompression of a chunk. The pre- and`
		235	`post-compression filters can be used to provide additional`
		236	`compression or output formatting. The filters may not`
		237	`increase the buffer size significantly. The pre- and`
		238	`post-compression filters were designed to provide the most`
		239	`general interface possible.`
		240
		241
		242	`__-P__ ''postfilter'' or __--post__`
		243	`''postfilter''`
		244
		245
		246	`Specify a shell command to execute as a filter after`
		247	`compression or decompression.`
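The __-S__ and __-E__ options above take "base64 numbers", which the text does not define. The sketch below assumes they are non-negative integers written in radix 64 using the standard base64 alphabet, most significant digit first. That reading is an assumption, not something this page states, and the function names are hypothetical.

```python
# Standard base64 alphabet, used here as radix-64 digits (assumption).
B64_DIGITS = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
              "abcdefghijklmnopqrstuvwxyz"
              "0123456789+/")


def b64_to_int(text):
    """Convert a radix-64 digit string to an integer."""
    value = 0
    for ch in text:
        # str.index raises ValueError for characters outside the alphabet.
        value = value * 64 + B64_DIGITS.index(ch)
    return value


def int_to_b64(value):
    """Convert a non-negative integer to a radix-64 digit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    digits = ""
    while True:
        digits = B64_DIGITS[value % 64] + digits
        value //= 64
        if value == 0:
            return digits


print(b64_to_int("BA"), int_to_b64(64))
```

Under this assumption, an offset given to __-s__ in decimal and the same offset given to __-S__ in base64 would select the same position in the file.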
		248	`!!CREDITS`
		249
		250
		251	`__dictzip__ was written by Rik Faith (faith@cs.unc.edu)`
		252	`and is distributed under the terms of the GNU General Public`
		253	`License. If you need to distribute under other terms, write`
		254	`to the author.`
		255
		256
		257	`The main libraries used by this program (zlib, regex,`
		258	`libmaa) are distributed under different terms, so you may be`
		259	`able to use the libraries for applications which are`
		260	`incompatible with the GPL -- please see the copyright`
		261	`notices and license information that come with the libraries`
		262	`for more information, and consult with your attorney to`
		263	`resolve these issues.`
		264	`!!SEE ALSO`
		265
		266
		267	`dict(1), dictd(8), gzip(1),`
		268	`gunzip(1), zcat(1)`
		269	`----`

This page is a man page (or other imported legacy content). We are unable to automatically determine the license status of this page.

Last edited on Tuesday, June 4, 2002 12:21:55 am by "perry"
