Diff: LZW - Waikato Linux Users Group

Differences between version 2 and previous revision of LZW.

Other diffs: Previous Major Revision, Previous Author, or view the Annotated Edit History

Newer page:	version 2	Last edited on Tuesday, September 10, 2002 8:22:53 pm	by CraigBox	Revert
Older page:	version 1	Last edited on Tuesday, September 10, 2002 8:17:30 pm	by CraigBox	Revert

@@ -7,15 +7,36 @@

The decompressor needs nothing more than this sequence of output and a knowledge of how the dictionary is built to be able to built the dictionary itself, and therefore decompress the file.

For each string that LZ77 compresses, it outputs an index back into to the input (go back N character), a window size (copy these characters), and the first mismatching character.

-Then, in 1978, they came up with a better algorithm, that only required the compressor to output a phrase number and a mismatching character. The phrase extended with the mismatching character would then be added to the dictionary.

+Then, in 1978, they came up with a better algorithm (dubbed [LZ78|http://www.rasip.fer.hr/research/compress/algorithms/fund/lz/lz78.html] , that only required the compressor to output a phrase number and a mismatching character. The phrase extended with the mismatching character would then be added to the dictionary.

LZ77/78 encoding remained the de-facto standard for compression of data (along with some cool stuff like HuffmanCoding) until 1984, where Terry __W__elch published "A Technique for High-Performance Data Compression" in Computer magazine. He realised that with a dictionary that had all the possible characters in it already, you wouldn't have to output the mismatching character like in LZ78; you could output a phrase number that matched it. This algorithm is called __LZW encoding__ (Lempel, Ziv, Welch).

!How it works

The algorithm is really quite simple, and will be described here.

+

+* Start with a dictionary that contains all possible characters.

+* P refers to the previous phrase (initially empty).

+* Read a character, C.

+

+* Is the string P+C in the dictionary? (It is - P is empty so P+C = C)

+* Output the code number for string C to the codestream

+* Set the previous string P = C.

+

+* While there are still character remaining

+* Read a character

+* Is the string P+C present in the dictionary?

+

+* * If yes, set P to P+C

+* * If no,

+# output the code word for P

+# add the string P+C to the dictionary

+# Set P to C

+

+* When we run out of characters, output the code word for the string P and end.

+

A really good link to LZW is http://www.rasip.fer.hr/research/compress/algorithms/fund/lz/lzw.html.