Blame: ispell(5) - Waikato Linux Users Group

Annotated edit history of ispell(5) version 2, including all changes.

Rev	Author	#	Line
1	perry	1	`ISPELL`
		2	`!!!ISPELL`
		3	`NAME`
		4	`DESCRIPTION`
		5	`EXAMPLES`
		6	`SEE ALSO`
		7	`----`
		8	`!!NAME`
		9
		10
		11	`ispell - format of ispell dictionaries and affix files`
		12	`!!DESCRIPTION`
		13
		14
		15	`''Ispell''(1) requires two files to define the language`
		16	`that it is spell-checking. The first file is a dictionary`
		17	`containing words for the language, and the second is an`
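		18	`affix file that defines the meaning of special flags in the dictionary. The two files are combined by ''buildhash'' (see ispell(1)) and written to a`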
		19	`hash file which is not described here.`
		20
		21
		22	`A raw ''ispell'' dictionary (either the main dictionary`
		23	`or your own personal dictionary) contains a list of words,`
		24	`one per line. Each word may optionally be followed by a`
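		25	`slash ("/") and one or more flags, which modify the root word as explained below. Depending on the options with which`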
		26	`''ispell'' was built, case may or may not be`
		27	`significant in either the root word or the flags,`
		28	`independently. Specifically, if the compile-time option`
		29	`CAPITALIZATION is defined, case is significant in the root`
		30	`word; if not, case is ignored in the root word. If the`
		31	`compile-time option MASKBITS is set to a value of 32, case`
		32	`is ignored in the flags; otherwise case is significant in`
		33	`the flags. Contact your system administrator or`
		34	`''ispell'' maintainer for more information (or use the`
		35	`__-vv__ flag to find out). The dictionary should be`
		36	`sorted with the __-f__ flag of sort(1) before the`
		37	`hash file is built; this is done automatically by`
		38	`munchlist(1), which is the normal way of building`
		39	`dictionaries.`
		40
		41
		42	`If the dictionary contains words that have string characters`
		43	`(see the affix-file documentation below), they must be`
		44	`written in the format given by the __defstringtype__`
		45	`statement in the affix file. This will be the case for most`
		46	`non-English languages. Be careful to use this format, rather`
		47	`than that of your favorite formatter, when adding words to a`
		48	`dictionary. (If you add words to your personal dictionary`
		49	`during an ''ispell'' session, they will automatically be`
		50	`converted to the correct format. This feature can be used to`
		51	`convert an entire dictionary if necessary:)`
		52
		53
		54	`echo qqqqq > dummy.dict`
		55	`buildhash dummy.dict ''affix-file'' dummy.hash`
		56	`awk '{print "*" $0} END {print "#"}' ''old-dict-file'' \`
		57	`\| ispell -a -T ''old-dict-string-type'' \`
		58	`-d ./dummy.hash -p ./''new-dict-file'' \`
		59	`> /dev/null`
		60
		61
		62	`The case of the root word controls the case of words`
		63	`accepted by ''ispell'', as follows:`
		64
		65
		66	`(1)`
		67
		68
		69	`If the root word appears only in lower case (e.g.,`
		70	`''bob''), it will be accepted in lower case, capitalized,`
		71	`or all capitals.`
		72
		73
		74	`(2)`
		75
		76
		77	`If the root word appears capitalized (e.g., ''Robert''),`
		78	`it will not be accepted in all-lower case, but will be`
		79	`accepted capitalized or all in capitals.`
		80
		81
		82	`(3)`
		83
		84
		85	`If the root word appears all in capitals (e.g.,`
		86	`''UNIX''), it will only be accepted all in`
		87	`capitals.`
		88
		89
		90	`(4)`
		91
		92
		93	`If the root word appears with a "funny" capitalization (e.g.,`
		94	`''ITCorp''), a word will be`
		95	`accepted only if it follows that capitalization, or if it`
		96	`appears all in capitals.`
		97
		98
		99	`(5)`
		100
		101
		102	`More than one capitalization of a root word may appear in`
		103	`the dictionary. Flags from different capitalizations are`
		104	`combined by OR-ing them together.`
		105
		106
		107	`Redundant capitalizations (e.g., ''bob'' and ''Bob'')`
		108	`will be combined by ''buildhash'' and by ''ispell''`
		109	`(for personal dictionaries), and can be removed from a raw`
		110	`dictionary by ''munchlist''.`
		111
		112
		113	`For example, the dictionary:`
		114
		115
		116	`bob`
		117	`Robert`
		118	`UNIX`
		119	`ITcorp`
		120	`ITCorp`
		121
		122
		123	`will accept ''bob'', ''Bob'', ''BOB'',`
		124	`''Robert'', ''ROBERT'', ''UNIX'', ''ITcorp'',`
		125	`''ITCorp'', and ''ITCORP'', and will reject all`
		126	`others. Some of the unacceptable forms are ''bOb'',`
2	perry	127	`''robert'', ''Unix'', and ''!ItCorp''.`
1	perry	128
		129
		130	`As mentioned above, root words in any dictionary may be`
		131	`extended by flags. Each flag is a single alphabetic`
		132	`character, which represents a prefix or suffix that may be`
		133	`added to the root to form a new word. For example, in an`
		134	`English dictionary the __D__ flag can be added to`
		135	`''bathe'' to make ''bathed''. Since flags are`
		136	`represented as a single bit in the hashed dictionary, this`
		137	`results in significant space savings. The ''munchlist''`
		138	`script will reduce an existing raw dictionary by adding`
		139	`flags when possible.`
		140
		141
		142	`When a word is extended with an affix, the affix will be`
		143	`accepted only if it appears in the same case as the initial`
		144	`(prefix) or final (suffix) letter of the word. Thus, for`
		145	`example, the entry ''UNIX/M'' in the main dictionary`
		146	`(__M__ means add an apostrophe and an "s" to make a possessive)`
		147	`would accept ''UNIX'S'' but would`
		148	`reject ''UNIX's''. If ''UNIX's'' is legal, it must`
		149	`appear as a separate dictionary entry, and it will not be`
		150	`combined by ''munchlist''. (In general, you don't need to`
		151	`worry about these things; ''munchlist'' guarantees that`
		152	`its output dictionary will accept the same set of words as`
		153	`its input, so all you have to do is add words to the`
		154	`dictionary and occasionally run munchlist to reduce its`
		155	`size).`
		156
		157
		158	`As mentioned, the affix definition file describes the`
		159	`affixes associated with particular flags. It also describes`
		160	`the character set used by the language.`
		161
		162
		163	`Although the affix-definition grammar is designed for a`
		164	`line-oriented layout, it is actually a free-format yacc`
		165	`grammar and can be laid out weirdly if you want. Comments`
		166	`are started by a pound (sharp) sign (#), and continue to the`
		167	`end of the line. Backslashes are supported in the usual`
		168	`fashion (__\__''nnn'', plus specials __\n__,`
		169	`__\r__, __\t__, __\v__, __\f__, __\b__, and the`
		170	`new hex format __\x__''nn''). Any character with`
		171	`special meaning to the parser can be changed to an`
		172	`uninterpreted token by backslashing it; for example, you can`
		173	`declare a flag named 'asterisk' or 'colon' with ''flag`
		174	`\*:'' or ''flag \::''.`
		175
		176
		177	`The grammar will be presented in a top-down fashion, with`
		178	`discussion of each element. An affix-definition file must`
		179	`contain exactly one table:`
		180
		181
		182	`''table'' : [[''headers''] [[''prefixes''] [[''suffixes'']`
		183
		184
		185	`At least one of ''prefixes'' and ''suffixes'' is`
		186	`required. They can appear in either order.`
		187
		188
		189	`''headers'' : [[ ''options'' ] ''char-sets`
		190	`''`
		191
		192
		193	`The headers describe options global to this dictionary and`
		194	`language. These include the character sets to be used and`
		195	`the formatter, and the defaults for certain ''ispell''`
		196	`flags.`
		197
		198
		199	`''options'' : { ''fmtr-stmt'' \| ''opt-stmt'' \| ''flag-stmt'' \| ''num-stmt'' }`
		200
		201
		202	`The options statements define the defaults for certain`
		203	`ispell flags and for the character sets used by the`
		204	`formatters.`
		205
		206
		207	`''fmtr-stmt'' : { ''nroff-stmt'' \| ''tex-stmt'' }`
		208
		209
		210	`A ''fmtr-stmt'' describes characters that have special`
		211	`meaning to a formatter. Normally, this statement is not`
		212	`necessary, but some languages may have preempted the usual`
		213	`defaults for use as language-specific characters. In this`
		214	`case, these statements may be used to redefine the special`
		215	`characters expected by the formatter.`
		216
		217
		218	`''nroff-stmt'' : { __nroffchars__ \| __troffchars__ } ''string`
		219	`''`
		220
		221
		222	`The __nroffchars__ statement allows redefinition of`
		223	`certain ''nroff'' control characters. The string given`
		224	`must be exactly five characters long, and must list`
		225	`substitutions for the left and right parentheses`
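		226	`("(" and ")"), the period ("."), the backslash ("\"), and the`
		227	`asterisk ("*"). For example, the statement`
		228
		229
		230	`__nroffchars__ {}.\\*`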
		231
		232
		233	`would replace the left and right parentheses with left and`
		234	`right curly braces for purposes of parsing`
		235	`''nroff''/''troff'' strings, with no effect on the`
		236	`others (admittedly a contrived example). Note that the`
		237	`backslash is escaped with a backslash.`
		238
		239
2	perry	240	`''tex-stmt'' : { __!TeXchars__ \| __texchars__ } ''string`
1	perry	241	`''`
		242
		243
2	perry	244	`The __!TeXchars__ statement allows redefinition of certain`
1	perry	245	`TeX/LaTeX control characters. The string given must be`
		246	`exactly thirteen characters long, and must list`
		247	`substitutions for the left and right parentheses`
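		248	`("(" and ")"), the left and right square brackets ("[[" and "]"), the left and right curly braces ("{" and "}"), the left and right angle brackets ("<" and ">"), the backslash ("\"), the dollar sign ("$"), the asterisk ("*"), the period or dot ("."), and the percent sign ("%").`
		249	`For example, the statement`
		250
		251
		252	`__texchars__ ()\[[]<\><\>\\$*.%`
		253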
		254
		255
		256	`would replace the functions of the left and right curly`
		257	`braces with the left and right angle brackets for purposes`
		258	`of parsing TeX/LaTeX constructs, while retaining their`
		259	`functions for the ''tib'' bibliographic preprocessor.`
		260	`Note that the backslash, the left square bracket, and the`
		261	`right angle bracket must be escaped with a`
		262	`backslash.`
		263
		264
		265	`''opt-stmt'' : { ''cmpnd-stmt'' \| ''aff-stmt'' }`
		266	`''cmpnd-stmt'' : __compoundwords__ ''compound-opt''`
		267	`''aff-stmt'' : __allaffixes__ ''on-or-off''`
		268	`''on-or-off'' : { __on__ \| __off__ }`
		269	`''compound-opt'' : { ''on-or-off'' \| __controlled__ ''character'' }`
		270
		271
		272	`An ''opt-stmt'' controls certain ispell defaults that are`
		273	`best made language-specific. The __allaffixes__ statement`
		274	`controls the default for the __-P__ and __-m__ options`
		275	`to ''ispell.'' If __allaffixes__ is turned __off__`
		276	`(the default), ''ispell'' will default to the behavior of`
		277	`the ''-P'' flag: root/affix suggestions will only be made`
		278	`if there are no "near misses". If`
		279	`__allaffixes__ is turned __on__, ''ispell'' will`
		280	`default to the behavior of the ''-m'' flag: root/affix`
		281	`suggestions will always be made. The __compoundwords__`
		282	`statement controls the default for the __-B__ and`
		283	`__-C__ options to ''ispell.'' If __compoundwords__`
		284	`is turned __off__ (the default), ''ispell'' will`
		285	`default to the behavior of the ''-B'' flag: run-together`
		286	`words will be reported as errors. If __compoundwords__ is`
		287	`turned __on__, ''ispell'' will default to the behavior`
		288	`of the ''-C'' flag: run-together words will be considered`
		289	`as compounds if both are in the dictionary. This is useful`
		290	`for languages such as German and Norwegian, which form large`
		291	`numbers of compound words. Finally, if __compoundwords__`
		292	`is set to ''controlled'', only words marked with the flag`
		293	`indicated by ''character'' (which should not be otherwise`
		294	`used) will be allowed to participate in compound formation.`
		295	`Because this option requires the flags to be specified in`
		296	`the dictionary, it is not available from the command`
		297	`line.`
		298
		299
		300	`''flag-stmt'' : __ flagmarker__ ''character`
		301	`''`
		302
		303
		304	`The __flagmarker__ statement describes the character`
		305	`which is used to separate affix flags from the root word in`
		306	`a raw dictionary file. This must be a character which is not`
		307	`found in any word (including in string characters; see`
		308	`below). The default is "/", because this character is not normally`
		309	`used to represent special characters in any language.`
		310
		311
		312	`''num-stmt'' : __ compoundmin__ ''digit`
		313	`''`
		314
		315
		316	`The __compoundmin__ statement controls the length of the`
		317	`two components of a compound word. This only has an effect`
		318	`if __compoundwords__ is turned __on__ or if the`
		319	`__-C__ flag is given to ''ispell''. In that case, only`
		320	`words at least as long as the given minimum will be accepted`
		321	`as components of a compound. The default is 3`
		322	`characters.`
		323
		324
		325	`''char-sets'' : '' norm-sets'' [[ ''alt-sets'' ]`
		326
		327
		328	`The character-set section describes the characters that can`
		329	`be part of a word, and defines their collating order. There`
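		330	`must always be a definition of "normal" character sets; in addition, there may be one or more partial definitions of "alternate" sets which are used with various text formatters.`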
		331
		332
		333	`''norm-sets'' : [[ ''deftype'' ] charset-group`
		334
		335
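		336	`A normal character set may optionally begin with a definition of the file suffixes that make use of this set, followed by one or more character-set declarations.`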
		337
		338
		339	`''deftype'' : __defstringtype__ ''name deformatter suffix''*`
		340
		341
		342	`The __defstringtype__ declaration gives a list of file`
		343	`suffixes which should make use of the default string`
		344	`characters defined as part of the base character set; it is`
		345	`only necessary if string characters are being defined. The`
		346	`''name'' parameter is a string giving the unique name`
		347	`associated with these suffixes; often it is a formatter`
		348	`name. If the formatter is a member of the troff family,`
		349	`"nroff" should be used as the name; for the TeX family, use "tex". The name may be given to the __-T__ switch of ''ispell'' to specify a formatter`
		350	`type. The ''deformatter'' parameter specifies the`
		351	`deformatting style to use when processing files with the`
		352	`given suffixes. Currently, this must be either __tex__ or`
		353	`__nroff__. The ''suffix'' parameters are a`
		354	`whitespace-separated list of strings which, if present at`
		355	`the end of a filename, indicate that the associated set of`
		356	`string characters should be used by default for this file.`
		357	`For example, the suffix list for the troff family typically`
		358	`includes suffixes such as ".ms", ".me", and ".mm".`
		359
		360
		361
		362	`''charset-group'' : { ''char-stmt'' \| ''string-stmt'' \| ''dup-stmt''}*`
		363
		364
		365	`A ''char-stmt'' describes single characters; a`
		366	`''string-stmt'' describes characters that must appear`
		367	`together as a string, and which usually represent a single`
		368	`character in the target language. Either may also describe`
		369	`conversion between upper and lower case. A ''dup-stmt''`
		370	`is used to describe alternate forms of string characters, so`
		371	`that a single dictionary may be used with several formatting`
		372	`programs that use different conventions for representing`
		373	`non-ASCII characters.`
		374
		375
		376	`''char-stmt'' : __wordchars__ ''character-range''`
		377	`\| __wordchars__ ''lowercase-range uppercase-range''`
		378	`\| __boundarychars__ ''character-range''`
		379	`\| __boundarychars__ ''lowercase-range uppercase-range''`
		380	`''string-stmt'' : __stringchar__ ''string''`
		381	`\| __stringchar__ ''lowercase-string uppercase-string''`
		382
		383
		384
		385	`Characters described with the __boundarychars__ statement`
		386	`are considered part of a word only if they appear singly,`
		387	`embedded between characters declared with the`
		388	`__wordchars__ or __stringchar__ statements. For`
		389	`example, if the hyphen is a boundary character (useful in`
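		390	`French), the string "foo-bar" would be a single word, but "-foo" would be the same as "foo", and "foo--bar" would be two words separated by non-word characters.`
		391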
		392
		393
		394	`If two ranges or strings are given in a ''char-stmt'' or`
		395	`''string-stmt'', the first describes characters that are`
		396	`interpreted as lowercase and the second describes uppercase.`
		397	`In the case of a __stringchar__ statement, the two`
		398	`strings must be of the same length. Also, in a`
		399	`__stringchar__ statement, the actual strings may contain`
		400	`both uppercase and lowercase characters themselves without difficulty;`
		401	`for instance, the statement`
		402
		403
		404	`stringchar "\\*(sS" "\\*(Ss"`
		405
		406
		407	`is legal and will not interfere with (or be interfered with`
		408	`by) other declarations of "s" and "S" as lower and upper case, respectively.`
		409
		410
		411	`A final note on string characters: some languages collate`
		412	`certain special characters as if they were strings. For`
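		413	`example, the German "a-umlaut" is collated as if it were the string "ae". ''Ispell'' is not capable of this; each character must be treated as an individual entity, so in certain cases ''ispell'' will sort a list of words into a different order than the standard dictionary order of the target language.`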
		414
		415
		416	`''alt-sets'' : '' alttype'' [[ ''alt-stmt''* ]`
		417
		418
		419	`Because different formatters use different notations to`
		420	`represent non-ASCII characters, ''ispell'' must be aware`
		421	`of the representations used by these formatters. These are`
		422	`declared as alternate sets of string`
		423	`characters.`
		424
		425
		426	`''alttype'' : __ altstringtype__ ''name suffix''*`
		427
		428
		429	`The __altstringtype__ statement introduces each set by`
		430	`declaring the associated formatter name and filename suffix`
		431	`list. This name and list are interpreted exactly as in the`
		432	`__defstringtype__ statement above. Following this header`
		433	`are one or more ''alt-stmt''s which declare the alternate`
		434	`string characters used by this formatter.`
		435
		436
		437	`''alt-stmt'' : __ altstringchar__ ''alt-string std-string`
		438	`''`
		439
		440
		441	`The ''altstringchar'' statement describes alternate`
		442	`representations for string characters. For example, the -mm`
		443	`macro package of ''troff'' represents the German`
		444	`"a-umlaut" character by the sequence "a\*:", while ''TeX'' uses`
		445	`the sequence "\"a". If the ''troff'' versions`
		446	`are declared as the standard versions using`
		447	`__stringchar__, the ''TeX'' versions may be declared`
		448	`as alternates by using the statement`
		449
		450
		451	`altstringchar \\\"a a\\*:`
		452
		453
		454	`When the __altstringchar__ statement is used to specify`
		455	`alternate forms, all forms for a particular formatter must`
		456	`be declared together as a group. Also, each formatter or`
		457	`macro package must provide a complete set of characters,`
		458	`both upper- and lower-case, and the character sequences used`
		459	`for each formatter must be completely distinct. Character`
		460	`sequences which describe upper- and lower-case versions of`
		461	`the same printable character must also be the same length.`
		462	`It may be necessary to define some new macros for a given`
		463	`formatter to satisfy these restrictions. (The current`
		464	`version of ''buildhash'' does not enforce these`
		465	`restrictions, but failure to obey them may result in errors`
		466	`being introduced into files that are processed with`
		467	`''ispell''.)`
		468
		469
		470	`An important minor point is that ''ispell'' assumes that`
		471	`all characters declared as __wordchars__ or`
		472	`__boundarychars__ will occupy exactly one position on the`
		473	`terminal screen.`
		474
		475
		476	`A single character-set statement can declare either a single`
		477	`character or a contiguous range of characters. A range is`
		478	`given as in egrep and the shell: [[a-z] means lowercase`
		479	`alphabetics; [[^a-z] means all but lowercase, etc. All`
		480	`character-set statements are combined (unioned) to produce`
		481	`the final list of characters that may be part of a word. The`
		482	`collating order of the characters is defined by the order of`
		483	`their declaration; if a range is used, the characters are`
		484	`considered to have been declared in ASCII order. Characters`
		485	`that have case are collated next to each other, with the`
		486	`uppercase character first.`
		487
		488
		489	`The character-declaration statements have a rather strange`
		490	`behavior caused by their need to match each lowercase`
		491	`character with its uppercase equivalent. In any given`
		492	`__wordchars__ or __boundarychars__ statement, the`
		493	`characters in each range are first sorted into ASCII`
		494	`collating sequence, then matched one-for-one with the other`
		495	`range. (The two ranges must have the same number of`
		496	`characters). Thus, for example, the two`
		497	`statements:`
		498
		499
		500	`__wordchars__ [[aeiou] [[AEIOU]`
		501	`__wordchars__ [[aeiou] [[UOIEA]`
		502
		503
		504	`would produce exactly the same effect. To get the vowels to`
		505	`match up "wrong", you would have to use separate statements:`
		506
		507
		508	`__wordchars__ a U`
		509	`__wordchars__ e O`
		510	`__wordchars__ i I`
		511	`__wordchars__ o E`
		512	`__wordchars__ u A`
		513
		514
		515	`which would cause uppercase 'e' to be 'O', and lowercase 'O'`
		516	`to be 'e'. This should normally be a problem only with`
		517	`languages which have been forced to use a strange ASCII`
		518	`collating sequence. If your uppercase and lowercase letters`
		519	`both collate in the same order, you shouldn't have to worry`
		520	`about this feature.`
		521
		522
		523	`The prefixes and suffixes sections have exactly the same`
		524	`syntax, except for the introductory keyword.`
		525
		526
		527	`''prefixes'' : __ prefixes__ ''flagdef''*`
		528	`''suffixes'' : __ suffixes__ ''flagdef''*`
		529	`''flagdef'' : __flag__ [[__*__\|__~__] ''char'' __:__ ''repl''*`
		530
		531
		532	`A prefix or suffix table consists of an introductory keyword`
		533	`and a list of flag definitions. Flags can be defined more`
		534	`than once, in which case the definitions are combined. Each`
		535	`flag controls one or more ''repl''s (replacements) which`
		536	`are conditionally applied to the beginnings or endings of`
		537	`various words.`
		538
		539
		540	`Flags are named by a single character ''char''. Depending`
		541	`on a configuration option, this character can be either any`
		542	`uppercase letter (the default configuration) or any 7-bit`
		543	`ASCII character. Most languages should be able to get along`
		544	`with just 26 flags.`
		545
		546
		547	`A flag character may be prefixed with one or more option`
		548	`characters. (If you wish to use one of the option characters`
		549	`as a flag character, simply enclose it in double`
		550	`quotes.)`
		551
		552
		553	`The asterisk (__*__) option means that this flag`
		554	`participates in ''cross-product'' formation. This only`
		555	`matters if the file contains both prefix and suffix tables.`
		556	`If so, all prefixes and suffixes marked with an asterisk`
		557	`will be applied in all cross-combinations to the root word.`
		558	`For example, consider the root ''fix'' with prefixes`
		559	`''pre'' and ''in'', and suffixes ''es'' and`
		560	`''ed''. If all flags controlling these prefixes and`
		561	`suffixes are marked with an asterisk, then the single root`
		562	`''fix'' would also generate ''prefix'',`
		563	`''prefixes'', ''prefixed'', ''infix'',`
		564	`''infixes'', ''infixed'', ''fix'', ''fixes'',`
		565	`and ''fixed''. Cross-product formation can produce a`
		566	`large number of words quickly, some of which may be illegal,`
		567	`so watch out. If cross-products produce illegal words,`
		568	`''munchlist'' will not produce those flag combinations,`
		569	`and the flag will not be useful.`
		570
		571
		572	`''repl'' : ''condition''* __>__ [[ __-__ ''strip-string'' __,__ ] ''append-string''`
		573
		574
		575
		576	`The __~__ option specifies that the associated flag is`
		577	`only active when a compound word is being formed. This is`
		578	`useful in a language like German, where the form of a word`
		579	`sometimes changes inside a compound.`
		580
		581
		582	`A ''repl'' is a conditional rule for modifying a root`
		583	`word. Up to 8 ''conditions'' may be specified. If the`
		584	`''conditions'' are satisfied, the rules on the right-hand`
		585	`side of the ''repl'' are applied, as`
		586	`follows:`
		587
		588
		589	`(1)`
		590
		591
		592	`If a strip-string is given, it is first stripped from the`
		593	`beginning or ending (as appropriate) of the root`
		594	`word.`
		595
		596
		597	`(2)`
		598
		599
		600	`Then the append-string is added at that point.`
		601
		602
		603	`For example, the ''condition'' __.__ means "any word", and the`
		604	`''condition'' __Y__ means "any word ending in Y". The following`
		605	`(suffix) replacements:`
		606
		607
		608	`. > MENT`
			`Y > -Y,IES`
		609
		610
		611	`would change ''induce'' to ''inducement'' and`
		612	`''fly'' to ''flies''. (If they were controlled by the`
		613	`same flag, they would also change ''fly'' to`
		614	`''flyment'', which might not be what was wanted.`
		615	`''Munchlist'' can be used to protect against this sort of`
		616	`problem; see the command sequence given below.)`
		617
		618
		619	`No matter how much you might wish it, the strings on the`
		620	`right must be strings of specific characters, not ranges.`
		621	`The reasons are rooted deeply in the way ''ispell''`
		622	`works, and it would be difficult or impossible to provide`
		623	`for more flexibility. For example, you might wish to`
		624	`write:`
		625
		626
		627	`[[EY] > -[[EY],IES`
		628
		629
		630	`This will not work. Instead, you must use two separate`
		631	`rules:`
		632
		633
		634	`E > -E,IES`
			`Y > -Y,IES`
		635
		636
		637	`The application of ''repl''s can be restricted to certain`
		638	`words with ''conditions'':`
		639
		640
		641	`''condition'' : { __.__ \| ''character'' \| ''range'' }`
		642
		643
		644	`A ''condition'' is a restriction on the characters that`
		645	`adjoin, and/or are replaced by, the right-hand side of the`
		646	`''repl''. Up to 8 ''conditions'' may be given, which`
		647	`should be enough context for anyone. The right-hand side`
		648	`will be applied only if the ''conditions'' in the`
		649	`''repl'' are satisfied. The ''conditions'' also`
		650	`implicitly define a length; roots shorter than the number of`
		651	`''conditions'' will not pass the test. (As a special`
		652	`case, a ''condition'' of a single dot "." imposes no length requirement, so the rule applies to any root.)`
		653
		654
		655
		656	`''Conditions'' that are single characters should be`
		657	`separated by white space. For example, to specify words`
		658	`ending in "ED", write:`
		659
		660
		661	`E D`
		662
		663
		664	`If you write:`
		665
		666
		667	`ED`
		668
		669
		670	`the effect will be the same as:`
		671
		672
		673	`[[ED]`
		674
		675
		676	`As a final minor, but important point, it is sometimes`
		677	`useful to rebuild a dictionary file using an incompatible`
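		678	`suffix file. For example, suppose you expanded the "R" flag to generate "er" and "ers" (thus making the __Z__ flag somewhat obsolete). To build a new dictionary`
		679	`''newdict'' that, using`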
		680	`''newaffixes'', will accept exactly the same list of`
		681	`words as the old list ''olddict'' did using`
		682	`''oldaffixes'', the __-c__ switch of ''munchlist''`
		683	`is useful, as in the following example:`
		684
		685
		686	`$ munchlist -c oldaffixes -l newaffixes olddict > newdict`
		687
		688
		689	`If you use this procedure, your new dictionary will always`
		690	`accept the same list the original did, even if you badly`
		691	`screwed up the affix file. This is because ''munchlist''`
		692	`compares the words generated by a flag with the original`
		693	`word list, and refuses to use any flags that generate`
		694	`illegal words. (But don't forget that the ''munchlist''`
		695	`step takes a long time and eats up temporary file`
		696	`space).`
		697	`!!EXAMPLES`
		698
		699
		700	`As an example of conditional suffixes, here is the`
		701	`specification of the __S__ flag from the English affix`
		702	`file:`
		703
		704
		705	`flag *S:`
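		706	`[[^AEIOU]Y > -Y,IES # As in imply > implies`
			`[[AEIOU]Y > S # As in convey > conveys`
			`[[SXZH] > ES # As in fix > fixes`
			`[[^SXZHY] > S # As in bolt > bolts`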
		707
		708
		709	`The first line applies to words ending in Y, but not in`
		710	`vowel-Y. The second takes care of the vowel-Y words. The`
		711	`third then handles those words that end in a sibilant or`
		712	`near-sibilant, and the last picks up everything`
		713	`else.`
		714
		715
		716	`Note that the ''conditions'' are written very carefully`
		717	`so that they apply to disjoint sets of words. In particular,`
		718	`note that the fourth line excludes words ending in Y as well`
		719	`as the obvious SXZH. Otherwise, it would convert "imply" into "implys".`
		720
		721
		722
		723	`Although the English affix file does not do so, you can also`
		724	`have a flag generate more than one variation on a root word.`
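		725	`For example, we could extend the English "R" flag as follows:`
		726
		727
		728	`flag *R:`
		729	`E > R # As in create > creater`
			`E > RS # As in create > creaters`
		730
		731
		732	`This flag would generate both "skater" and "skaters" from "skate".`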
		733	`!!SEE ALSO`
		734
		735
		736	`ispell(1)`
		737	`----`
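
The capitalization rules (1) to (4) in the DESCRIPTION can be checked against the sample dictionary with a short sketch. This is not ispell's own code: the function names `case_kind` and `accepts` are hypothetical, affix flags are ignored, and a one-letter uppercase root is simply treated as "all capitals". It only illustrates how the case of a root word limits the forms that are accepted.

```python
def case_kind(word):
    """Classify capitalization as lower, upper, capitalized, or mixed."""
    if word.islower():
        return "lower"
    if word.isupper():
        return "upper"
    if word[0].isupper() and word[1:].islower():
        return "capitalized"
    return "mixed"  # "funny" capitalization such as ITCorp


def accepts(roots, word):
    """Return True if `word` is accepted by the raw dictionary `roots`.

    Assumption: entries are plain root words with no affix flags.
    """
    for root in roots:
        if root.lower() != word.lower():
            continue  # not a capitalization variant of this root
        if word.isupper():
            return True  # all capitals is accepted under rules (1) to (4)
        kind = case_kind(root)
        if kind == "lower" and case_kind(word) in ("lower", "capitalized"):
            return True  # rule (1)
        if kind == "capitalized" and case_kind(word) == "capitalized":
            return True  # rule (2)
        if kind == "mixed" and word == root:
            return True  # rule (4): must follow the root exactly
        # rule (3): an all-capitals root accepts only all capitals
    return False


# The sample dictionary and word lists from the text above.
ROOTS = ["bob", "Robert", "UNIX", "ITcorp", "ITCorp"]
ACCEPTED = ["bob", "Bob", "BOB", "Robert", "ROBERT",
            "UNIX", "ITcorp", "ITCorp", "ITCORP"]
REJECTED = ["bOb", "robert", "Unix", "ItCorp"]
```

With these definitions, `accepts(ROOTS, w)` is true for every word in `ACCEPTED` and false for every word in `REJECTED`, matching the lists given in the example.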

This page is a man page (or other imported legacy content). We are unable to automatically determine the license status of this page.

Last edited on Tuesday, June 4, 2002 12:30:37 am by "perry"
