Annotated edit history of xmlwf(1) version 1, including all changes.
Rev Author # Line
1 perry 1 XMLWF
2 !!!XMLWF
3 NAME
4 SYNOPSIS
5 DESCRIPTION
6 WELL-FORMED DOCUMENTS
7 OPTIONS
8 OUTPUT
9 BUGS
10 ALTERNATIVES
11 SEE ALSO
12 AUTHOR
13 ----
14 !!NAME
15
16
17 xmlwf -- Determines if an XML document is well-formed
18 !!SYNOPSIS
19
20
21 __xmlwf__ [[__-s__] [[__-n__] [[__-p__] [[__-x__]
22 [[__-e__ ''encoding''] [[__-w__] [[__-d__
23 ''output-dir''] [[__-c__] [[__-m__]
24 [[__-r__] [[__-t__] [[''file''
25 ...]
26 !!DESCRIPTION
27
28
29 __xmlwf__ uses the Expat library to determine if an XML
30 document is well-formed. It is non-validating.
31
32
33 If you do not specify any files on the command line, and you
34 have a recent version of xmlwf, the input file will be read
35 from STDIN.
36 !!WELL-FORMED DOCUMENTS
37
38
39 A well-formed document must adhere to the following
40 rules:
41
42
43 The file begins with an XML declaration. For instance,
44 __<?xml version="1.0" standalone="yes"?>__.
45 ''NOTE:'' xmlwf does
46 not currently check for a valid XML
47 declaration.
48
49
50 Every start tag is either empty (<tag/>) or has a corresponding end tag.
51
52
53 There is exactly one root element. This element must contain
54 all other elements in the document. Only comments, white
55 space, and processing instructions may come after the close
56 of the root element.
57
58
59 All elements nest properly.
60
61
62 All attribute values are enclosed in quotes (either single
63 or double).
64
65
66 If the document has a DTD, and it strictly complies with
67 that DTD, then the document is also considered ''valid''.
68 xmlwf is a non-validating parser -- it does not check the
69 DTD. However, it does support external entities (see the -x
70 option).
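
The rules above can be tried out without xmlwf itself. The following is a minimal sketch, assuming Python's standard xml.parsers.expat module (a binding to the same Expat library); the function name check_well_formed and the sample documents are illustrative and not part of xmlwf.

```python
import xml.parsers.expat


def check_well_formed(xml_text):
    """Return None if xml_text is well-formed, else a one-line description."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(xml_text, True)
    except xml.parsers.expat.ExpatError as err:
        reason = xml.parsers.expat.ErrorString(err.code)
        return f"{reason} (line {err.lineno}, column {err.offset})"
    return None


# Hypothetical sample documents, one per rule.
samples = {
    "good": "<root><item id='1'/></root>",
    "bad nesting": "<root><a><b></a></b></root>",
    "unquoted attribute": "<root id=1/>",
    "two root elements": "<a/><b/>",
}

for label, text in samples.items():
    print(label, "->", check_well_formed(text) or "well-formed")
```

Like xmlwf, this check accepts a document that has no XML declaration, and it does not validate against a DTD.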
71 !!OPTIONS
72
73
74 When an option includes an argument, you may specify the
75 argument either separately ("-d output") or concatenated with the option ("-doutput").
76
77
78 __-c__ If the input file is well-formed and xmlwf doesn't
79 encounter any errors, the input file is simply copied to the
80 output directory unchanged. This implies no namespaces
81 (turns off -n) and requires -d to specify an output
82 directory.
83
84
85 __-d output-dir__
86
87
88 Specifies a directory to contain transformed representations
89 of the input files. By default, -d outputs a canonical
90 representation (described below). You can select different
91 output formats using -c and -m.
92
93
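94 The output filenames will be exactly the same as the input
95 filenames, or "STDIN" if the input is coming from STDIN. Therefore, you must be careful that the output file does not go into the same directory as the input file. Otherwise, xmlwf will delete the input file before it generates the output file (just like running
96 cat < file > file
97 in most shells).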
98
99
100 Two structurally equivalent XML documents have a
101 byte-for-byte identical canonical XML representation. Note
102 that ignorable white space is considered significant and is
103 treated equivalently to data. More on canonical XML can be
104 found at http://www.jclark.com/xml/canonxml.html
105 .
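
The idea that structurally equivalent documents share one canonical form can be illustrated as follows. This is only a sketch: it uses xml.etree.ElementTree.canonicalize from Python 3.8 or later, which implements the W3C C14N 2.0 specification and not the James Clark canonical XML format that xmlwf writes, so the bytes it produces are not expected to match xmlwf's output. The two sample documents are made up for illustration.

```python
from xml.etree.ElementTree import canonicalize

# Two spellings of the same structure: different attribute order,
# different quote characters, and an empty-element tag versus a
# start/end tag pair.
doc_a = "<doc b='2' a=\"1\"><empty/></doc>"
doc_b = '<doc a="1" b="2"><empty></empty></doc>'

canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

print(canon_a)
print("identical canonical form:", canon_a == canon_b)
```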
106
107
108 __-e encoding__
109
110
111 Specifies the character encoding for the document,
112 overriding any document encoding declaration. xmlwf has four
113 built-in encodings: __US-ASCII__, __UTF-8__,
114 __UTF-16__, and __ISO-8859-1__. Also see the -w
115 option.
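
The effect of overriding the document encoding can be sketched with Python's xml.parsers.expat binding, whose ParserCreate function accepts an encoding name that takes precedence over the document's own declaration. The helper parse_text and the sample bytes are assumptions made for illustration; this is not how xmlwf is implemented.

```python
import xml.parsers.expat


def parse_text(data, encoding=None):
    """Parse data and return the character data found inside elements."""
    # An encoding passed here overrides any encoding the document declares.
    parser = xml.parsers.expat.ParserCreate(encoding)
    chunks = []
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)


# ISO-8859-1 bytes with no encoding declaration: 0xE9 is not valid UTF-8 here.
latin1_bytes = "<word>caf\u00e9</word>".encode("iso-8859-1")

print(parse_text(latin1_bytes, "ISO-8859-1"))

try:
    parse_text(latin1_bytes)  # no override, so Expat assumes UTF-8
except xml.parsers.expat.ExpatError as err:
    print("without override:", err)
```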
116
117
118 __-m__ Outputs a separate XML file that
119 completely describes the input file, including character
120 positions. Requires -d to specify an output
121 directory.
122
123
124 __-n__ Turns on namespace processing, so that prefixed element and
125 attribute names are resolved against their xmlns declarations. -c disables namespaces.
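
What namespace processing changes can be seen in a small sketch using Python's xml.parsers.expat binding. Passing a namespace_separator to ParserCreate turns on Expat's namespace processing, which is presumably what -n does inside xmlwf. The helper element_names and the URI urn:example are illustrative only.

```python
import xml.parsers.expat


def element_names(xml_text, namespaces):
    """Return the element names Expat reports, with or without namespaces."""
    # A separator string enables namespace processing; None leaves it off.
    separator = " " if namespaces else None
    parser = xml.parsers.expat.ParserCreate(namespace_separator=separator)
    names = []
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names


doc = '<x:a xmlns:x="urn:example"><x:b/></x:a>'

print(element_names(doc, namespaces=False))  # raw prefixed names
print(element_names(doc, namespaces=True))   # namespace URI + separator + local name
```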
126
127
128 __-p__ Tells xmlwf to process external DTDs and parameter
129 entities.
130
131
132 Normally xmlwf never parses parameter entities. -p tells it
133 to always parse them. -p implies -x.
134
135
136 __-r__ Normally xmlwf memory-maps the XML file before
137 parsing. -r turns off memory-mapping and uses normal file IO
138 calls instead. Of course, memory-mapping is automatically
139 turned off when reading from STDIN.
140
141
142 __-s__ Prints an error if the document is not standalone.
143 A document is standalone if it has no external subset and no
144 references to parameter entities.
145
146
147 __-t__ Turns on timings. This tells Expat to parse the
148 entire file, but not perform any processing. This gives a
149 fairly accurate idea of the raw speed of Expat itself
150 without client overhead. -t turns off most of the output
151 options (-d, -m, -c, ...).
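
A rough equivalent of this kind of measurement, assuming Python's xml.parsers.expat binding, is to parse with no handlers installed and time the call. The function time_parse and the generated document are hypothetical, and the figure includes a little Python call overhead that xmlwf itself would not have.

```python
import time
import xml.parsers.expat


def time_parse(data):
    """Return the seconds Expat takes to parse data with no handlers set."""
    parser = xml.parsers.expat.ParserCreate()
    start = time.perf_counter()
    parser.Parse(data, True)  # parse everything, process nothing
    return time.perf_counter() - start


# A synthetic document, large enough to give a measurable time.
big = b"<root>" + b"<item n='1'>text</item>" * 50_000 + b"</root>"

print(f"{time_parse(big):.4f} s for {len(big)} bytes")
```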
152
153
154 __-w__ Enables Windows code pages. Normally, xmlwf will
155 throw an error if it runs across an encoding that it is not
156 equipped to handle itself. With -w, xmlwf will try to use a
157 Windows code page. See also -e.
158
159
160 __-x__ Turns on parsing external entities.
161
162
163 Non-validating parsers are not required to resolve external
164 entities, or even expand entities at all. Expat always
165 expands internal entities (?), but external entity parsing
166 must be enabled explicitly.
167
168
169 External entities are simply entities that obtain their data
170 from outside the XML file currently being
171 parsed.
172
173
174 This is an example of an internal entity:
175
176
177 <!ENTITY vers '1.0.2'>
178
179 And here are some examples of external
180 entities:
181
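182 <!ENTITY header SYSTEM "header-&vers;.xml"> (parsed)
183 <!ENTITY logo SYSTEM "logo.png" NDATA PNG> (unparsed)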
184
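
The difference between the two kinds of entity can be observed with Python's xml.parsers.expat binding. In this sketch the internal entity is expanded into character data automatically, while the external entity is only reported to a handler, which here records its system identifier without fetching anything. The document, the entity names and the helper inspect_entities are illustrative assumptions.

```python
import xml.parsers.expat


def inspect_entities(xml_text):
    """Return (character data, system IDs of referenced external entities)."""
    parser = xml.parsers.expat.ParserCreate()
    chunks = []
    external_ids = []

    def on_external(context, base, system_id, public_id):
        # Record the reference only; a real handler would open and parse it.
        external_ids.append(system_id)
        return 1  # non-zero tells Expat the entity was handled

    parser.CharacterDataHandler = chunks.append
    parser.ExternalEntityRefHandler = on_external
    parser.Parse(xml_text, True)
    return "".join(chunks), external_ids


doc = """<?xml version="1.0"?>
<!DOCTYPE release [
  <!ENTITY vers "1.0.2">
  <!ENTITY header SYSTEM "header.xml">
]>
<release>version &vers;&header;</release>"""

text, external_ids = inspect_entities(doc)
print(repr(text))
print(external_ids)
```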
185 __--__ For some reason, xmlwf specifically ignores
186 __--__ anywhere it appears on the command line.
187
188
189 Older versions of xmlwf do not support reading from
190 STDIN.
191 !!OUTPUT
192
193
194 If an input file is not well-formed, xmlwf outputs a single
195 line describing the problem to STDOUT. If a file is well
196 formed, xmlwf outputs nothing. Note that the result code is
197 ''not'' set.
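
Because the result code carries no information in the version described here, a calling program has to look at STDOUT instead. The sketch below shows one way to do that from Python. The helper names are hypothetical, it assumes an xmlwf binary is on the PATH, and it relies only on the behaviour stated above (no output means well-formed).

```python
import subprocess


def is_well_formed(xmlwf_stdout):
    """Interpret xmlwf's STDOUT: empty output means the file was well-formed."""
    return xmlwf_stdout.strip() == ""


def run_xmlwf(path, xmlwf="xmlwf"):
    """Run xmlwf on one file and return (well_formed, message)."""
    # The exit status is deliberately ignored, since this version of
    # xmlwf is documented as not setting it.
    result = subprocess.run([xmlwf, path], capture_output=True, text=True)
    message = result.stdout.strip()
    return is_well_formed(result.stdout), message
```

A caller would use it as ok, message = run_xmlwf("doc.xml"), where doc.xml is a placeholder filename, and report message when ok is false.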
198 !!BUGS
199
200
201 According to the W3C standard, an XML file without a
202 declaration at the beginning is not considered well-formed.
203 However, xmlwf allows this to pass.
204
205
206 xmlwf returns a 0 - noerr result, even if the file is not
207 well-formed. There is no good way for a program to use xmlwf
208 to quickly check a file -- it must parse xmlwf's
209 STDOUT.
210
211
212 The errors should go to STDERR, not STDOUT.
213
214
215 There should be a way to get -d to send its output to STDOUT
216 rather than forcing the user to send it to a
217 file.
218
219
220 I have no idea why anyone would want to use the -d, -c and
221 -m options. If someone could explain it to me, I'd like to
222 add this information to this manpage.
223 !!ALTERNATIVES
224
225
226 Here are some XML validators on the web:
227
228
229 http://www.hcrc.ed.ac.uk/~richard/xml-check.html
230 http://www.stg.brown.edu/service/xmlvalid/
231 http://www.scripting.com/frontier5/xml/code/xmlValidator.html
232 http://www.xml.com/pub/a/tools/ruwf/check.html
233 (on a page with no less than 15 ads! Shame!)
234 !!SEE ALSO
235
236
237 The Expat home page: http://expat.sourceforge.net/
238 The W3 XML specification: http://www.w3.org/TR/REC-xml
239 !!AUTHOR
240
241
242 This manual page was written by Scott Bronson
243 <bronson@rinspin.com> for the __Debian GNU/Linux__ system
244 (but may be used by others). Permission is granted to copy,
245 distribute and/or modify this document under the terms of
246 the GNU Free Documentation License, Version
247 1.1.
248 ----