Differences between current version and predecessor to the previous major change of WellFormed.

Newer page: version 5 Last edited on Monday, October 6, 2003 10:04:31 am by AristotlePagaltzis
Older page: version 2 Last edited on Thursday, August 21, 2003 1:17:37 pm by AristotlePagaltzis
@@ -1,30 +1,34 @@
-An [XML] document is [ WellFormed] if it parses correctly. This means mainly that all tags must be closed, that they must be nested correctly, and that all attribute values must be quoted. This is not valid:  
+An [XML] document is WellFormed if  
+* all tags are closed  
+* they nest correctly  
+* all attribute values are quoted  
+* no invalid characters appear  
+* and a few more minor criteria are met
  
- <p>A paragraph <em><strong>here</em></strong>.  
- <p>And another one there. 
+This is not WellFormed:  
+  
+ __ <p>__ A paragraph __ <em><strong>__ here__ </em></strong>__ .  
+ __ <p>__ And another one there. 
  
 The paragraph tags are not closed, and the nesting for the emphasis and strong tags is wrong. To be WellFormed, this piece of the document has to be written like so: 
  
- <p>A paragraph <strong><em>here</em></strong>.</p>  
- <p>And another one there.</p>  
-  
-In the second fragment, neither the <b> nor the <c> tag are closed. Unlike [SGML], [XML] does not allow tags to be automatically closed when the enclosing tag is closed. This is the reason why the <p> tag in [HTML]/[XHTML] gives people grief---in [HTML] you only need to put in the opened tags while in [XHTML] you need to put in both the opening and the closing tag.  
-  
-If you want non-nesting overlapping ranges you cannot use something like  
-  
- <a> ... <b> ... </a> ... </b>  
+ __ <p>__ A paragraph __ <strong><em>__ here__ </em></strong>__ .__ </p>__  
+ __ <p>__ And another one there.__ </p>__  
  
-but should use something like  
+In the first fragment, neither __<p>__ tag is closed. Unlike [SGML], [XML] does not allow tags to be automatically closed when the enclosing tag is closed. This is the reason why the __<p>__ tag in [HTML]/[XHTML] gives people grief: in [HTML] you only need to write the opening tag, while in [XHTML] you need to write both the opening and the closing tag.  
  
- <a id="1"/> ... <b id="2"/> ... < a id="1"/> .. . <b id="2"/>  
+There is no way to directly express non-nesting overlapping ranges in [XML]. While [SGML]'s [CONCUR] feature is a solution, it appears never to have been implemented, and it has not been included in [XML].
  
-instead, and then you can reconstruct either of the tags as necessary.  
+If you need such ranges, one way to express them might be something like  
  
-;: ''I disagree strongly with that practice . You are undercutting the purpose of [XML] by flattening the markup . Instead, you should use attributes to your advantage .''  
+ __<foo type="a" partof="1">__ ... __</foo>__  
+ __<foo type="a" partof="1 2">__ ... __</foo>__  
+ __<foo type="b" partof="1">__ ... __</foo>__  
  
- <foo type=" a" partof="1"> ... </foo> <foo type=" a" partof="1" partof="2"> ... </foo> <foo type="b" partof="1"> ... </foo>  
+If you are certain that this is too bulky (especially when you know you have a very large number of overlapping structures), a flat alternative might look like  
  
-;: ''Yes, it's bulkier . Try in some way to process markup written with both approaches using [XSLT] and you'll notice the difference . %%% --AristotlePagaltzis''  
+ __<a id="1"/>__ ... __<b id="2"/>__ ... __<a id="1"/>__ ... __<b id="2"/>__  
  
+You can reconstruct the tags as necessary from either form. The flattened form allows some kinds of diff-like reasoning that are much more difficult on trees, but the structured form is generally easier to process by machine, e.g. using [XSLT].  
  
-[ WellFormed] [XML] differs from [Valid] [XML] in that [Valid] [XML] has been (or could be) checked against a [Schema] or [DTD]. 
+WellFormed [XML] differs from [Valid] [XML] in that [Valid] [XML] is not only WellFormed but also conforms (or could be shown to conform) to a [Schema] or [DTD].
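
A quick way to see the well-formedness rules in action is to hand fragments to a non-validating parser. The following is only a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module. The helper name `is_well_formed` and the wrapping `<doc>` root element are illustrative choices, not part of the page above. The parser checks well-formedness only; it does not check validity against a [DTD] or [Schema].

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as one well-formed XML document.

    Hypothetical helper for illustration; it only reports whether the
    parser accepted the input, not why a rejected input failed.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# The two fragments from the page, wrapped in an assumed <doc> root
# because an XML document needs exactly one root element.
BAD = (
    "<doc><p>A paragraph <em><strong>here</em></strong>."
    "<p>And another one there.</doc>"
)
GOOD = (
    "<doc><p>A paragraph <strong><em>here</em></strong>.</p>"
    "<p>And another one there.</p></doc>"
)

print(is_well_formed(BAD))   # False: mis-nested and unclosed tags
print(is_well_formed(GOOD))  # True

# A start tag may not repeat an attribute name.
print(is_well_formed('<foo type="a" partof="1" partof="2"/>'))  # False
```

Empty marker elements of the flat kind nest trivially, so a parser accepts them; any overlap they are meant to encode has to be reconstructed by the application.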