
SGML is an acronym for Standard Generalized Markup Language.

It evolved from earlier generalized markup languages developed at IBM, including General Markup Language (GML) and ISIL. Despite the name, it is not itself a markup language, but a description of how to specify one. Such a specification is called a DTD (Document Type Definition); a DTD is a good example of MetaData. The best-known markup language defined in terms of SGML is HTML. Although HTML disregarded it for a long time, the philosophy of SGML is that documents should be described in terms of their structure rather than their appearance (the "document image"). They can then be displayed or output in whatever form suits a given medium.
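
As a rough illustration of what such a specification looks like, here is a minimal sketch of a DTD embedded in a document. The memo document type and all of its element names are hypothetical, invented for this example; they do not come from any published DTD.

<!DOCTYPE memo [
<!-- A memo consists of a recipient, a sender and a body, in that order -->
<!ELEMENT memo  - - (to, from, body)>
<!ELEMENT to    - - (#PCDATA)>
<!ELEMENT from  - - (#PCDATA)>
<!-- The body holds one or more paragraphs -->
<!ELEMENT body  - - (para+)>
<!ELEMENT para  - - (#PCDATA)>
]>
<memo>
<to>Alice</to>
<from>Bob</from>
<body>
<para>The markup names the parts of the memo, not how they look.</para>
</body>
</memo>

The declarations say only which elements exist and how they may nest. Nothing in them describes fonts or layout, which is left to whatever program renders the document.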

However, the standard is excessively complex, and no SGML parser to date supports all of its features. Many of those features were added to make life easier for humans writing SGML documents by hand, but more and more markup is machine generated. This situation eventually led to the creation of XML.

sgmlnorm(1) can be used to check whether SGML documents validate against their DTD. For example, if an HTML file contains a proper document type declaration such as

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">

then sgmlnorm(1) can validate it against W3C's strict.dtd and report any violations (such as text that appears directly in the body instead of inside a block-level element like <P> or <DIV>). Debian users can install the sp package to get this and other SGML parsing tools.
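
To make that kind of violation concrete, here is a small sketch of an HTML 4.01 file. It is an illustrative example, not one taken from W3C material. It assumes the parser can locate the strict DTD, typically through an SGML catalog, and the exact wording of the error report depends on the tool.

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<HTML>
<HEAD><TITLE>Example</TITLE></HEAD>
<BODY>
<P>This text is inside a paragraph, so the strict DTD allows it.</P>
This text sits directly in BODY, which the strict DTD does not allow.
</BODY>
</HTML>

Wrapping the stray line in <P> or <DIV> should make the file valid again, since the strict DTD only permits block-level elements (and SCRIPT) as direct children of BODY.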