The idea of purely logical markup
and separating content from presentation may
sound simple and promising---until you ask
yourself what to do with the huge pile of
existing HTML material. Can it be painlessly
adapted to XML syntax and, more importantly, to
XML's ideology of generalized markup? Read on
for some practical answers to these questions.

XML is not about syntactic innovations such
as quotes around attribute values and
trailing slashes in empty tags.
XML's goal is to comprehensively mark up all
details of a given unit of information,
without mixing data belonging to different
units or different aspects of one unit.

From this viewpoint, a tag-wise conversion
of "real world" HTML, with its hopeless
medley of logical and visual elements, to
XML makes no sense at all. On most sites,
HTML bears little relation either to the
logical structure of pages or, properly
speaking, to their presentation: it does
not describe the formatting of the pages,
but only emulates it with tables, invisible
spacers, and similar hacks.
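
To make the point concrete, here is a minimal sketch of what a purely tag-wise conversion amounts to. It is only an illustration: the `TagwiseConverter` class, the `VOID` set, and the `legacy` fragment (including `spacer.gif`) are hypothetical names and data invented for this example, and the code relies on nothing beyond Python's standard library. The rewrite quotes attribute values and closes empty tags, so the result parses as XML, yet the layout table, the spacer image, and the font tag all survive untouched.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# Elements that HTML treats as empty (a partial list, enough for this sketch).
VOID = {"br", "hr", "img", "input", "meta", "link"}


class TagwiseConverter(HTMLParser):
    """Naive tag-by-tag rewrite: changes syntax only, never structure."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Quote every attribute value; a valueless attribute repeats its name.
        parts = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )
        closer = " />" if tag in VOID else ">"
        self.out.append(f"<{tag}{parts}{closer}")

    def handle_endtag(self, tag):
        # Empty elements were already closed with a trailing slash.
        if tag not in VOID:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data))


# Hypothetical legacy fragment: a layout table, a spacer image, a font tag.
legacy = (
    "<table width=600 border=0><tr>"
    "<td><img src=spacer.gif width=20 height=1></td>"
    "<td><font size=+2><b>Product news</b></font><br>"
    "Version 2.0 ships in May.</td>"
    "</tr></table>"
)

converter = TagwiseConverter()
converter.feed(legacy)
converter.close()
converted = "".join(converter.out)

# The result is well-formed XML...
root = ET.fromstring(converted)
# ...but its vocabulary is still the same medley of layout hacks.
tags = [element.tag for element in root.iter()]

print(converted)
print(tags)
```

The parse succeeds, but nothing in the output says that "Product news" is a headline or that the second sentence is a release announcement. That information has to be supplied by someone who understands the content; a tag-wise rewrite cannot recover it.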

So we have to forget about the XML promise
for now---until we take the trouble of
reformulating all our data consistently, in
both its presentation and (more importantly)
its content aspects. This is well known to
those who take XML seriously and are aware
of what it can offer, and a growing number
of new document collections and software
tools are being built from the ground up using
XML-inspired approaches. However, the huge
legacy of existing HTML documents needs
special treatment.