HTML, XML and Zinging Your Data Around the Internet

Say ‘HTML’ and there’s a fair chance that people will know you’re talking about the Internet. But say ‘XML’ and you might just get a bunch of blank looks. Yet both HTML (Hypertext Markup Language) and XML (eXtensible Markup Language) play key roles in making the Web work like it does today. If you want to send your data over the Net for display on a visitor’s screen or for use in another application, one of these two standards will be a natural choice. But which one should you use, and why?

A Markup Language to Display Things Your Way

HTML was created in 1990 to standardize the way information is structured and presented on a webpage. A classic example of an HTML formatting command or ‘tag’ is <p>, which tells a web browser that the text that follows starts a new paragraph. HTML defines many different tags that web designers can use and that will be understood and (hopefully) interpreted in the same way by different web browsers. Using HTML therefore makes it very likely that content will be displayed in a consistent way, no matter whose web browser you use – as long as it respects the HTML standard.

Comparing HTML and XML

With HTML, you can see the results directly in a web page displayed onscreen. But HTML, for all its rich command set, is also rigidly defined. It has to be, so that <p>, for example, means exactly the same thing to one browser as to another. By comparison with HTML, XML has been designed more generally to help structure, store and move data from one application to another. XML can be a little harder to understand because it can have very wide-ranging uses, while doing much of its work hidden away from users. In particular (and unlike HTML), XML has no predefined tags. If you want to define a tag like <vitamin> in XML (suggesting that the data following the tag refers to a vitamin like A, B12, or C, for instance), then you can. But then you also need a way of telling other people (or applications) about what you’ve defined and what it means.
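To make the idea of inventing your own tags concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The document content, the <food> and <carbohydrate> tags, the attribute names and the vitamins() helper are all made up for illustration; only <vitamin> comes from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: XML itself does not define any of these tag names.
FOOD_XML = """<food name="orange">
  <vitamin>C</vitamin>
  <vitamin>A</vitamin>
  <carbohydrate unit="g">12</carbohydrate>
</food>"""


def vitamins(xml_text):
    """Return the text of every <vitamin> element directly under the root."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall("vitamin")]


print(vitamins(FOOD_XML))  # ['C', 'A']
```

The parser is happy with <vitamin> because the document is well formed, not because it knows what a vitamin is. The meaning of the tag still has to be agreed between sender and receiver.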

Document Type Definition to Explain What You Mean

The DTD or document type definition is the extra bit you add to XML to explain what you mean by that <vitamin> tag you just invented – and perhaps <protein>, <carbohydrate> and any others, if you happen to be storing and transmitting data about foodstuffs. More precisely, a DTD declares which tags are allowed and how they may be nested; what each tag actually means is something you still agree on with the people who use it. Somebody else could use XML to make tags about automobile parts. In fact, in the automobile industry, XML is used as a standard way to exchange data between enterprises, with an accompanying DTD that is accepted as standard in that sector. That’s what the ‘eXtensible’ in XML refers to: you can extend the use of XML to store and transmit any kind of data you want, as long as you also provide a definition such as a DTD to go with the data, so that the receiving party knows how to understand what you’re sending.
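As a rough illustration of what such a definition can look like, the sketch below embeds a small DTD inside an XML document and reads it with Python's standard xml.dom.minidom module. The element rules and the describe() helper are assumptions invented for this example, not part of any real industry standard.

```python
from xml.dom import minidom

# Hypothetical foodstuff document with its DTD written inline.
# The DTD says: a <food> holds one or more <vitamin> elements,
# optionally followed by a <protein> and a <carbohydrate>.
FOOD_DOC = """<?xml version="1.0"?>
<!DOCTYPE food [
  <!ELEMENT food (vitamin+, protein?, carbohydrate?)>
  <!ELEMENT vitamin (#PCDATA)>
  <!ELEMENT protein (#PCDATA)>
  <!ELEMENT carbohydrate (#PCDATA)>
]>
<food>
  <vitamin>C</vitamin>
  <carbohydrate>12</carbohydrate>
</food>"""


def describe(xml_text):
    """Return (root name declared in the DOCTYPE, actual root element name)."""
    doc = minidom.parseString(xml_text)
    return doc.doctype.name, doc.documentElement.tagName


print(describe(FOOD_DOC))  # ('food', 'food')
```

Note that the standard library parser only reads the DTD; it does not check the document against it. Enforcing the rules takes a validating parser, for which a third-party library such as lxml is one option.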

HTML and XML – Horses for Courses

You can think of HTML as a very specific, predefined counterpart to XML (strictly speaking, classic HTML is based on SGML, the older standard from which XML was also derived, while XHTML is the version of HTML written as XML). After all, HTML also has its own document type definitions – one for each version up to HTML 4.01. When a web site or web application sends a web page, the DOCTYPE declaration at the top of the page tells the browser which version of HTML is being used, and therefore which HTML DTD the page claims to follow, so that the browser can display the information correctly. But once again, for HTML, each DTD is fixed. You cannot change it. So use HTML in particular for indicating how browsers should interpret your webpage content, and XML more generally for communicating data between different applications where you can define for yourself how the data are to be interpreted.
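The version marker at the top of a web page is its DOCTYPE declaration. The sketch below, which uses Python's standard html.parser module, shows one way to pull that declaration out of a page. The sample page and the DoctypeFinder and find_doctype names are hypothetical; the declaration itself is the standard one for HTML 4.01 Strict.

```python
from html.parser import HTMLParser

# Hypothetical page that declares itself as HTML 4.01 Strict.
PAGE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">\n'
    "<html><head><title>Vitamins</title></head>"
    "<body><p>Vitamin C</p></body></html>"
)


class DoctypeFinder(HTMLParser):
    """Remember the first DOCTYPE declaration seen while parsing."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # html.parser passes the declaration text without the <! and >.
        if self.doctype is None and decl.lower().startswith("doctype"):
            self.doctype = decl


def find_doctype(html_text):
    """Return the DOCTYPE declaration text, or None if the page has none."""
    finder = DoctypeFinder()
    finder.feed(html_text)
    finder.close()
    return finder.doctype


print(find_doctype(PAGE))
```

A page written for the current HTML standard would simply start with <!DOCTYPE html>, which names no DTD at all.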