XHTML vs XML

RandomBastard

Can't get enough of FH
Joined
Dec 28, 2003
Messages
1,318
Ok a question for all you web monkeys out there.

What's the difference between XHTML and XML?

I've always thought that XHTML is XML but with all the HTML tags defined. Am I wrong? Does XHTML have all the functionality of XML, just with HTML tags thrown in?

Please help me understand.

Thanks
Mr Bastard
 

fatbusinessman

Fledgling Freddie
Joined
Dec 22, 2003
Messages
810
XML is basically a file format, similar in structure to HTML, but which allows you to define the structure to be anything you want. For instance, you could define:
Code:
<pie>
<meat>Beef</meat>
<pastry taste="yummy"></pastry>
</pie>
as an XML document type.
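
To show that the made-up pie markup above really is usable XML, here is a minimal sketch that feeds it to Python's standard-library parser. The variable names (PIE_XML, root) are just for illustration; nothing here defines a formal document type, it only reads the elements back.

```python
import xml.etree.ElementTree as ET

# The pie document from the post, unchanged.
PIE_XML = """<pie>
<meat>Beef</meat>
<pastry taste="yummy"></pastry>
</pie>"""

# Any XML parser can read it, even though it has never heard of <pie>.
root = ET.fromstring(PIE_XML)

print(root.tag)                          # pie
print(root.find("meat").text)            # Beef
print(root.find("pastry").get("taste"))  # yummy
```

The parser knows nothing about pies; it only cares that the tags nest and close properly.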

XHTML is just the XML document type which contains all the HTML elements we know and love.

Make any sense?
 

Escape

Can't get enough of FH
Joined
Dec 26, 2003
Messages
1,643
So far as I can tell, XHTML is the current version of HTML.
There was the HTML4 standard and now, instead of HTML5 (or 4.x), there is XHTML.

XML gives you some more flexibility and lets you make dynamic pages by pulling in data from other sources.


I could be talking bollocks, but I think it's along those lines. At the moment, I'm looking into PHP... Installed apache+php+mysql this week and having a local webserver running is pretty cool!
 

Doh_boy

Part of the furniture
Joined
Dec 22, 2003
Messages
1,007
XHTML stands for eXtensible HTML and is HTML that, surprise surprise, can be extended with extra bits.

XML is a mark-up language designed to define other languages and such like. As far as I can tell, it's nothing like HTML unless you add a style sheet to it. So an XML/XSL combo would combine to be similar to an HTML page. XML is best, imho, for database-driven sites which update content regularly.

dunno, due to not getting a proper job most of my knowledge of such stuff has totally gone :(

IT sucks atm
 

Jonty

Fledgling Freddie
Joined
Dec 22, 2003
Messages
1,411
Sorry, couldn't resist ...

Book of (Web) Genesis

In the beginning there was HTML, and the powers that be (W3C) and developers said it was good. And they were happy. HTML stood for Hyper Text Markup Language and these text files contained markup tags such as <html> and <body>. This markup helped browsers to structure the display of web pages.

So good was HTML that the powers that be created XML: eXtensible Markup Language. And it was good. Now developers could use any markup code they wished to describe any topic they wished, from cooking recipes to calculus equations. And the powers that be were happy, and said so long as certain rules were followed, XML could operate on any platform (PC, mobile phone etc.) and any operating system (Windows, Linux, Symbian etc.).

So good was XML that the powers that be created XHTML: eXtensible Hyper Text Markup Language. This took HTML, which by now had many loopholes and little consistency, and applied all the rules of XML to it. And the powers that be and developers were happy. And it was good. And so XHTML grew and gradually began to replace HTML.

XHTML Overview

hehe, okay, so basically XHTML 1.x is HTML 4.01 defined as an XML application. Essentially, all the rules of XML are applied to HTML, and these rules help to create consistency and banish all the loopholes and shortcuts which had caused so much trouble for developers. What are the rules? Well, for instance ...

  • All markup tags must be closed (so list items <li> must have a corresponding </li>)
  • Empty elements (<hr> <br> etc.) must be closed too (<br></br>, but as a shorthand <br /> is used; XML itself doesn't need the space before the slash, it's there for backwards compatibility with older browsers)
  • All elements should be lowercase (<html> not <HTML>)
  • All attributes should be quoted and empty attributes should follow the normative convention (so 'style=blah' must be 'style="blah"' and <input disabled /> must be <input disabled="disabled" />)
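
A rough way to see these rules bite is to run the sloppy and the tidy version of each case through an XML parser. This is only a sketch: is_well_formed is a hypothetical helper, and it checks XML well-formedness only, not validity against the XHTML DTDs, so it won't catch the lowercase rule.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a standard XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Each rule from the list above: the HTML-style shortcut, then the XHTML form.
examples = {
    "unclosed <li>": "<ul><li>one<li>two</ul>",
    "closed <li>": "<ul><li>one</li><li>two</li></ul>",
    "bare <br>": "<p>line one<br>line two</p>",
    "self-closed <br />": "<p>line one<br />line two</p>",
    "unquoted attribute": "<p style=blah>text</p>",
    "quoted attribute": '<p style="blah">text</p>',
    "minimised attribute": "<form><input disabled /></form>",
    "expanded attribute": '<form><input disabled="disabled" /></form>',
}

for label, markup in examples.items():
    print(f"{label:22} {is_well_formed(markup)}")
```

Every shortcut form is rejected and every tidied form is accepted, which is the "enforces good code" effect in practice.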

The list goes on, but in all honesty, if you coded HTML properly, you should have no problem with XHTML.

XHTML v XML

Why not use XML for your web pages? Well, using XHTML, you are in effect using XML, it's just the tags and conventions you're using are familiar to you.

You could, of course, create your own XML document for your web pages, but there are several problems with this.

  • Web browsers would struggle to render your page properly, either applying too strict a standard or too loose a standard. With XHTML, browsers are specifically programmed to respond and render in a known and expected way.
  • Older web browsers wouldn't have a clue what to do with your document, and most likely just output the source as text.
  • Search engines look for HTML and XHTML elements (<h1> headings, <title> elements) to list and rank your pages. Using your own XML structure would damage your page's status.
  • Doing things properly, you can't just create an XML document and be done. You need to create what's known as a schema (either a .dtd or .xsd file) which helps programs know which elements go where and what rules you want to apply to your document's structure. This is yet more work.
  • Styling raw XML can be a pain, as no conventions apply (the browser has default ways of rendering <p> and <h1> elements etc. in HTML and XHTML, but not for your document).
  • The browser does not know anything about your document, so even things like forms and hyperlinks need special consideration. With HTML and XHTML, the browser knows how to respond when it comes across certain elements (<a> <input/> etc.)

I could go on, but you get the point.

XHTML v HTML

If you're coding a new site, I'd strongly recommend working with XHTML. If you're new to everything, focus on XHTML over HTML (as they're largely synonymous, knowing XHTML will implicitly mean you'll know HTML). XHTML is the future, and offers far greater scope than HTML now can. Whilst there's nothing wrong with HTML, XHTML is designed to be cross-platform and cross-browser in a way HTML, with all its loopholes, never really achieved.

And because XHTML is extensible, you can also embed other XML formats inside your XHTML document. So you could have an XHTML web page with an SVG animation in the same source code (SVG is like Macromedia Flash except written in XML and styled in CSS) and with a complex MathML equation in the same document too (MathML is an XML way of displaying complex mathematical equations). Feats like this are simply not possible with HTML.
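
As a small illustration of the mixing described above, here is a sketch of an XHTML page with SVG and MathML embedded via XML namespaces, read back with Python's standard-library parser. The page content is invented for the example; the three namespace URIs are the standard W3C ones. Whether a given browser actually renders it is a separate question.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# One source document, three XML vocabularies, kept apart by xmlns.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Mixed document</title></head>
  <body>
    <p>A circle and an equation:</p>
    <svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
      <circle cx="50" cy="50" r="40" />
    </svg>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <mi>x</mi><mo>=</mo><mn>2</mn>
    </math>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# ElementTree reports tags as "{namespace}localname".
namespaces = sorted({el.tag[1:].split("}")[0] for el in root.iter()})
for ns in namespaces:
    print(ns)

# Elements are looked up by namespace plus name, so <circle> is unambiguous.
circle = root.find(f".//{{{SVG_NS}}}circle")
print(circle.get("r"))  # 40
```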

What to do

If I were you, I would check out W3Schools' tutorials. I know I've been mentioning this site for years, but it truly is a fabulous resource. Their tutorials are all free, and offer a gentle introduction to various topics, from XHTML and XML, to HTML and CSS. Their examples will also help to explain things far better than I can.

Kind Regards

Edit ~ Forgot to say that there is a lot more which I didn't cover above (DTDs, Strict/Transitional/Frameset notions, XHTML 2.0 etc.). Also, whilst Ben/Shovel is 100% right about how to serve pages, note that IE refuses to (so I'm told) render such pages as expected unless you send them as text/html. Furthermore, Mozilla goes completely the other way: it renders such pages, but just one mistake in your code which stops it being well-formed will result in the page not being displayed and the user receiving an error.
 

Shovel

Can't get enough of FH
Joined
Dec 22, 2003
Messages
1,350
Just to add to the above:

When implemented, there is a very important difference between HTML and XHTML: the content type (MIME type) the page is served with. For ages and ages I didn't really worry about why this was important and continued to serve my XHTML code as "text/html". However, have a read of "Sending XHTML as text/html Considered Harmful" and you too might agree that if you're going to use XHTML over HTML for future benefits, you should set your content type to "application/xhtml+xml" instead.
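
One common compromise is to look at the browser's Accept header and only send the XML content type to browsers that ask for it. The sketch below assumes that approach; choose_content_type is a hypothetical helper, not part of any framework, and it deliberately ignores q-values to stay short.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick a MIME type for an XHTML page from the request's Accept header.

    Simplification: q-values are ignored, so a browser that lists
    application/xhtml+xml at all is treated as wanting it.
    """
    accepted = [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    ]
    if "application/xhtml+xml" in accepted:
        return "application/xhtml+xml"
    # Fall back for browsers that never mention the XML type.
    return "text/html"


# A browser that advertises XHTML support:
print(choose_content_type(
    "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
))  # application/xhtml+xml

# A browser that only advertises HTML and a wildcard:
print(choose_content_type("text/html, */*"))  # text/html
```

The returned string would go into the response's Content-Type header; how that is set depends on your server or scripting language.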

It's an interesting read.

In addition, the benefits of using XHTML are maybe not as grand as we might think right now, though they are growing. Essentially, for day-to-day use the main benefit is that it enforces "good code". You have to structure your HTML in an elegant manner or else it won't validate. Beyond this though, no one has really made any use of the eXtensible part of this new standard (browser support is a biggie here).

Given time though, it will become more useful so it's well worth working in it now rather than later.
 

RandomBastard

Can't get enough of FH
Joined
Dec 28, 2003
Messages
1,318
Cheers guys, I sort of had an idea about what they both were, but your kind efforts mean I am no longer confused about the differences and in future will code in at least XHTML, even if I don't dare stray into the boundaries of XML :p
Thanks
The Bastard
 

Gef

Fledgling Freddie
Joined
Jan 9, 2004
Messages
570
Jonty said:
• All elements should be lowercase (<html> not <HTML>)

I always thought that so long as the open and close tags are the same case, it's fine? I write a lot in XSL, which applies the same strict standards, and often have to tidy up other people's messy HTML. If the whole page is done in uppercase tags, often I don't bother changing it. Never had any problems and it renders fine ;)

Although for legibility's sake it's best to keep all tags in the same case anyway!
 

Jonty

Fledgling Freddie
Joined
Dec 22, 2003
Messages
1,411
Hi Gef

You're right that XML is case-sensitive and so things don't have to be lowercase, but since the XHTML schemas etc. are written to accommodate lowercase elements and attributes only, it's best to stick with that, if only for XHTML :)

XHTML 1.0 Second Edition said:
4.2. Element and attribute names must be in lower case

XHTML documents must use lower case for all HTML element and attribute names. This difference is necessary because XML is case-sensitive e.g. <li> and <LI> are different tags.
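
The case-sensitivity point is easy to check with an XML parser. A minimal sketch using Python's standard library; it shows only what plain XML does, not XHTML validation, and the variable names are just for illustration.

```python
import xml.etree.ElementTree as ET

# Consistent case is well-formed XML either way...
lower = ET.fromstring("<ul><li>one</li></ul>")
upper = ET.fromstring("<UL><LI>one</LI></UL>")

# ...but the parser treats li and LI as two different element names.
print(lower[0].tag, upper[0].tag)    # li LI
print(lower[0].tag == upper[0].tag)  # False

# Mixing case between an open and close tag is an outright error.
try:
    ET.fromstring("<ul><LI>one</li></ul>")
    mixed_ok = True
except ET.ParseError:
    mixed_ok = False
print(mixed_ok)  # False
```

So all-uppercase markup parses as XML, as Gef found, but its element names are not the lowercase ones XHTML defines.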

Kind Regards

Jonty
 
