3

Extensible Markup Language

In the previous chapter we learnt about HTML (HyperText Markup Language): how to create an HTML document, how to use lists and hypertext, how to improve the appearance of HTML documents using images, and how to create forms.

In this chapter...

Here are the topics we'll cover

What is Extensible Markup Language, and what are its uses?

How is XML Schema superior to the Document Type Definition/Declaration (DTD)?

What makes XML different from HTML?


XML

XML is an acronym for Extensible Markup Language. It is a markup language derived from SGML. It differs from HTML, in which the tag set and the tag semantics are fixed and rigidly defined.

An XML document is written in the markup language to hold structured information. A document here means a data file containing information, and references to data, in various formats and for various purposes, including but not limited to mathematical equations, vector graphics, e-commerce transaction details, object metadata, server APIs, and other structured information.

Like SGML, and unlike HTML, XML is not a single fixed markup language but a language for defining markup languages. XML is therefore really nothing more than a specification, that is, a set of agreed-upon rules on how to create certain kinds of documents.

Here is an example of a simple XML document.
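The original listing is not reproduced in this text, so the following is only a minimal sketch of what such a document might look like. It assumes a file called message.xml with the four child elements named later in the chapter (to, from, heading, body); the names "Asha" and "Ravi" and the message text are invented for illustration. The document is embedded in a short Python script so that it can be parsed and inspected with the standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of message.xml (element names follow the chapter,
# the text values are made up for this sketch).
MESSAGE_XML = """<?xml version="1.0"?>
<message>
  <to>Asha</to>
  <from>Ravi</from>
  <heading>Reminder</heading>
  <body>Submit the assignment by Friday.</body>
</message>"""

# Parse the text into an element tree and look at the root and its children.
root = ET.fromstring(MESSAGE_XML)
print("root element:", root.tag)
for child in root:
    print(child.tag, "=", child.text)
```

Every element has an opening and a closing tag, and the whole document has exactly one root element, message.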

All of its elements are defined in a .dtd file in the next section.


DTD

The Document Type Definition/Document Type Declaration (DTD) describes a model of the structure of the content of an XML document. This model says which elements must be present, which ones are optional, what their attributes are, and how they can be structured in relation to each other.

The Document Type Definition is a description of the content model of a type of document. The document type declaration is a statement in an XML file that identifies the DTD that belongs to the document. HTML's DTDs are fixed in advance, whereas XML allows us to create our own DTDs for our applications.

<!DOCTYPE DTD.name [internal.subset]>

Here is an example of a DTD file that defines the elements of the XML document above, message.xml.

The message element has four child elements: to, from, heading, and body. Lines 2-5 define the to, from, heading, and body elements to be of type #PCDATA.

Notes

PCDATA (parsed character data) is used when only text is allowed inside an element.
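The DTD listing itself is missing from this text. The sketch below reconstructs the declarations that the description implies (one message element with four #PCDATA children) and places them in an internal subset, using the DOCTYPE form shown earlier. The sample text values are assumptions. Python's standard library does not validate against a DTD, so this sketch only reads the element declarations through the expat parser's ElementDeclHandler callback; a validating parser would be needed to actually enforce them.

```python
import xml.parsers.expat

# Assumed DTD, written as an internal subset of the document it describes.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (to, from, heading, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<message>
  <to>Asha</to>
  <from>Ravi</from>
  <heading>Reminder</heading>
  <body>Submit the assignment by Friday.</body>
</message>"""

declared = {}

def on_element_decl(name, model):
    # expat reports each <!ELEMENT ...> declaration once; model is a
    # (type, quantifier, name, children) tuple describing the content model.
    declared[name] = model

parser = xml.parsers.expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(DOCUMENT, True)

# The children of the message content model, in declared order.
children = [child[2] for child in declared["message"][3]]
print("declared elements:", sorted(declared))
print("message must contain, in order:", children)
```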


XML Schema

XML Schema is an XML-based alternative to DTD. It describes the structure of an XML document. The XML Schema language is also referred to as XML Schema Definition (XSD).

An XML Schema defines:

  • Elements and attributes that can appear in a document.
  • Which elements are child elements.
  • The order and the number of child elements.
  • Data types for elements and attributes.
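As an illustration of the points above, here is a small sketch of a schema for the message document. The schema is an assumption written for this example, not a listing from the chapter, and the "sent" attribute is hypothetical, added only to show a non-string data type. Because an XSD is itself XML, the script can read it with an ordinary XML parser; it does not validate anything, since the Python standard library has no XSD validator.

```python
import xml.etree.ElementTree as ET

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

# Assumed schema for the message document (illustrative only).
MESSAGE_XSD = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="message">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="sent" type="xs:date"/>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(MESSAGE_XSD)

# xs:sequence fixes the order of the child elements; each one carries a type.
sequence = schema.find("xs:element/xs:complexType/xs:sequence", NS)
child_types = {el.get("name"): el.get("type") for el in sequence}

# Attributes are declared alongside the sequence, also with a data type.
attribute = schema.find("xs:element/xs:complexType/xs:attribute", NS)

print(child_types)
print(attribute.get("name"), "->", attribute.get("type"))
```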

Successors of DTDs

XML Schemas will be used in most web applications as a replacement for DTDs. Here are some of the reasons:

  • They are extensible to future additions.
  • They are richer and more powerful than DTDs.
  • They support data types and namespaces.
  • They are written in XML.

Object Model

The XML object model is a collection of objects that you use to access and manipulate the data stored in an XML document. The XML document is modelled as a tree in which each element is a node. Objects with various properties and methods represent the tree and its nodes. Each node contains the actual data in the document.
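The tree structure can be made visible with a short sketch. It uses Python's xml.dom.minidom as one possible implementation of such an object model; the sample document and the helper name walk are assumptions for illustration.

```python
from xml.dom import minidom

# A compact sample document (no whitespace between tags, so the tree
# contains only element nodes and the text nodes inside them).
doc = minidom.parseString("<message><to>Asha</to><from>Ravi</from></message>")

def walk(node, depth=0, out=None):
    """Collect (depth, kind, value) for every element and text node."""
    out = [] if out is None else out
    if node.nodeType == node.ELEMENT_NODE:
        out.append((depth, "element", node.tagName))
    elif node.nodeType == node.TEXT_NODE:
        out.append((depth, "text", node.data))
    for child in node.childNodes:
        walk(child, depth + 1, out)
    return out

nodes = walk(doc.documentElement)
for depth, kind, value in nodes:
    print("  " * depth + f"{kind}: {value}")
```

Each element is a node object with properties such as tagName and childNodes, and the text inside an element is a separate child node of that element.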


Presenting and using XML

Presenting XML involves transforming and displaying XML data in a format that is easily readable and visually appealing. This can be achieved by using several technologies:

XSLT (Extensible Stylesheet Language Transformations): It is used to transform XML documents into different formats such as HTML, plain text, or other XML documents. This allows for dynamic and flexible presentation of XML data.

CSS (Cascading Style Sheets): It can be applied to XML documents to style them, much as it is applied to HTML. It can control the appearance of XML content, such as colour, font, and layout. This makes the XML data more visually appealing and easier to read.

JavaScript and DOM manipulation: JavaScript can be used to dynamically manipulate the XML DOM (Document Object Model). This allows for interactive and real-time updates to XML content displayed on a web page.
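To make the idea of transforming XML into HTML concrete, the sketch below does by hand what an XSLT stylesheet would normally describe declaratively. It is not XSLT: the Python standard library has no XSLT processor, so plain ElementTree code stands in for it. The sample message, the function name to_html, and the chosen HTML layout are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

message = ET.fromstring(
    "<message><to>Asha</to><from>Ravi</from>"
    "<heading>Reminder</heading>"
    "<body>Submit the assignment by Friday.</body></message>"
)

def to_html(msg):
    """Build an HTML fragment from the fields of a message element."""
    article = ET.Element("article")
    ET.SubElement(article, "h1").text = msg.findtext("heading")
    meta = ET.SubElement(article, "p", {"class": "meta"})
    meta.text = f"From {msg.findtext('from')} to {msg.findtext('to')}"
    ET.SubElement(article, "p").text = msg.findtext("body")
    # method="html" serialises using HTML rules rather than XML rules.
    return ET.tostring(article, encoding="unicode", method="html")

html = to_html(message)
print(html)
```

The same source data could be turned into plain text or another XML vocabulary by swapping the function, which is the flexibility the XSLT approach is meant to provide.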

Using XML

XML can work behind the scenes to simplify the creation of HTML documents for large websites.

XML can be used to exchange information between organisations and systems.

XML can be used for offloading and reloading of databases. It can be used to store and arrange data in a way that suits your data handling needs.

XML can easily be merged with stylesheets to create almost any desired output.

Difference between HTML and XML.

HTML | XML
-----|-----
It is a markup language. | It is a standard markup language that defines other markup languages.
It is not case sensitive. | It is case sensitive.
It was developed as a presentation language. | It is neither a presentation language nor a programming language.
It has its own predefined tags. | Tags are defined as the programmer needs them; XML is flexible because tags can be defined when needed.
Closing tags are not always necessary. | Closing tags are necessary.
It is used to display data. | It is used to store and exchange data.
It can ignore small errors. | It does not allow errors.
HTML is static in nature. | XML is dynamic in nature.
Tools used for HTML include Visual Studio (VS) Code, Atom, Notepad++, and many more. | Tools used for XML include Oxygen XML, XML Notepad, and many more.
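The points about case sensitivity, closing tags, and error tolerance can be checked with a short sketch. It feeds three small documents to Python's standard XML parser; the helper name is_well_formed and the sample documents are assumptions for illustration. A browser would typically still render comparable HTML, whereas an XML parser rejects the document outright.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "well formed": "<note><to>Asha</to></note>",
    "missing closing tag": "<note><to>Asha</note>",
    "case mismatch": "<note><To>Asha</to></note>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```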

DOM and SAX

There are two essential approaches followed by XML parsers. One is to read the entire document, build a tree, and then allow a program to request any data or element from that tree.

The other approach is to provide a callback interface, so that you can invoke the parser and have predefined methods invoked whenever an element is encountered. Because this approach processes the document piece by piece as it is read, it is very useful when reading a large document.

DOM

The Document Object Model (DOM) is a platform- and language-independent standard object model for representing HTML, XML, and related formats.

The DOM is required by JavaScript scripts that wish to inspect or modify a web page dynamically; that is, the Document Object Model is the way JavaScript sees its containing HTML page and browser state.

The DOM is a W3C (World Wide Web Consortium) standard for accessing documents such as XML and HTML. The W3C Document Object Model (DOM) is a platform- and language-neutral interface that allows programs and scripts to dynamically access and update the content, structure, and style of a document.

Here is an XML DOM example.
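The original example is not included in this text. As a stand-in, the sketch below uses Python's xml.dom.minidom, which implements a subset of the W3C DOM interface, to read and then update a document. The sample message and the added ps element are assumptions for illustration.

```python
from xml.dom import minidom

# Build the whole tree in memory first, as a DOM parser does.
doc = minidom.parseString(
    "<message><to>Asha</to><from>Ravi</from>"
    "<heading>Reminder</heading><body>See you at 5.</body></message>"
)

# Access: find an element by tag name and read its text node.
heading = doc.getElementsByTagName("heading")[0]
original = heading.firstChild.data

# Update content: change the text of an existing node.
heading.firstChild.data = "Updated reminder"

# Update structure: create a new element and attach it to the root.
ps = doc.createElement("ps")
ps.appendChild(doc.createTextNode("Bring your notes."))
doc.documentElement.appendChild(ps)

updated = doc.documentElement.toxml()
print(original)
print(updated)
```

Because the complete tree is held in memory, any node can be visited or changed in any order, at the cost of loading the whole document first.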

SAX

The Simple API for XML (SAX) is a serial-access parser API for XML. SAX provides a mechanism for reading data from an XML document. It is a popular alternative to the Document Object Model (DOM).

SAX Parser

A parser that implements SAX (i.e. a SAX parser) functions as a stream parser with an event-driven API. The user defines a number of callback methods that are called when events occur during parsing. SAX events include text nodes, element nodes, processing instructions, and comments.

The following is an example that processes an XML document.
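The original listing is not included in this text, so here is a minimal sketch using Python's xml.sax module. The handler class name MessageHandler and the sample document are assumptions for illustration. The parser calls the handler's methods as it meets each start tag, run of text, and end tag; no tree is built.

```python
import xml.sax

class MessageHandler(xml.sax.ContentHandler):
    """Records the events reported by the parser, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # Text may arrive in several pieces; whitespace-only pieces are skipped.
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        self.events.append(("end", name))

handler = MessageHandler()
xml.sax.parseString(
    b"<message><to>Asha</to><from>Ravi</from></message>", handler
)

for kind, value in handler.events:
    print(kind, value)
```

The handler only ever sees one event at a time, which is why the memory needed does not grow with the total size of the document.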

Advantages

SAX parsers have certain benefits over DOM-style parsers.

The quantity of memory that a SAX parser needs in order to function is typically much smaller than that of a DOM parser.

The memory footprint of a SAX parser is based on the maximum depth of the XML file/tree and the maximum data stored in XML attributes on a single XML element.

