Today, XML is on everyone’s lips. On the net, it has become the ultimate solution to all problems, the language of the 21st century.
But let’s face it: XML is still something very abstract, and its purpose is sometimes hard to grasp!
What XML is
XML is a markup language, standardized by the World Wide Web Consortium (the W3C, which also standardized HTML), that defines a set of syntax rules for the structured representation of information.
Anyone can create a language from the syntax rules dictated by this standard. That’s why we say that XML is a metalanguage: it is a language for defining other languages.
XML syntax rules
– XML is a markup language: a tag begins with the sign < and ends with >. Every opening tag must have a matching closing tag. The closing tag begins with </.
o <tag></tag>
– A tag can contain text, other tags, both or nothing.
o <tag><supertag>Text</supertag> and more text</tag>
– Tags must not overlap. This is prohibited:
o <tag1><tag2>Text text</tag1></tag2>
– When there is nothing between the opening and closing tags, you can write a shortened form:
o <machine/>
– Elements may have attributes, whose values are delimited by ' or ".
o <mabalise attribute1="Hello" attribute2="all" attribute3="the" attribute4="world"/>
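The rules above can be checked mechanically. Here is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree parser is an acceptable judge of well-formedness; the helper name is_well_formed and the sample labels are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for unclosed tags, overlapping tags, unquoted attributes, etc.
        return False


# The examples from the list above, keyed by a descriptive label.
samples = {
    "paired tags": "<tag></tag>",
    "mixed content": "<tag><supertag>Text</supertag> and more text</tag>",
    "overlapping tags": "<tag1><tag2>Text text</tag1></tag2>",
    "shortened form": "<machine/>",
    "attributes": """<mabalise attribute1="Hello" attribute2='all'/>""",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the overlapping example should be rejected; the four others follow the rules.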
The XML galaxy
On its own, XML is useless! What is actually useful, and interesting, are all the languages that revolve around it. Together they form the XML galaxy.
Those with very distant orbits do not interest us here: they are languages dedicated to particular fields, such as CML, which is used to describe chemical compounds, or SVG, for 2D vector drawing.
Namespaces
The namespace mechanism is a rather abstract notion. A namespace is a string (in practice, an internet address) that identifies the language to which a tag or an attribute belongs.
A namespace must be declared using the reserved xmlns attribute. For example, for XHTML, we write:
<html xmlns="http://www.w3.org/1999/xhtml">
To declare a namespace and a prefix, we use the following syntax, always with the xmlns attribute:
xmlns:prefix="namespace-address"
We have total freedom to choose the prefix. Then, we use it like this on tags and attributes:
<prefix:tag prefix:attribute="trick pif paf poof puff"/>
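To see what a namespace-aware parser does with these declarations, here is a small sketch using Python's standard xml.etree.ElementTree. The demo prefix, the note element and the address http://example.com/ns/demo are hypothetical, invented for this example; only the XHTML address comes from the text above.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
EXAMPLE_NS = "http://example.com/ns/demo"  # hypothetical namespace address

document = f"""
<html xmlns="{XHTML_NS}" xmlns:demo="{EXAMPLE_NS}">
  <body>
    <demo:note demo:level="high">Hello</demo:note>
  </body>
</html>
"""

root = ET.fromstring(document.strip())

# ElementTree replaces each prefix with "{namespace-address}local-name".
print(root.tag)

# The prefixes used for the lookup ("x", "d") need not match the document's:
# only the namespace address identifies the language.
note = root.find("x:body/d:note", {"x": XHTML_NS, "d": EXAMPLE_NS})
print(note.tag)
print(note.attrib)
```

The lookup uses prefixes that differ from those in the document, which shows that the prefix is a free choice and only the address matters.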
XQuery: XML as a database
There is another language for extracting information from an XML document: XQuery.
Its syntax is reminiscent of SQL, and it uses XPath both to navigate the document and to extract data from it.
Combined with the XQuery Update Facility extension, it also allows the other operations that you may want to do with SQL: data insertion, update, deletion, etc.
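To give a feel for the SQL-like shape of a query, the sketch below imitates one in Python. It does not run XQuery: the Python standard library has no XQuery engine, so the for/where/order by/return steps are reproduced by hand with xml.etree.ElementTree and its limited XPath support. The bookstore data and the function name expensive_titles are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this illustration.
BOOKS = """
<bookstore>
  <book><title>Learning XML</title><price>39.95</price></book>
  <book><title>Everyday Italian</title><price>30.00</price></book>
  <book><title>XQuery Kick Start</title><price>49.99</price></book>
</bookstore>
"""

# The XQuery expression this sketch imitates (not executed here):
#   for $b in doc("books.xml")//book
#   where $b/price > 30
#   order by $b/title
#   return data($b/title)


def expensive_titles(xml_text: str, min_price: float) -> list:
    """Return the titles of books costing more than min_price, sorted."""
    root = ET.fromstring(xml_text.strip())
    books = root.findall(".//book")                      # for $b in //book
    selected = [b for b in books
                if float(b.findtext("price")) > min_price]  # where
    selected.sort(key=lambda b: b.findtext("title"))     # order by
    return [b.findtext("title") for b in selected]       # return


print(expensive_titles(BOOKS, 30))
```

A real XQuery processor would evaluate the commented expression directly; the Python version only mirrors its structure step by step.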
DTDs and schemas
DTDs and schemas have the same goal: to define the rules for writing a specific kind of XML document (very important in industry, for example). That is, in addition to the basic XML syntax rules, DTDs and schemas add constraints on the allowed elements, the possible values of an attribute, their order of appearance, etc.
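The kind of constraint involved can be illustrated with a toy checker. This is not a DTD or schema validator (the Python standard library does not include one); it hard-codes, for a hypothetical note document, the three sorts of rules mentioned above. The element names, the priority attribute and the function check_note are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Rules a DTD could state for this hypothetical document type:
#   <!ELEMENT note (to, from, body)>
#   <!ATTLIST note priority (low|high) "low">
EXPECTED_CHILDREN = ["to", "from", "body"]  # allowed elements, in order
ALLOWED_PRIORITIES = {"low", "high"}         # possible attribute values


def check_note(xml_text: str) -> list:
    """Return a list of rule violations (empty if the document conforms)."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "note":
        problems.append(f"root element is <{root.tag}>, expected <note>")
    children = [child.tag for child in root]
    if children != EXPECTED_CHILDREN:
        problems.append(f"children are {children}, expected {EXPECTED_CHILDREN}")
    priority = root.get("priority", "low")  # "low" is the declared default
    if priority not in ALLOWED_PRIORITIES:
        problems.append(f"priority is {priority!r}, not in {sorted(ALLOWED_PRIORITIES)}")
    return problems


good = '<note priority="high"><to>Ann</to><from>Bob</from><body>Hi</body></note>'
bad = '<note priority="urgent"><from>Bob</from><to>Ann</to><body>Hi</body></note>'

print(check_note(good))
print(check_note(bad))
```

Both sample documents are well-formed XML; only the second breaks the additional rules (wrong order of elements, attribute value outside the allowed list).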
DOM
DOM (Document Object Model) is an API (that is, a programming interface) standardized by the W3C, which builds the XML document in memory in the form of a tree.
Once this tree is built in memory, you can access any element or attribute, and you can also change their values, create elements and attributes, clone others, delete some, etc. In short, with DOM, it is easy to manipulate an XML document, and this in different languages, both on the client and the server side.
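As an illustration of these operations, here is a short sketch using xml.dom.minidom, the lightweight DOM implementation shipped with Python. The library document and its element and attribute names are hypothetical.

```python
from xml.dom import minidom

# Build the in-memory tree from a small, hypothetical document.
doc = minidom.parseString(
    '<library><book id="b1"><title>Old title</title></book></library>'
)

# Access an element and one of its children.
book = doc.getElementsByTagName("book")[0]
title = book.getElementsByTagName("title")[0]

# Change values: the text lives in a text node under <title>.
title.firstChild.data = "New title"
book.setAttribute("id", "b42")

# Create a new element with some text and attach it to <book>.
author = doc.createElement("author")
author.appendChild(doc.createTextNode("Anonymous"))
book.appendChild(author)

# Clone <book> with all its descendants and add the copy to the root.
copy = book.cloneNode(True)
copy.setAttribute("id", "b43")
doc.documentElement.appendChild(copy)

# Delete the <author> element from the original <book> only.
book.removeChild(author)

# Serialize the modified tree back to text.
print(doc.documentElement.toxml())
```

The same method names (getElementsByTagName, createElement, appendChild, cloneNode, removeChild) come from the W3C DOM interfaces, which is why similar code can be written in other languages that implement DOM.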