My name is Roman Bleier and in this video
I'm going to show how to access the XML data of a Word document. So first of
all let's create a new Word document. Here we have our docx file and let's give it
a name, XML languages in this case. I open up my Word document, there we go.
And now I would like to add a little bit of text. First let's give it a heading,
XML languages. Then set this to Heading 1. Next I add a bit more text, let's make a
list of XML languages: TEI, XHTML, SVG, and RDF.
Now highlight them all and make an unordered list or an ordered list. So I save the Word document and I close it. Now here we have our Word
document. Now, a Word document is basically a zip file,
but it has a .docx extension, and that's the reason why
Microsoft Word opens it as the default program. If we change the file extension
to zip, we first get a message asking if we really want to do this, and we say yes. So if we change the file extension to zip, we can open up a
Microsoft Word file like a zip archive. And what we get is a zip archive that
we can extract and we get a folder full of XML documents. So if we open up this
folder now we see there are subfolders in here, and these subfolders contain XML
files. So let's open up the word subfolder and in here we have a bunch of
XML documents and we want to open up the file document.xml, which contains
basically the data we are looking for. And so this is the content data here. I
open the file with a very simple text editor, Notepad, and it
displays the XML, basically it's one very, very long line of text. But you can
see at the very end there is the text that we added in. Here we have XHTML,
then there's a lot of markup, this is called XML markup, Word-specific XML
markup in this case. And here's TEI, and as you can see TEI is enclosed in
these XML elements. We can better see the structure of
the XML document when we open the file with an XML editor; in my case I use oXygen.
I open it now by double clicking but you can also open the file by drag and drop,
you can just simply drag and drop the XML file into oXygen and it will open up. And in oXygen you can see that the
XML syntax is highlighted, and here we have our heading, XML
languages, and we can make out that this is a heading
by the markup that is surrounding it. So here we have the paragraph style,
and there is our text that is encoded with that paragraph style: TEI, then we have here XHTML, SVG, and finally our RDF. So there's our text, and the markup surrounding it
basically tells Word how to display the text. So if we close this now,
let's not save anything, since we didn't make any changes. We can
also simply close this, and also close the Explorer. And because Microsoft Word
documents are simply zip files, if we now change the zip extension back to
docx we again get a Word document. And if you open it up there we have our text
again. It can be opened up like any Word document.
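
The same steps can also be done in code, without renaming the file at all. What follows is only a minimal sketch in Python, not part of the video itself. It assumes the standard library modules zipfile and xml.etree.ElementTree, and it assumes the usual WordprocessingML namespace for the elements in word/document.xml. The function names list_parts and read_paragraphs are made up for this illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the Word-specific markup in word/document.xml (assumed here
# to be the standard WordprocessingML main namespace).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def list_parts(docx_path):
    """Return the names of all files stored inside the .docx zip archive."""
    with zipfile.ZipFile(docx_path) as archive:
        return archive.namelist()


def read_paragraphs(docx_path):
    """Return a (style, text) pair for every paragraph in word/document.xml.

    style is the value of the paragraph style element if one is present,
    otherwise None. text is the concatenated content of the text elements.
    """
    with zipfile.ZipFile(docx_path) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # The paragraph style, if any, sits in w:pPr/w:pStyle as a w:val attribute.
        style_el = paragraph.find("w:pPr/w:pStyle", NS)
        style = style_el.get(f"{{{W_NS}}}val") if style_el is not None else None
        # The visible text is spread over one or more w:t elements.
        text = "".join(t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t"))
        paragraphs.append((style, text))
    return paragraphs
```

Calling list_parts on the docx file from the video should show word/document.xml among the other files in the archive, and read_paragraphs should return the heading and the four list items together with whatever style names Word assigned to them. The exact style names depend on the Word version and language, so they are not spelled out here.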