Reverse detail from Kakelbont MS 1, a fifteenth-century French Psalter. This image is in the public domain. Daniel Paul O'Donnell


Using normalize-space to fix Oxygen "pretty print" spacing problems

Posted: Jun 30, 2016 17:06;
Last Modified: Jun 30, 2016 17:06

---

This is a straightforward thing for people who know what they are doing. It is only a reminder to me, who didn’t.

The journals I publish using TEI XML use the tei:figDesc element to populate the alt and title attributes of html:img.

Until today, this resulted in very odd-looking tooltips, with the text spread all over the place.

The problem was caused by the OxygenXML editor’s pretty-print feature and the way its line breaks and indentation were carried over into the title and alt attributes. I was extracting the contents of the element and putting them on the attribute without stripping the white space, so the text showed up with strange spacing and line breaks.

The solution was very easy: surround the content-call with normalize-space(). Then everything worked fine. Here’s the relevant bit of XSLT:

<xsl:variable name="title">Click for full-sized image<xsl:if test="../tei:figDesc">
                        of <xsl:value-of select="../tei:figDesc"/></xsl:if></xsl:variable>
                <a href="{@url}" style="text-decoration:none;border:0px;">
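                    <!-- normalize-space() on both attributes, so pretty-print white space reaches neither alt nor title -->
                    <img alt="{normalize-space(../tei:figDesc)}" src="{@url}" title="{normalize-space($title)}">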
                        <xsl:call-template name="attcore"/>
                    </img>
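
To see what normalize-space() does on its own, here is a minimal, self-contained sketch in XSLT 1.0. The $desc variable and its text are made up for illustration and stand in for a pretty-printed tei:figDesc. Any well-formed XML file will do as the input document, since it is never read.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="text"/>

    <!-- Hypothetical stand-in for a pretty-printed tei:figDesc -->
    <xsl:variable name="desc">
        A detail from the
            manuscript, showing
        a decorated initial.
    </xsl:variable>

    <xsl:template match="/">
        <!-- Raw string value: keeps the line breaks and indentation -->
        <xsl:text>[</xsl:text>
        <xsl:value-of select="$desc"/>
        <xsl:text>]&#10;</xsl:text>
        <!-- Normalized: leading and trailing white space trimmed,
             each internal run collapsed to a single space -->
        <xsl:text>[</xsl:text>
        <xsl:value-of select="normalize-space($desc)"/>
        <xsl:text>]&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

The first bracketed line of output still contains the returns and indentation. The second reads [A detail from the manuscript, showing a decorated initial.], which is the form you want in an attribute.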
----  

Extracting a catalogue of element names from a collection of XML documents using XSLT 2.0

Posted: Sep 15, 2011 17:09;
Last Modified: May 23, 2012 19:05

---

We are trying to build a single stylesheet to work with the documents of two independent journals. In order to get a sense of the work involved, we wanted to create a catalogue of all elements used in the published articles. This means loading whole directories of files as input and then extracting and sorting the elements across all of them.

Here’s the stylesheet that did it for us. It is probably not maximally optimised, but it currently does what we need. Any suggestions for improvements would be gratefully received.

Some notes:

  1. Our goal was to pre-build some templates for use in a stylesheet, so we formatted the element names as XSL templates.
  2. Although you need to use this sheet with an input document, the input document is not actually transformed (the files we are extracting the element names from are loaded using the collection() function). So it doesn’t matter what the input document is, as long as it is well-formed XML (we used the stylesheet itself).

<?xml version="1.0"?> 
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<!-- this output is because we are going to construct 
ready-made templates for each element -->
    <xsl:output method="text"/>

<!-- for pretty printing -->
    <xsl:variable name="newline">
        <xsl:text> 
        </xsl:text>
    </xsl:variable>

<!-- Load the files 
in the relevant directories -->
    <xsl:variable name="allFiles"
    select="collection(iri-to-uri('file:///some/path/to/the/directories?select=*.xml;recurse=yes'))"/>

<!-- Dump their content into a single big pile -->
    <xsl:variable name="everything">
        <xsl:copy-of select="$allFiles"/>
    </xsl:variable>

<!-- Build a key for all elements using their name -->
    <xsl:key name="elements" match="*" use="name()"/>

<!-- Match the root node of the input document
(since the files we are actually working on have been 
loaded using the collection() function, nothing 
is actually going to happen to this node) -->
    <xsl:template match="/">

       <!-- this is information required to turn the output into an 
              XSL stylesheet itself -->
        <xsl:text>&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            version="1.0"></xsl:text>
        <xsl:value-of select="$newline"/>
        <xsl:text>&lt;!--Summary of Elements --&gt;</xsl:text>
        <xsl:value-of select="$newline"/>
        <xsl:value-of select="$newline"/>

       <!-- this invokes the collection of all elements in all the files
       in the directory for further processing -->
        <xsl:for-each select="$everything">

           <!-- This selects only the first element with each name in the key (Muenchian grouping) -->
            <xsl:for-each 
            select="//*[generate-id(.)=generate-id(key('elements',name())[1])]">

               <!-- sort them -->
                <xsl:sort select="name()"/>

                <xsl:for-each select="key('elements', name())">

                   <!-- this makes sure that only the first instance 
                    of each element name is outputted -->
                    <xsl:if test="position()=1">
                        <xsl:text>&lt;xsl:template match="</xsl:text>
                        <xsl:value-of select="name()"/>
                        <xsl:text>"> </xsl:text>
                        <xsl:value-of select="$newline"/>
                        <xsl:text>&lt;!--</xsl:text>
                        <!-- this counts all occurrences of the element -->
                        <xsl:value-of select="count(//*[name()=name(current())])"/>
                        <xsl:text> occurrences</xsl:text>
                        <xsl:text>--&gt;</xsl:text>
                        <xsl:value-of select="$newline"/>
                        <xsl:text>&lt;/xsl:template></xsl:text>
                        <xsl:value-of select="$newline"/>
                        <xsl:value-of select="$newline"/>
                    </xsl:if>
                </xsl:for-each>
            </xsl:for-each>
        </xsl:for-each>
        <xsl:value-of select="$newline"/>
        <xsl:text>&lt;/xsl:stylesheet></xsl:text>
    </xsl:template>
</xsl:stylesheet>
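
Since this is XSLT 2.0 anyway, the key-based grouping could probably be replaced with xsl:for-each-group. What follows is a minimal sketch of that idea, not the stylesheet we actually ran. It assumes a Saxon processor (the ?select and recurse parameters on the collection URI are Saxon-specific), reuses the same placeholder path, and prints a plain list of names and counts instead of ready-made templates.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="text"/>

    <!-- Same placeholder path as in the stylesheet above -->
    <xsl:variable name="allFiles"
        select="collection(iri-to-uri('file:///some/path/to/the/directories?select=*.xml;recurse=yes'))"/>

    <!-- As before, the input document itself is never read -->
    <xsl:template match="/">
        <!-- One group per distinct element name, across all loaded files -->
        <xsl:for-each-group select="$allFiles//*" group-by="name()">
            <!-- Alphabetical order by element name -->
            <xsl:sort select="current-grouping-key()"/>
            <xsl:value-of select="current-grouping-key()"/>
            <xsl:text>: </xsl:text>
            <!-- Size of the group = number of occurrences of that name -->
            <xsl:value-of select="count(current-group())"/>
            <xsl:text> occurrences&#10;</xsl:text>
        </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>
```

This drops both the intermediate copy of every file and the repeated count(//*...) scan, since current-group() already holds every element with a given name. The output lines could be swapped for the template-building xsl:text calls if ready-made templates are still wanted.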
----  
