To use a dictionary with Kirrkirr, you must provide at least one XSLT stylesheet file that can render dictionary entries in HTML (the format used by web browsers and for Kirrkirr entry text display). Our dictionaries usually provide several, so you can view dictionary entries in different ways. Most (recent) books on XML contain some coverage of XSLT, and there are several books just on XSLT (if you're looking to buy one, a good one is Michael Kay's XSLT: Programmer's Reference). There are also some introductions to XSLT on the web (such as w3schools). XSLT specifies rules to transduce tree structures. It's actually a very linguist-friendly notion. Transformational grammar might have turned out better if tree transductions XSLT-style were the norm in the 1960s.
The XML file passed to your XSLT stylesheet is basically just a fragment of your dictionary (recall the assumption that the dictionary must be represented in XML as a list of words). That is, the XML file will have the complete structure of the dictionary, but it will be a dictionary with just one word left in it in L1 mode, or perhaps a few entries if in L2 mode or if exporting dictionary data to HTML. So, the XSLT file should be built to work with your dictionary structure (indeed, it should work if run on your entire dictionary).
The main Kirrkirr-specific thing you need to know is how to get links
that work between dictionary entries, so that you can click on a word
in one entry (such as a word listed as a synonym) and move to that
entry. Essentially one wants to build up an "A HREF" element, just like
when writing an HTML page, but a particular encoding is used for a
word. When the hyperlink is clicked, Kirrkirr intercepts the link
request, creates the necessary HTML file, and updates the other panels
to display the requested word. The file to request for a word, say
xyzzy, is @xyzzy@uniquifier.html. If there is no uniquifier (i.e.,
there aren't homophones to deal with), it's just @xyzzy@.html. (Again,
the xyzzy part should be in UTF-8 encoding.) Note that such links
cannot work outside a running Kirrkirr, because HTML files for each
entry do not exist permanently on disk; dealing with this for exported
HTML files is discussed below.
To be very concrete, if links to other words are in a LINKTO element
in your XML, and the uniquifier is stored in an HNUM attribute (when
needed), then the following XSLT will produce working links in the
HTML (within Kirrkirr):
<xsl:template match="LINKTO">
  <xsl:choose>
    <xsl:when test='@HNUM'>
      <A>
        <xsl:attribute name="HREF">@<xsl:value-of select="."/>@<xsl:value-of select="@HNUM"/>.html</xsl:attribute>
        <xsl:apply-templates/>
      </A>
    </xsl:when>
    <xsl:otherwise>
      <A>
        <xsl:attribute name="HREF">@<xsl:value-of select="."/>@.html</xsl:attribute>
        <xsl:apply-templates/>
      </A>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
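The same links can probably be produced more compactly with attribute value templates, a standard XSLT 1.0 feature. The stylesheet below is only a sketch and has not been tested inside Kirrkirr. It relies on {@HNUM} evaluating to the empty string when the attribute is absent, so a single rule covers both link forms.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Sketch: one rule for both cases. {.} is the text of the LINKTO
       element; {@HNUM} is empty when there is no HNUM attribute,
       which gives the @word@.html form. -->
  <xsl:template match="LINKTO">
    <A HREF="@{.}@{@HNUM}.html">
      <xsl:apply-templates/>
    </A>
  </xsl:template>

</xsl:stylesheet>
```

With this rule, <LINKTO HNUM="2">xyzzy</LINKTO> should come out as <A HREF="@xyzzy@2.html">xyzzy</A>, and <LINKTO>xyzzy</LINKTO> as <A HREF="@xyzzy@.html">xyzzy</A>. The longer xsl:choose form is easier to extend with extra conditions.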
As an advanced feature, the XML files passed through to the XSLT stylesheet include a couple of extra elements, which you can regard as parameters that might allow you to produce better HTML output. These are added directly under the root element of your dictionary. (Advanced note: this means that the XML files passed to the XSLT file will not strictly satisfy any DTD that you may have defined for the dictionary.) This information can be used by the XSLT file to provide links to pictures and sounds that work within HTML.
These elements are added automatically to XML files, based on the path to the dictionary that you are currently using in Kirrkirr:
<DIR>
  <IDIR>file:C:\Kirrkirr\distrib\MiniDict\images\</IDIR>
  <SDIR>file:C:\Kirrkirr\distrib\MiniDict\audio\</SDIR>
</DIR>
The full path names here are particularly useful when the user exports an HTML file from Kirrkirr, since one can make links to pictures and sounds that will work (assuming that no one moves any files subsequently!).
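As a sketch of how these directory elements might be used, the templates below prefix a file name with the image or sound directory. The IMAGE and SOUND element names and the DICTIONARY root are assumptions made purely for illustration. Substitute whatever your dictionary uses to hold picture and sound file names.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical: <IMAGE>dog.jpg</IMAGE> holds a picture file name.
       The SRC is the IDIR path followed by that file name. -->
  <xsl:template match="IMAGE">
    <IMG SRC="{/DICTIONARY/DIR/IDIR}{.}" ALT="{.}"/>
  </xsl:template>

  <!-- Hypothetical: <SOUND>dog.wav</SOUND> holds a sound file name.
       The link target is the SDIR path followed by that file name. -->
  <xsl:template match="SOUND">
    <A HREF="{/DICTIONARY/DIR/SDIR}{.}">[listen]</A>
  </xsl:template>

  <!-- Keep the directory values themselves out of the visible output. -->
  <xsl:template match="DIR"/>

</xsl:stylesheet>
```

Because IDIR and SDIR already end in a path separator in the example above, the file name is simply appended.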
When the user is exporting a portion of the dictionary as an HTML file,
a USERFILE element appears in the XML file. In this situation, one
cannot successfully make links to words outside the wordlist (because
HTML files for all words are not permanently materialized). You can
test for the presence of this element to disable the generation of
links (at least to words not in the wordlist).
For instance, one can add a clause such as the following to the example
above, which will always disable link generation when the USERFILE
element is present (this example assumes that your top-level element
is called DICTIONARY). It must come before the existing xsl:when,
because xsl:choose uses the first clause whose test succeeds:
<xsl:when test="/DICTIONARY/DIR/USERFILE">
  <xsl:apply-templates/>
</xsl:when>
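Putting the pieces together, the complete LINKTO rule with the export test added might look like the sketch below. It assumes, as in the examples above, a DICTIONARY root element, LINKTO elements, and an optional HNUM attribute. The USERFILE test is written as the first clause so that it takes precedence over the two link-producing clauses.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="LINKTO">
    <xsl:choose>
      <!-- Exporting to an HTML file: emit the word as plain text. -->
      <xsl:when test="/DICTIONARY/DIR/USERFILE">
        <xsl:apply-templates/>
      </xsl:when>
      <!-- Inside Kirrkirr, uniquifier present: @word@HNUM.html -->
      <xsl:when test="@HNUM">
        <A>
          <xsl:attribute name="HREF">@<xsl:value-of select="."/>@<xsl:value-of select="@HNUM"/>.html</xsl:attribute>
          <xsl:apply-templates/>
        </A>
      </xsl:when>
      <!-- Inside Kirrkirr, no uniquifier: @word@.html -->
      <xsl:otherwise>
        <A>
          <xsl:attribute name="HREF">@<xsl:value-of select="."/>@.html</xsl:attribute>
          <xsl:apply-templates/>
        </A>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```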
If your XSLT file descends the hierarchy of XML elements in the most
straightforward way (by implicitly or explicitly having the top level
node do <xsl:apply-templates/>), then you will probably want to
suppress printing of the contents of the DIR and GLOSS elements in your
HTML output (since, if you just descend the tree, by default element
contents are printed). You can do this with an empty transformation
body like this (assuming that /DICTIONARY/ENTRY is your
DICTIONARY_ENTRY_XPATH):
<xsl:template match="DIR|GLOSS"> </xsl:template>
Alternatively, you can avoid having them printed by having the top level XSLT transformation only apply to your entry nodes like this:
<xsl:template match="/">
  <HTML>
    <BODY>
      <xsl:apply-templates select="DICTIONARY/ENTRY"/>
    </BODY>
  </HTML>
</xsl:template>
Here is a simple XSLT file that will work with the tinydict.xml file
introduced earlier: tinydict.xsl. (It should load in your web browser.)
Finally, this link contains a subset of tinydict.xml with instructions
to render it using tinydict.xsl. As a result, if you are using a modern
web browser, you should see a formatted dictionary entry rendered by
the XSLT: tinyrend.xml.
For more complex and complete examples, look at some of the XSLT stylesheets that come with the Kirrkirr download. The MiniWrl ones are particularly complex. That is, they're not a terribly good place to start, but they do illustrate many of the things that you can do in XSLT.
It can take quite a while to get an XSLT file both syntactically correct and doing what you want. Again, a web browser is commonly the best way to test an XSLT file, but you need to make sure that it is a modern browser that supports standard XSLT 1.0 (good candidates are Internet Explorer version 6+, Mozilla, or Netscape 7+). Older versions of IE supported a very different non-standard dialect of XSLT. Kirrkirr uses Xalan as its XSLT processor, but it should be sufficient just to use standard XSLT 1.0 processing. To have the web browser render an XML file, put a line at the top of it saying which XSL file to use, like this:
<?xml-stylesheet type="text/xsl" href="tinydictxsl.xml"?>
(Note, we usually use a .xsl extension for our XSLT files, but web
servers may not recognize this extension as an XML file, and so it may
be safer to test things using a .xml extension, as here - certainly
for Mozilla.)
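A browser test file might then look like the sketch below. The ENTRY, HEADWORD, and LINKTO contents are invented for illustration. The DIR block is added by hand, on the assumption that you want to imitate the elements Kirrkirr supplies at run time. Adjust both to match your own dictionary structure.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="tinydictxsl.xml"?>
<DICTIONARY>
  <!-- Added by hand to imitate the elements Kirrkirr inserts. -->
  <DIR>
    <IDIR>file:C:\Kirrkirr\distrib\MiniDict\images\</IDIR>
    <SDIR>file:C:\Kirrkirr\distrib\MiniDict\audio\</SDIR>
  </DIR>
  <!-- Hypothetical entry structure; use your own element names. -->
  <ENTRY>
    <HEADWORD>xyzzy</HEADWORD>
    <LINKTO HNUM="2">plugh</LINKTO>
  </ENTRY>
</DICTIONARY>
```

Links of the @word@.html form will not resolve in the browser, since only Kirrkirr intercepts them, but the layout of the entry can still be checked this way.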
You can also test out XSL transformations from within Kirrkirr. Use
the Tools | XSL transformer menu option, and browse to provide
appropriate filenames for the XML input, XSL file, and output file.
You will then need to load the output file in a web browser to
examine it.
Once you have one or more XSLT files that you are at least moderately happy with, then you need to tell Kirrkirr about them. This is one part of defining a DictionaryInfo file for your dictionary, which we turn to next.
Proceed to Building a DictionaryInfo XML file
http://nlp.stanford.edu/kirrkirr/dictionaries