Advanced topics (fonts, splashscreen, ...)

Fonts and Character Sets

As a Java program, Kirrkirr uses Unicode internally, so in theory you should be able to work with characters from any language whose script is defined in Unicode. In practice, things can get messier than that.

Files

Kirrkirr writes intermediate files in UTF-8, which should work with any Unicode characters. Your dictionary need not be in UTF-8, but it should correctly identify its character set in the XML header.
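As a rough illustration of why the declared character set matters, the sketch below encodes a one-element document in ISO-8859-1 and parses it with Java's standard XML parser, which picks up the encoding from the XML header. This is not Kirrkirr's own loading code; the class name, the method name and the ENTRY element are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class EncodingHeaderDemo {
    /** Encodes xml in the given charset, parses the bytes, and returns the root element's text. */
    static String rootText(String xml, String charsetName) {
        byte[] bytes = xml.getBytes(Charset.forName(charsetName));
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The parser is given raw bytes, so it relies on the XML header to decode them.
            Document doc = builder.parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the sample document", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical entry; \u00e9 is e-acute, which is a single byte in ISO-8859-1.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><ENTRY>caf\u00e9</ENTRY>";
        System.out.println(rootText(xml, "ISO-8859-1"));
    }
}
```

If the header named a different character set from the one actually used to write the file, the parser would either fail or decode the accented character incorrectly.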

Fonts

This is where most of the problems arise in practice. Java needs to be able to find a font that can display your characters. Depending on what fonts are available on your system, and whether Java can find them (with or without hints from Kirrkirr), you may or may not see the correct characters. If Java cannot find a glyph for a character, you will typically see a rectangular box in its place.
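One way to investigate such problems outside Kirrkirr is to ask Java directly whether a given font covers your text. The sketch below uses the standard java.awt.Font.canDisplayUpTo method; the class name, the helper method and the sample string are assumptions made for illustration, and nothing here is taken from Kirrkirr's code.

```java
import java.awt.Font;

class FontCoverageCheck {
    /** Turns the index returned by Font.canDisplayUpTo into a readable message. */
    static String describe(String fontName, String text, int firstBad) {
        if (firstBad < 0) {
            return fontName + " can display all of: " + text;
        }
        // firstBad is the index of the first character the font has no glyph for.
        int codePoint = text.codePointAt(firstBad);
        return fontName + " has no glyph for U+" + String.format("%04X", codePoint)
                + " at index " + firstBad;
    }

    public static void main(String[] args) {
        // Defaults are illustrative: the logical font "Dialog" and three Japanese characters.
        String fontName = args.length > 0 ? args[0] : "Dialog";
        String sample = args.length > 1 ? args[1] : "\u65e5\u672c\u8a9e";
        Font font = new Font(fontName, Font.PLAIN, 12);
        // canDisplayUpTo returns -1 when every character can be displayed.
        System.out.println(describe(fontName, sample, font.canDisplayUpTo(sample)));
    }
}
```

Note that if the named font is not installed, Java silently substitutes a default font, so the answer then describes that fallback rather than the font you asked for.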

Is there an appropriate font on my system?

Kirrkirr doesn't ship with fonts for miscellaneous alphabets. There needs to be something appropriate already installed on your system. If you have a recent operating system (e.g., Windows XP or Mac OS X), then you already have fonts that can display most of the major character sets (e.g., Arabic and Chinese), but probably not Inuktitut syllabics.

Can Java find my fonts?

If your fonts are in the standard font location for your system, then the answer should be yes, but for some systems, some fiddling with font.properties could be needed.

Will Kirrkirr/Java know the right font to use?

It may well not do this automatically, if you are running in an English locale using the default font.properties. There are four ways that you can attempt to fix this:

1. Run Kirrkirr with Java 1.5 or later. When run with recent versions of Java, the JDK seems to automatically find and select the right fonts (if present) and you don't need to do anything further. Unfortunately, the installers on this site that include a JVM still currently include a Java 1.4 JVM, so you'll need to download and install a more recent JVM by yourself, and then install the version of Kirrkirr which doesn't include a JVM.
2. Explicitly give information to Kirrkirr about which fonts to use. Kirrkirr knows about three properties that it will use as font names:
   - kirrkirr.wordFont - This font is used to display words in lists, networks, etc.
   - kirrkirr.textFont - This font is used to display running text.
   - kirrkirr.interfaceFont - This font is used for some interface elements, where the font is explicitly specified in Kirrkirr code. However, many other interface elements just use whatever font is provided as the default by your implementation's LookAndFeel.
   You need to set these properties by hand by editing the kirrkirr.properties file in the directory in which Kirrkirr was installed (in a future version we'll probably make setting these available as a Preferences tab). Their value should be the name of a font on your system, for example:
   kirrkirr.wordFont=Arial Unicode MS
3. Change your locale: if your locale/operating system is set to Chinese, or Japanese, or whatever is appropriate for the fonts that you want to be able to view, then things are much more likely to work, since Java will then use the correct font.properties.locale file.
4. Hand edit the font.properties file (which you find in the jre/lib directory) to tell it about how to use fonts on your system to display certain Unicode character ranges. Doing this correctly is quite tricky; see Sun's documentation.
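For the second of these options, it can help to check that the line you added to kirrkirr.properties parses the way you expect. The following is a minimal sketch using the standard java.util.Properties class; how Kirrkirr itself loads the file is not described here, and the class name, method name and the "Dialog" fallback are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class FontPropertyReader {
    /** Reads one font property, returning the fallback if the key is missing or blank. */
    static String fontName(Reader source, String key, String fallback) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty(key);
        return (value == null || value.trim().isEmpty()) ? fallback : value.trim();
    }

    public static void main(String[] args) {
        // Stand-in for the contents of kirrkirr.properties; a FileReader would work the same way.
        String contents = "kirrkirr.wordFont=Arial Unicode MS\n";
        // "Dialog" is a hypothetical fallback, not Kirrkirr's documented default.
        System.out.println(fontName(new StringReader(contents), "kirrkirr.wordFont", "Dialog"));
        System.out.println(fontName(new StringReader(contents), "kirrkirr.textFont", "Dialog"));
    }
}
```

Run as written, this prints "Arial Unicode MS" for the property that is set and "Dialog" for the one that is not. Spaces inside a font name are kept, so the name does not need quoting.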

Getting everything right on a pre-JDK1.5 system may require using method 2 together with method 3 or 4. In particular, at present, I believe the HtmlPanel will not display characters correctly unless they can be found at the system level by either method 3 or 4. Nevertheless, it can be done. For example, here is Kirrkirr displaying a tiny demonstration Japanese dictionary.

Localization

Kirrkirr fully supports localization of its interface. However, very little localization data is currently available: only an Australian English localization (which is used by default) and a very partial Warlpiri localization. Additional localizations can be defined by providing appropriate lang_langcode_country.properties files.
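The file naming pattern above matches the one used by Java's standard ResourceBundle mechanism. Assuming Kirrkirr looks its files up in that standard way (an assumption; this page does not say so), the sketch below lists the file names Java would try for a given locale, most specific first. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

class LocalizationFileNames {
    /** Lists the .properties file names Java's default lookup would try, most specific first. */
    static List<String> candidates(String baseName, Locale locale) {
        ResourceBundle.Control control =
                ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        List<String> names = new ArrayList<>();
        for (Locale candidate : control.getCandidateLocales(baseName, locale)) {
            // toBundleName joins the base name and locale parts with underscores.
            names.add(control.toBundleName(baseName, candidate) + ".properties");
        }
        return names;
    }

    public static void main(String[] args) {
        // "lang" is the base name implied by the pattern lang_langcode_country.properties.
        System.out.println(candidates("lang", Locale.forLanguageTag("en-AU")));
    }
}
```

For Australian English this prints [lang_en_AU.properties, lang_en.properties, lang.properties], which suggests what a new localization file for another language and country would need to be called.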

Splash screen

The picture that Kirrkirr displays at startup is loaded from icons/splash.jpg in the installation directory. To keep startup fast, this path is hardcoded and cannot be changed. However, you can replace the picture stored under that filename with any suitably sized JPEG file.

Known problems

Kirrkirr has some known issues, which we hope to address in future versions. Among others, these include:

On Mac OS X, the ovals for semantic domains appear minute rather than as large ovals.
The crossword game may not work with non-Latin alphabets.
Kirrkirr doesn't work correctly with XML dictionaries encoded in encodings other than UTF-8 and US-ASCII.

Treatment of subentries

Many dictionaries contain subentries: headwords for derived forms, combinations of a noun and a light verb, and so on are placed under the headword for the main entry, together with information about the subheadword. This organization doesn't work very well with Kirrkirr. While one can display a dictionary in this form (with a suitable XSL file), only the main headwords will be used in the Network display, when searching by headwords, etc.

This is largely by design: our usability testing of paper dictionaries showed that people were generally confused by subentries and did not interpret them properly. Further, we were interested in network representations of the lexicon, and this led to the idea that one is better off regarding all words as headwords, and turning this subordination relationship into a pair of links (main entry of a subentry, and subentries of a main entry), which parallel other link types like synonym, antonym or hyponym. The included MiniWrl dictionary gives an example of this. (Note that the entries for main headwords should also include information about what their subentries are. This is necessary for subentries to be displayed in the Network pane when a main entry is clicked on: the program only looks locally at one entry for links from it.)

If an XML dictionary has subentries in the traditional way, it should be fairly straightforward to automatically convert it, by promoting the subentries into the list of main headwords, perhaps marking them with an attribute, and simultaneously putting in links from the subentry to the main entry and from the main entry to the subentry. We hope to someday write a general utility that will do this, but don't have one at present.
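The conversion just described could be sketched along the following lines with Java's standard DOM API. This is not the general utility mentioned above, only an illustration under stated assumptions: the element names ENTRY, SUBENTRY, HW, MAINENTRY and SUBENTRYLINK, the type attribute, and the sample words are all hypothetical and would need to be adapted to your dictionary's actual XML structure.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class SubentryPromoter {
    /** Returns the direct child elements of parent that have the given tag name. */
    static List<Element> children(Element parent, String tag) {
        List<Element> found = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && tag.equals(n.getNodeName())) {
                found.add((Element) n);
            }
        }
        return found;
    }

    /** Appends a link element whose text is the headword it points to. */
    static void addLink(Document doc, Element entry, String tag, String headword) {
        Element link = doc.createElement(tag);
        link.setTextContent(headword);
        entry.appendChild(link);
    }

    /** Promotes each SUBENTRY to a top-level ENTRY and links it with its main entry both ways. */
    static Document promote(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse dictionary XML", e);
        }
        Element root = doc.getDocumentElement();
        // children() returns a snapshot, so entries promoted below are not revisited.
        for (Element main : children(root, "ENTRY")) {
            // Assumes every entry and subentry has at least one HW child.
            String mainHw = children(main, "HW").get(0).getTextContent();
            Node next = main.getNextSibling();
            for (Element sub : children(main, "SUBENTRY")) {
                String subHw = children(sub, "HW").get(0).getTextContent();
                main.removeChild(sub);
                Element promoted = doc.createElement("ENTRY");
                promoted.setAttribute("type", "subentry"); // marks the promoted entry
                while (sub.getFirstChild() != null) {
                    promoted.appendChild(sub.getFirstChild()); // moves the child across
                }
                addLink(doc, promoted, "MAINENTRY", mainHw);
                addLink(doc, main, "SUBENTRYLINK", subHw);
                root.insertBefore(promoted, next); // keeps it right after its main entry
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        String xml = "<DICTIONARY><ENTRY><HW>kick</HW>"
                + "<SUBENTRY><HW>kick off</HW></SUBENTRY></ENTRY>"
                + "<ENTRY><HW>run</HW></ENTRY></DICTIONARY>";
        for (Element e : children(promote(xml).getDocumentElement(), "ENTRY")) {
            System.out.println(children(e, "HW").get(0).getTextContent()
                    + " " + e.getAttribute("type"));
        }
    }
}
```

The link from the main entry to each subentry is added as well as the reverse link, because the program only looks at the entry itself when it collects links to display.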

http://nlp.stanford.edu/kirrkirr/dictionaries/other.html

Christopher Manning -- <manning@cs.stanford.edu> -- Last modified: Fri Jul 21 07:03:28 PDT 2006