various software

The following software may be useful to you. Some of it is not documented or tested as well as it should be, but that's academic software for you. Most of the code is released under the GPLv2 license, and almost all of it is in Python.

These days, most of my actively maintained code is on GitHub and Bitbucket.

Actively maintained

  • Stanford CoreNLP: Stanford's annotation pipeline. I'm one of the many, many members of this project (no longer one of the active maintainers, though).
  • BLLIP reranking parser: I maintain the BLLIP (Charniak-Johnson) reranking parser and have added extensive Python bindings.
  • waterworks: My Python utility library (everyone else has one...) [old page, PyPI page]
  • quickscripts: Simple Python command-line tools for various tasks.

Runnable

Bitrot warning

These are here for historical purposes. I've tried to move as much functionality as possible over to BLLIP Parser.

  • Parsing: Python module with parsing-related functions (running/training the Charniak parser, tree reading, evaluation). Many of these functions can now be found in the Python bindings for BLLIP Parser.
  • PyInputTree: Original Python interface to the InputTree structure from the Charniak parser via SWIG. This lets you traverse and view Treebank-style trees. Unfortunately, this code has bitrotted and may not compile on modern systems. However, BLLIP Parser's Python bindings include a Tree class which provides nearly all of this functionality and more. You may also want to try NLTK, which provides a pure Python solution for reading and manipulating Treebank-style trees.
  • parsedyff: Visualize the differences between two treebank parse trees via graphviz.
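
If you have not worked with Treebank-style trees before, here is a rough idea of what the tools above read and traverse. This is only a minimal standalone sketch in plain Python, not the API of BLLIP Parser, PyInputTree, or NLTK. The names parse_tree and leaves are made up for illustration, and the code assumes well-formed input in which every node has a label (so trees wrapped in an unlabeled outer bracket would need extra handling).

```python
def parse_tree(text):
    """Parse one bracketed tree such as "(S (NP (DT the) (NN dog)) ...)".

    Returns nested lists of the form [label, child, child, ...], where
    each leaf word is a plain string. Assumes well-formed input.
    """
    # Put spaces around the brackets so a plain split() tokenizes the string.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1                      # consume "("
        node = [tokens[pos]]          # the node label, e.g. "NP"
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(read_node())    # nested subtree
            else:
                node.append(tokens[pos])    # leaf word
                pos += 1
        pos += 1                      # consume ")"
        return node

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the tree")
    return tree


def leaves(node):
    """Yield the words of a tree from left to right."""
    for child in node[1:]:
        if isinstance(child, str):
            yield child
        else:
            yield from leaves(child)


tree = parse_tree("(S (NP (DT the) (NN dog)) (VP (VBZ barks)))")
print(tree)                # ['S', ['NP', ['DT', 'the'], ['NN', 'dog']], ['VP', ['VBZ', 'barks']]]
print(list(leaves(tree)))  # ['the', 'dog', 'barks']
```

For real work, the Tree class in BLLIP Parser's Python bindings or NLTK is likely a better choice, since both handle cases this sketch ignores.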