PyInputTree is a Pythonic wrapper, built with SWIG, around the InputTree structure from Eugene Charniak's parser.


>>> from InputTree import InputTree
>>> t = InputTree('(S1 (S (NP (N hi))))')
>>> t
InputTree('(S1 (S (NP (N hi))))')
>>> t.head()
>>> t.term()
>>> t.subTrees()
[InputTree('(S (NP (N hi)))')]
>>> c = t.subTrees()[0]
>>> c.head(), c.term()
('hi', 'S')
>>> c.parent()
InputTree('(S1 (S (NP (N hi))))')
>>> c.hTag() # head tag
>>> t.getYield()
>>> print t
(S1 (S (NP (N hi))))
>>> def partofspeech(tree, path, index):
...     return tree.term()
>>> print list(t.walk(partofspeech))
['S1', 'S', 'NP', 'N']
>>> def head2headtags(tree, path, index):
...     if path:
...         return (tree.term(), tree.parent().term())
...     else:
...         return (tree.term(), None)
>>> print list(t.walk(head2headtags))
[('S1', None), ('S', 'S1'), ('NP', 'S'), ('N', 'NP')]
See the files in test/ for more examples.
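
For readers who do not have the extension built yet, the bracketed tree format and the top-down visiting order seen in the session above can be illustrated without PyInputTree. The following is only a pure-Python sketch: parse_bracketed and preorder_labels are hypothetical helper names, not part of this package, and the sketch assumes each leaf such as (N hi) holds exactly one word.

```python
import re

def parse_bracketed(text):
    """Parse '(S1 (S (NP (N hi))))' into nested (label, word, children) tuples.

    Hypothetical stand-in for illustration; this is NOT the InputTree class.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = [0]  # mutable cursor shared with the nested helper

    def read_node():
        assert tokens[pos[0]] == "(", "expected '('"
        pos[0] += 1
        label = tokens[pos[0]]
        pos[0] += 1
        word, children = None, []
        while tokens[pos[0]] != ")":
            if tokens[pos[0]] == "(":
                children.append(read_node())
            else:
                word = tokens[pos[0]]  # leaf node: the bare token is its word
                pos[0] += 1
        pos[0] += 1  # consume the closing ')'
        return (label, word, children)

    return read_node()

def preorder_labels(node):
    """Yield node labels parent-first, then children left to right."""
    label, _word, children = node
    yield label
    for child in children:
        for sub in preorder_labels(child):
            yield sub
```

On the tree from the session, list(preorder_labels(parse_bracketed('(S1 (S (NP (N hi))))'))) gives the same label sequence as the partofspeech walk. Whether the real walk() visits siblings in exactly this order is an assumption, since the session only shows a single-branch tree.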


Installing PyInputTree is potentially nontrivial -- apologies in advance.

Download and build the Charniak parser, and edit to point to the directory containing the PARSE/ directory. Finally, run 'python install'.

Alternatively, I have included a binary of, which, if your architecture is compatible, will let you skip downloading and building the parser (and building the C extension). In this case, copy the InputTree/ directory to your Python install path. If I have time, I will try to make distutils work with this.
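
After either install route, a quick way to confirm that Python can actually find and load the package is to attempt the import. The snippet below is a small sketch using only the standard library; module_importable is a hypothetical helper name, not something shipped with PyInputTree.

```python
import importlib

def module_importable(name):
    """Return True if the module `name` can be imported by this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        # Raised when the module is not on sys.path; a compiled extension
        # that fails to load usually surfaces as ImportError as well.
        return False
    return True
```

If module_importable('InputTree') returns False, the InputTree/ directory is probably not on your Python path, or the included binary does not match your architecture.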




David McClosky (dmcc+py AT cs DOT brown DOT edu)