    >>> from InputTree import InputTree
    >>> t = InputTree('(S1 (S (NP (N hi))))')
    >>> t
    InputTree('(S1 (S (NP (N hi))))')
    >>> t.head()
    'hi'
    >>> t.term()
    'S1'
    >>> t.subTrees()
    [InputTree('(S (NP (N hi)))')]
    >>> c = t.subTrees()[0]
    >>> c.head(), c.term()
    ('hi', 'S')
    >>> c.parent()
    InputTree('(S1 (S (NP (N hi))))')
    >>> c.hTag() # head tag
    'N'
    >>> t.getYield()
    ['hi']
    >>> print t
    (S1 (S (NP (N hi))))
    >>> def partofspeech(tree, path, index):
    ...     return tree.term()
    ...
    >>> print list(t.walk(partofspeech))
    ['S1', 'S', 'NP', 'N']
    >>> def head2headtags(tree, path, index):
    ...     if path:
    ...         return (tree.term(), tree.parent().term())
    ...     else:
    ...         return (tree.term(), None)
    ...
    >>> print list(t.walk(head2headtags))
    [('S1', None), ('S', 'S1'), ('NP', 'S'), ('N', 'NP')]

See the files in test/ for more examples.
To install, download and build the Charniak parser, then edit setup.py to point to the directory containing the PARSE/ directory. Finally, run 'python setup.py install'.
Alternatively, I have included a prebuilt binary of _InputTree.so. If it is compatible with your architecture, you can skip downloading and building the parser, as well as building the C extension. In that case, copy the InputTree/ directory to your Python install path. If I have time, I will try to make distutils work with this.
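Whether the bundled binary works on a given machine is easiest to find out by trying to import it. The following is a minimal sketch of such a check, not part of this package: the helper name module_available is made up for illustration, and it assumes that an incompatible or missing _InputTree.so surfaces as an ImportError when the InputTree module is imported.

```python
import importlib


def module_available(name):
    """Return True if the module `name` can be imported, False otherwise.

    Hypothetical helper for illustration; assumes that a missing or
    incompatible extension binary raises ImportError on import.
    """
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # 'InputTree' is the module name used in the examples above.
    if module_available("InputTree"):
        print("InputTree imported; the bundled binary appears usable.")
    else:
        print("InputTree could not be imported; build the parser and extension instead.")
```

If the check fails after copying the InputTree/ directory, falling back to the full build with setup.py is the safer route.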