edu.stanford.nlp.trees.tregex.tsurgeon
Class Tsurgeon

java.lang.Object
  extended by edu.stanford.nlp.trees.tregex.tsurgeon.Tsurgeon

public class Tsurgeon
extends Object

A simple example from the command-line:

java edu.stanford.nlp.trees.tregex.tsurgeon.Tsurgeon -treeFile atree exciseNP renameVerb

Tsurgeon uses the tregex engine to match tree patterns on trees; for more information on tregex's tree-matching functionality, syntax, and semantics, please see the documentation for the TregexPattern class.

If you want to use Tsurgeon as an API, the relevant method is processPattern(edu.stanford.nlp.trees.tregex.TregexPattern, edu.stanford.nlp.trees.tregex.tsurgeon.TsurgeonPattern, edu.stanford.nlp.trees.Tree). You will also need to look at the TsurgeonPattern class and the parseOperation(java.lang.String) method.

Here is a sample invocation:

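 // Match an SQ node, named "sq", whose child is a WH-labeled node with a VP sister to its right.
 TregexPattern matchPattern = TregexPattern.compile("SQ=sq < (/^WH/ $++ VP)");
 List<TsurgeonPattern> ps = new ArrayList<TsurgeonPattern>();
 
 // Relabel the node that the match pattern named "sq" to S.
 TsurgeonPattern p = Tsurgeon.parseOperation("relabel sq S");
 
 ps.add(p);
 
 // lTrees is the Collection<Tree> of input trees to transform.
 Collection<Tree> result = Tsurgeon.processPatternOnTrees(matchPattern, Tsurgeon.collectOperations(ps), lTrees);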
 

For more information on using it from the command line, see the main(java.lang.String[]) method.

Author:
Roger Levy

Constructor Summary
Tsurgeon()
           
 
Method Summary
static TsurgeonPattern collectOperations(List<TsurgeonPattern> patterns)
          Collects a list of operation patterns into a sequence of operations to be applied.
static Pair<TregexPattern,TsurgeonPattern> getOperationFromFile(String arg)
           
static void main(String[] args)
          Arguments:
static TsurgeonPattern parseOperation(String operationString)
          Parses an operation string into a TsurgeonPattern.
static Tree processPattern(TregexPattern matchPattern, TsurgeonPattern p, Tree t)
          Tries to match a pattern against a tree.
static Collection<Tree> processPatternOnTrees(TregexPattern matchPattern, TsurgeonPattern p, Collection<Tree> inputTrees)
          Applies processPattern(TregexPattern, TsurgeonPattern, Tree) to each tree in a collection.
static Tree processPatternsOnTree(List<Pair<TregexPattern,TsurgeonPattern>> ops, Tree t)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Tsurgeon

public Tsurgeon()
Method Detail

main

public static void main(String[] args)
                 throws Exception

Arguments:

Each argument should be the name of a transformation file that contains a TregexPattern pattern on the first line, then a blank line, then a list of transformation operations (as specified by Legal operation syntax below) to apply when the pattern is matched. Note the bit about the blank line: currently the code crashes if it isn't present! For example, if you want to excise an SBARQ node whenever it is the parent of an SQ node, and rename the SQ node to S, your transformation file would look like this:
SBARQ=n1 < SQ=n2

excise n1 n1
relabel n2 S
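
The file layout described above (pattern on the first line, then a blank line, then the operations) can be checked with a small standalone sketch. This is only an illustration and not Tsurgeon's own file reader: the class name TransformationFileSketch and its parse method are hypothetical, and the sketch assumes one operation per line, as the args description of this method states.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tsurgeon: splits the text of a
// transformation file into its pattern line and its operation lines.
class TransformationFileSketch {
    final String pattern;
    final List<String> operations;

    private TransformationFileSketch(String pattern, List<String> operations) {
        this.pattern = pattern;
        this.operations = operations;
    }

    static TransformationFileSketch parse(String fileText) {
        // Keep trailing empty strings so a missing blank line is detectable.
        String[] lines = fileText.split("\r?\n", -1);
        if (lines[0].trim().isEmpty()) {
            throw new IllegalArgumentException("first line must hold the tregex pattern");
        }
        if (lines.length < 2 || !lines[1].trim().isEmpty()) {
            throw new IllegalArgumentException("a blank line must follow the pattern");
        }
        List<String> ops = new ArrayList<>();
        for (int i = 2; i < lines.length; i++) {
            String op = lines[i].trim();
            if (!op.isEmpty()) {
                ops.add(op);
            }
        }
        return new TransformationFileSketch(lines[0].trim(), ops);
    }

    public static void main(String[] args) {
        TransformationFileSketch f =
            parse("SBARQ=n1 < SQ=n2\n\nexcise n1 n1\nrelabel n2 S\n");
        System.out.println(f.pattern);
        System.out.println(f.operations);
    }
}
```

Rejecting a file with no blank line up front gives a clearer error than letting the operations be read as part of the pattern.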

Options:

    -treeFile <filename> Specify the name of the file that has the trees you want to transform.
    -po <matchPattern> <operation> Apply a single operation to every tree using the specified match pattern and the specified operation. Use this option when you want to quickly try the effect of one pattern/surgery combination, and are too lazy to write a transformation file.
    -s Print each output tree on one line (default is pretty-printing).
    -m For every tree that had a matching pattern, print "before" (prepended as "Operated on:") and "after" (prepended as "Result:"). Unoperated trees just pass through the transducer as usual.
    -encoding X Use character set X for input and output of trees.

Legal operation syntax:

  • delete <name> deletes the node and everything below it.
  • prune <name> Like delete, but if, after the pruning, the parent has no children anymore, the parent is pruned too.
  • excise <name1> <name2> The name1 node should either dominate or be the same as the name2 node. This excises out everything from name1 to name2. All the children of name2 go into the parent of name1, where name1 was.
  • relabel <name> <new-label> relabels the node to have the new label.
  • insert <name> <position> inserts the named node into the position specified.
  • move <name> <position> moves the named node into the specified position.

    Right now the only ways to specify position are:

    $+ <name> the left sister of the named node
    $- <name> the right sister of the named node
    >i the i_th daughter of the named node
    >-i the i_th daughter, counting from the right, of the named node.

  • replace <name1> <name2> deletes name1 and inserts a copy of name2 in its place.
  • adjoin <auxiliary_tree> <name> Adjoins the specified auxiliary tree into the named node. The daughters of the target node will become the daughters of the foot of the auxiliary tree.
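
The delete, prune, and excise descriptions above can be pictured on a toy tree. The sketch below is a minimal illustration written against a hypothetical ToyNode class; it is not edu.stanford.nlp.trees.Tree and not how Tsurgeon implements these operations. It takes nodes directly instead of looking them up by name, and it assumes the node being removed is never the root.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy tree used only to illustrate the operation semantics.
class ToyNode {
    final String label;
    ToyNode parent;
    final List<ToyNode> children = new ArrayList<>();

    ToyNode(String label, ToyNode... kids) {
        this.label = label;
        for (ToyNode k : kids) {
            k.parent = this;
            children.add(k);
        }
    }

    // delete: remove the node and everything below it.
    static void delete(ToyNode n) {
        if (n.parent == null) {
            throw new IllegalArgumentException("cannot delete the root");
        }
        n.parent.children.remove(n);
        n.parent = null;
    }

    // prune: like delete, but also remove each ancestor left childless.
    // The root itself is never removed in this sketch.
    static void prune(ToyNode n) {
        ToyNode p = n.parent;
        delete(n);
        if (p.children.isEmpty() && p.parent != null) {
            prune(p);
        }
    }

    // excise: the children of `bottom` take the slot that `top` occupied
    // in its parent. `top` must dominate `bottom` or be the same node.
    static void excise(ToyNode top, ToyNode bottom) {
        ToyNode p = top.parent;
        if (p == null) {
            throw new IllegalArgumentException("cannot excise the root");
        }
        int slot = p.children.indexOf(top);
        p.children.remove(slot);
        List<ToyNode> moved = new ArrayList<>(bottom.children);
        for (ToyNode c : moved) {
            c.parent = p;
        }
        p.children.addAll(slot, moved);
    }

    @Override
    public String toString() {
        if (children.isEmpty()) {
            return label;
        }
        StringBuilder sb = new StringBuilder("(").append(label);
        for (ToyNode c : children) {
            sb.append(' ').append(c);
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        ToyNode sbarq = new ToyNode("SBARQ",
            new ToyNode("WHNP", new ToyNode("what")),
            new ToyNode("SQ", new ToyNode("VP", new ToyNode("eat"))));
        ToyNode root = new ToyNode("ROOT", sbarq);
        excise(sbarq, sbarq);
        System.out.println(root); // (ROOT (WHNP what) (SQ (VP eat)))

        ToyNode b = new ToyNode("b");
        ToyNode root2 = new ToyNode("ROOT",
            new ToyNode("A", new ToyNode("B", b)),
            new ToyNode("C", new ToyNode("c")));
        prune(b);
        System.out.println(root2); // (ROOT (C c))
    }
}
```

In the first example name1 and name2 are the same node, which mirrors the "excise n1 n1" line of the transformation file shown earlier: the SBARQ node disappears and its children move up into its place.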

Parameters:
args - a list of names of files each of which contains a single tregex matching pattern plus a list, one per line, of transformation operations to apply to the matched pattern.
Throws:
Exception

getOperationFromFile

public static Pair<TregexPattern,TsurgeonPattern> getOperationFromFile(String arg)
                                                                throws IOException
Throws:
IOException

processPatternOnTrees

public static Collection<Tree> processPatternOnTrees(TregexPattern matchPattern,
                                                     TsurgeonPattern p,
                                                     Collection<Tree> inputTrees)
Applies processPattern(TregexPattern, TsurgeonPattern, Tree) to each tree in a collection.

Parameters:
matchPattern - A TregexPattern to be matched against a Tree.
p - A TsurgeonPattern to apply.
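inputTrees - the trees to match against and perform surgery on.
Returns:
the collection of trees that results from applying processPattern to each input tree.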

processPattern

public static Tree processPattern(TregexPattern matchPattern,
                                  TsurgeonPattern p,
                                  Tree t)
Tries to match a pattern against a tree. If it succeeds, apply the surgical operations contained in a TsurgeonPattern.

Parameters:
matchPattern - A TregexPattern to be matched against a Tree.
p - A TsurgeonPattern to apply.
t - the Tree to match against and perform surgery on.
Returns:
t, which has been surgically modified.

processPatternsOnTree

public static Tree processPatternsOnTree(List<Pair<TregexPattern,TsurgeonPattern>> ops,
                                         Tree t)

parseOperation

public static TsurgeonPattern parseOperation(String operationString)
Parses an operation string into a TsurgeonPattern. Throws an IllegalArgumentException if the operation string is ill-formed.

Parameters:
operationString - the operation to parse, written in the legal operation syntax described under main.
Returns:
the operation pattern.

collectOperations

public static TsurgeonPattern collectOperations(List<TsurgeonPattern> patterns)
Collects a list of operation patterns into a sequence of operations to be applied. Required to keep track of global properties across a sequence of operations. For example, if you want to insert a named node and then coindex it with another node, you will need to collect the insertion and coindexation operations into a single TsurgeonPattern so that tsurgeon is aware of the name of the new node and coindexation becomes possible.

Parameters:
patterns - a list of TsurgeonPattern operations that you want to collect together into a single compound operation
Returns:
a new TsurgeonPattern that performs all the operations in the sequence of the patterns argument
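
Why a sequence of operations needs to be collected into one compound operation can be sketched with a shared name table. The code below is a toy model under stated assumptions, not the TsurgeonPattern API: ToyOperation, ToyOperations, insertNamed, and coindex are hypothetical names, and a plain map from node names to labels stands in for whatever state Tsurgeon tracks internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a single operation.
interface ToyOperation {
    // `names` maps node names to labels and is shared by every
    // operation inside one compound operation.
    void apply(Map<String, String> names);
}

class ToyOperations {
    // Folds several operations into one that runs them in list order
    // against a single shared name table.
    static ToyOperation collect(List<ToyOperation> ops) {
        List<ToyOperation> copy = new ArrayList<>(ops);
        return names -> {
            for (ToyOperation op : copy) {
                op.apply(names);
            }
        };
    }

    // Introduces a new named node with the given label.
    static ToyOperation insertNamed(String name, String label) {
        return names -> names.put(name, label);
    }

    // Appends an index to a named node; fails if the name is unknown.
    static ToyOperation coindex(String name, int index) {
        return names -> {
            String label = names.get(name);
            if (label == null) {
                throw new IllegalStateException("unknown node name: " + name);
            }
            names.put(name, label + "-" + index);
        };
    }

    public static void main(String[] args) {
        ToyOperation both = collect(Arrays.asList(
            insertNamed("new", "NP"),
            coindex("new", 1)));
        Map<String, String> names = new HashMap<>();
        both.apply(names);
        System.out.println(names.get("new")); // NP-1
    }
}
```

In this model, running coindex on its own against a fresh name table fails because the name introduced by the insertion is not visible to it, which is the situation the collected operation avoids.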


Stanford NLP Group