public class IBMArabicEscaper extends Object implements Function<List<HasWord>,List<HasWord>>
LexicalizedParser.
It performs these functions:
ArabicTreeNormalizer
Function<List<HasWord>, List<HasWord>>
in order to run with the parser.

| Constructor and Description |
|---|
| IBMArabicEscaper() |
| IBMArabicEscaper(boolean annoteAndClassOnly) |

| Modifier and Type | Method and Description |
|---|---|
| List<HasWord> | apply(List<HasWord> sentence) Converts an input list of HasWord in IBM Arabic to LDC ATBv3 representation. |
| String | apply(String w) Applies escaping to a single word. |
| void | disableWarnings() Disable warnings generated when tokens are escaped. |
| static void | main(String[] args) This main method preprocesses one-sentence-per-line input, making the same changes as the Function. |
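
The two apply overloads summarized above pair a per-word escape with a list-level transform that leaves its input untouched. The following is a minimal, library-free sketch of that shape, not the IBMArabicEscaper implementation. It makes several assumptions: plain String stands in for HasWord, java.util.function.Function stands in for the parser's Function interface, the class name EscaperSketch is hypothetical, the escaping rules (brackets mapped to -LRB-/-RRB-, '+' and '#' markers stripped) are illustrative only, and "nullified" is taken to mean that the escaped form comes out empty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in class; Strings replace HasWord so the sketch runs on its own.
class EscaperSketch implements Function<List<String>, List<String>> {

  // Escapes a single word. The rules here are assumptions for illustration.
  public String apply(String w) {
    String out;
    if (w.equals("(")) {
      out = "-LRB-";
    } else if (w.equals(")")) {
      out = "-RRB-";
    } else {
      // Assumed rule: drop '+' and '#' marker characters.
      out = w.replace("+", "").replace("#", "");
    }
    if (out.isEmpty()) {
      // Mirrors the documented RuntimeException for a nullified word.
      throw new RuntimeException("Word nullified by escaping: " + w);
    }
    return out;
  }

  // Escapes a whole sentence into a new list, so the caller's list is not modified.
  @Override
  public List<String> apply(List<String> sentence) {
    List<String> copy = new ArrayList<>(sentence.size());
    for (String w : sentence) {
      copy.add(apply(w));
    }
    return copy;
  }
}
```

Building a fresh list rather than editing in place is one simple way to get the "copy before escaping" behavior; the real class may copy its input differently.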
public IBMArabicEscaper()
public IBMArabicEscaper(boolean annoteAndClassOnly)
public void disableWarnings()
public List<HasWord> apply(List<HasWord> sentence)
Converts an input list of HasWord in IBM Arabic to
LDC ATBv3 representation. The method safely copies the input object
prior to escaping.
public String apply(String w)
Applies escaping to a single word.
w - The word
RuntimeException - If a word is nullified (which is really bad for the parser and for MT)
public static void main(String[] args) throws IOException
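This main method preprocesses one-sentence-per-line input, making the same changes as the Function.
By default, output is written to files with .sent appended to their names. If you give the flag
-f then output is instead sent to stdout. Input and output
are always in UTF-8.
args - A list of filenames. The files must be UTF-8 encoded.
IOException - If there are any issues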
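
The output convention described for main (a .sent file per input, or stdout with -f, always UTF-8) can be sketched without the parser library. This is an illustration under stated assumptions, not the actual main method: the class SentFileSketch and its run and processLine methods are hypothetical, the per-line transform (whitespace collapsing) is a placeholder for the real escaping, and the -f flag is assumed to be accepted anywhere among the arguments.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical driver showing only the file-naming and encoding convention.
class SentFileSketch {

  // Placeholder transform (assumption): trim and collapse runs of whitespace.
  static String processLine(String line) {
    return line.trim().replaceAll("\\s+", " ");
  }

  static void run(String[] args) throws IOException {
    // Assumption: "-f" may appear at any position in the argument list.
    boolean toStdout = Arrays.asList(args).contains("-f");
    PrintWriter stdout =
        new PrintWriter(new OutputStreamWriter(System.out, StandardCharsets.UTF_8), true);

    for (String arg : args) {
      if (arg.equals("-f")) {
        continue; // the flag itself is not a filename
      }
      // Read one sentence per line, decoding as UTF-8.
      List<String> processed = new ArrayList<>();
      for (String line : Files.readAllLines(Paths.get(arg), StandardCharsets.UTF_8)) {
        processed.add(processLine(line));
      }
      if (toStdout) {
        for (String line : processed) {
          stdout.println(line);
        }
        stdout.flush();
      } else {
        // Default: write next to the input, with ".sent" appended to its name.
        Files.write(Paths.get(arg + ".sent"), processed, StandardCharsets.UTF_8);
      }
    }
  }
}
```

Calling run with a single hypothetical filename such as input.txt would leave the result in input.txt.sent; adding -f would print the processed lines instead of creating that file.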