public class Buckwalter extends java.lang.Object implements SerializableFunction<java.lang.String,java.lang.String>
Sources
"MORPHOLOGICAL ANALYSIS & POS ANNOTATION," v3.8. LDC. 08 June 2009. http://www.ldc.upenn.edu/myl/morph/buckwalter.html http://www.qamus.org/transliteration.htm (Tim Buckwalter's site) http://www.livingflowers.com/Arabic_transliteration (many but hard to use) http://www.cis.upenn.edu/~cis639/arabic/info/romanization.html http://www.nongnu.org/aramorph/english/index.html (Java AraMorph) BBN's MBuckWalter2Unicode.tab see also my GALE-NOTES.txt file for other mappings ROSETTA people do. Normalization of decomposed characters to composed: ARABIC LETTER ALEF (ا), ARABIC MADDAH ABOVE (ٓ) -> ARABIC LETTER ALEF WITH MADDA ABOVE ARABIC LETTER ALEF (ا), ARABIC HAMZA ABOVE (ٔ) -> ARABIC LETTER ALEF WITH HAMZA ABOVE (أ) ARABIC LETTER WAW, ARABIC HAMZA ABOVE -> ARABIC LETTER WAW WITH HAMZA ABOVE ARABIC LETTER ALEF, ARABIC HAMZA BELOW (ٕ) -> ARABIC LETTER ALEF WITH HAMZA BELOW ARABIC LETTER YEH, ARABIC HAMZA ABOVE -> ARABIC LETTER YEH WITH HAMZA ABOVE
Constructor and Description |
---|
Buckwalter() |
Buckwalter(boolean unicodeToBuckwalter) |
Modifier and Type | Method and Description |
---|---|
java.lang.String | apply(java.lang.String in) |
java.lang.String | buckwalterToUnicode(java.lang.String in) |
static void | main(java.lang.String[] args) |
void | suppressBuckDigitConversion(boolean b) |
void | suppressBuckPunctConversion(boolean b) |
java.lang.String | unicodeToBuckwalter(java.lang.String in) |
public Buckwalter()
public Buckwalter(boolean unicodeToBuckwalter)
public void suppressBuckDigitConversion(boolean b)
public void suppressBuckPunctConversion(boolean b)
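The two `suppress...` setters are undocumented here, so the following is only a sketch of what such a toggle could look like. It assumes, purely from the method name, that digit conversion means mapping Arabic-Indic digits (U+0660 to U+0669) to ASCII digits and that suppression leaves them untouched. The class `DigitSuppressionSketch` and its members are hypothetical and do not reflect the actual implementation.

```java
// Hypothetical illustration of a "suppress digit conversion" switch.
// ASSUMPTION: conversion maps Arabic-Indic digits (U+0660..U+0669) to '0'..'9'.
final class DigitSuppressionSketch {

    private boolean suppressDigits = false;

    // Mirrors the shape of a boolean setter such as suppressBuckDigitConversion(boolean).
    void suppressDigitConversion(boolean b) {
        suppressDigits = b;
    }

    String convertDigits(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            boolean arabicIndic = c >= '\u0660' && c <= '\u0669';
            if (arabicIndic && !suppressDigits) {
                // Offset from ARABIC-INDIC DIGIT ZERO gives the digit value.
                out.append((char) ('0' + (c - '\u0660')));
            } else {
                // Everything else, and all digits when suppressed, passes through.
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        DigitSuppressionSketch sketch = new DigitSuppressionSketch();
        String input = "\u0661\u0662\u0663"; // Arabic-Indic digits one, two, three
        System.out.println(sketch.convertDigits(input)); // prints "123"
        sketch.suppressDigitConversion(true);
        System.out.println(sketch.convertDigits(input).equals(input)); // prints "true"
    }
}
```

A punctuation toggle would presumably follow the same pattern with a different character range, but that is likewise an assumption.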
public java.lang.String apply(java.lang.String in)
Specified by: apply in interface java.util.function.Function<java.lang.String,java.lang.String>
public java.lang.String buckwalterToUnicode(java.lang.String in)
public java.lang.String unicodeToBuckwalter(java.lang.String in)
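To make the two conversion directions concrete, here is a small self-contained sketch of Buckwalter-style transliteration over a seven-letter subset (ا→A, ب→b, ت→t, س→s, ك→k, ل→l, م→m). The class `BuckwalterSubsetSketch` and its method names are hypothetical. The sketch does not call `Buckwalter`, covers only a fraction of the scheme, and should not be read as the class's actual mapping table or algorithm.

```java
// Hypothetical, partial illustration of Buckwalter transliteration.
// Only seven letters are mapped; all other characters pass through unchanged.
final class BuckwalterSubsetSketch {

    private static char toLatin(char c) {
        switch (c) {
            case '\u0627': return 'A'; // ALEF
            case '\u0628': return 'b'; // BEH
            case '\u062A': return 't'; // TEH
            case '\u0633': return 's'; // SEEN
            case '\u0643': return 'k'; // KAF
            case '\u0644': return 'l'; // LAM
            case '\u0645': return 'm'; // MEEM
            default: return c;
        }
    }

    private static char toArabic(char c) {
        switch (c) {
            case 'A': return '\u0627';
            case 'b': return '\u0628';
            case 't': return '\u062A';
            case 's': return '\u0633';
            case 'k': return '\u0643';
            case 'l': return '\u0644';
            case 'm': return '\u0645';
            default: return c;
        }
    }

    // Arabic script -> Buckwalter ASCII, for the mapped subset only.
    static String toBuckwalterSubset(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            out.append(toLatin(in.charAt(i)));
        }
        return out.toString();
    }

    // Buckwalter ASCII -> Arabic script, for the mapped subset only.
    static String toUnicodeSubset(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            out.append(toArabic(in.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String kitab = "\u0643\u062A\u0627\u0628"; // KAF, TEH, ALEF, BEH
        System.out.println(toBuckwalterSubset(kitab)); // prints "ktAb"
        System.out.println(toUnicodeSubset("ktAb").equals(kitab)); // prints "true"
    }
}
```

Because each mapped letter corresponds to exactly one ASCII character, the subset round-trips losslessly in both directions.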
public static void main(java.lang.String[] args)
Parameters: args