Class OrthographicRules
java.lang.Object
|
+--OrthographicRules
- public class OrthographicRules
- extends java.lang.Object
This class provides pluggable rules for orthographic normalization, with
offset tracking.
- Version:
- $2007-04-30 03:30:17 mdh$
- Author:
- Malcolm D. Hyman
Method Summary
int[] getOffsetTable()
- Returns the offset table.
static void main(java.lang.String[] argv)
- We provide main() so that our services will be available outside Java (i.e., so we can run as a Un*x-style filter).
java.lang.String normalize(java.lang.String s)
- Applies the normalization rules in ruleset to s, without offset tracking.
java.lang.String normalize(java.lang.String s, int[] offsets)
- Applies the normalization rules in ruleset to s, with offset tracking.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
offsets
public int[] offsets
IT_VOWELS
public static final java.lang.String IT_VOWELS
IT_CONS
public static final java.lang.String IT_CONS
OrthographicRules
public OrthographicRules(java.lang.String ruleset)
- Constructor.
- Parameters:
ruleset - name of rule set to apply
normalize
public java.lang.String normalize(java.lang.String s,
int[] offsets)
- Applies the normalization rules in ruleset to s, with offset tracking.
WARNING:
Arboreal will not work properly if a normalization substitution
replaces a source character with more than two target characters!
This is simply a BUG, and should be fixed. Fortunately, however,
one does not often need such a replacement.
FIXME: If the orthographic rules eliminate all characters in a
word, word counting in ContentRenderPane will not work correctly!
- Parameters:
s - source string
offsets - character offset table
- Returns:
- normalized string
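This page does not say what the offset table contains. The following is a minimal sketch of one common way to track offsets during character-level normalization, assuming (hypothetically) that entry i of the table holds the index in the source string of the character that produced normalized character i. The class OffsetTrackingSketch, its rule map, and the sample rules are illustrative stand-ins, not the actual OrthographicRules implementation or its rule sets.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the real OrthographicRules.
class OffsetTrackingSketch {
    // Assumed rule shape: one source character maps to a replacement string,
    // which may be empty (deletion) or longer than one character (expansion).
    private final Map<Character, String> rules = new HashMap<>();

    // Assumed meaning: offsets[i] is the source index of the character
    // that produced normalized character i.
    private int[] offsets = new int[0];

    OffsetTrackingSketch() {
        rules.put('j', "i");        // sample rule: j -> i
        rules.put('v', "u");        // sample rule: v -> u
        rules.put('\u00e6', "ae");  // sample rule: one character becomes two
        rules.put('-', "");         // sample rule: delete hyphens
    }

    String normalize(String s) {
        StringBuilder out = new StringBuilder();
        int[] table = new int[Math.max(16, s.length())];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Characters without a rule pass through unchanged.
            String target = rules.getOrDefault(c, String.valueOf(c));
            for (int k = 0; k < target.length(); k++) {
                if (out.length() == table.length) {
                    table = Arrays.copyOf(table, table.length * 2);
                }
                // Record the source index before appending the character.
                table[out.length()] = i;
                out.append(target.charAt(k));
            }
        }
        offsets = Arrays.copyOf(table, out.length());
        return out.toString();
    }

    int[] getOffsetTable() {
        return offsets;
    }
}
```

Under these assumptions, a caller that finds a match at position p in the normalized string can look up getOffsetTable()[p] to locate the corresponding character in the source. A deleted source character simply has no entry, which is consistent with the FIXME above about words whose characters are all eliminated.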
normalize
public java.lang.String normalize(java.lang.String s)
- Applies the normalization rules in ruleset to s, without offset tracking.
- Parameters:
s - source string
- Returns:
- normalized string
getOffsetTable
public int[] getOffsetTable()
- Returns the offset table.
- Returns:
- offset table
main
public static void main(java.lang.String[] argv)
- We provide main() so that our services will be available outside Java (i.e., so we can run as a Un*x-style filter).
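How main() uses its arguments is not documented here. As a rough sketch of the Un*x-style filter pattern it refers to, the following reads text line by line, applies a normalizer to each line, and writes the result. The class FilterSketch, the filter method, and the lowercase placeholder normalizer are assumptions made for illustration; the real program would presumably call normalize(s) on an OrthographicRules instance instead.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical filter skeleton; not the actual main() of OrthographicRules.
class FilterSketch {
    // Copies every line from in to out, passing it through the normalizer.
    static void filter(Reader in, Writer out, UnaryOperator<String> normalizer) {
        try {
            BufferedReader reader = new BufferedReader(in);
            String line;
            while ((line = reader.readLine()) != null) {
                out.write(normalizer.apply(line));
                out.write(System.lineSeparator());
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] argv) {
        // Placeholder normalizer (lowercasing), standing in for
        // new OrthographicRules(ruleset).normalize(line).
        UnaryOperator<String> placeholder = line -> line.toLowerCase(Locale.ROOT);
        filter(new InputStreamReader(System.in, StandardCharsets.UTF_8),
               new OutputStreamWriter(System.out, StandardCharsets.UTF_8),
               placeholder);
    }
}
```

A filter of this shape reads standard input and writes standard output, so it can sit in a shell pipeline between other text tools.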