Class AlphabetConverter

java.lang.Object
org.apache.commons.text.AlphabetConverter

public final class AlphabetConverter extends Object

Convert from one alphabet to another, with the possibility of leaving certain characters unencoded.

The target and 'do not encode' languages must be in the Unicode BMP, but the source language need not be.

All encodings have the same fixed length, except for the 'do not encode' chars, which have length 1.

Sample usage

Character[] originals = {'a', 'b', 'c', 'd'};
Character[] encoding = {'0', '1', 'd'};
Character[] doNotEncode = {'d'};

AlphabetConverter ac = AlphabetConverter.createConverterFromChars(originals,
    encoding, doNotEncode);

// encode() declares the checked UnsupportedEncodingException
ac.encode("a");    // 00
ac.encode("b");    // 01
ac.encode("c");    // 0d
ac.encode("d");    // d
ac.encode("abcd"); // 00010dd
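The sample output above can be reproduced with a plain lookup table, which may help when reasoning about how the fixed-length encodings concatenate. The following is a minimal sketch, not the library's implementation: the class name SampleTableEncoder and its table are hypothetical, the table is copied by hand from the sample, and an unchecked exception is used for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Commons Text.
class SampleTableEncoder {
    // Table copied by hand from the sample: a->00, b->01, c->0d, d->d.
    static final Map<Integer, String> TABLE = new LinkedHashMap<>();
    static {
        TABLE.put((int) 'a', "00");
        TABLE.put((int) 'b', "01");
        TABLE.put((int) 'c', "0d");
        TABLE.put((int) 'd', "d"); // 'do not encode' letter: maps to itself, length 1
    }

    // Looks up each code point of the input and concatenates the encodings.
    static String encode(String original) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < original.length(); ) {
            int cp = original.codePointAt(i);
            String encoded = TABLE.get(cp);
            if (encoded == null) {
                throw new IllegalArgumentException("Unsupported code point: " + cp);
            }
            sb.append(encoded);
            i += Character.charCount(cp); // step over surrogate pairs as one letter
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("abcd")); // 00010dd
    }
}
```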

#ThreadSafe# AlphabetConverter class methods are thread-safe as they do not change internal state.

Since:
1.0
  • Field Details

    • ARROW

      private static final String ARROW
      Arrow constant, used for converting the object into a string.
    • originalToEncoded

      private final Map<Integer,String> originalToEncoded
      Maps each code point of the original alphabet to its encoded string.
    • encodedToOriginal

      private final Map<String,String> encodedToOriginal
      Maps each encoded string back to the original letter it represents.
    • encodedLetterLength

      private final int encodedLetterLength
      Length of the encoded letter.
  • Constructor Details

    • AlphabetConverter

      private AlphabetConverter(Map<Integer,String> originalToEncoded, Map<String,String> encodedToOriginal, int encodedLetterLength)
      Hidden constructor for alphabet converter. Used by static helper methods.
      Parameters:
      originalToEncoded - map from each code point of the original alphabet to its encoded string.
      encodedToOriginal - map from each encoded string back to the original letter.
      encodedLetterLength - length of the encoded letter.
  • Method Details

    • codePointToString

      private static String codePointToString(int i)
      Creates new String that contains just the given code point.
      Parameters:
      i - code point.
      Returns:
      a new string containing the given code point.
      See Also:
      • "http://www.oracle.com/us/technologies/java/supplementary-142654.html"
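A code point outside the BMP needs two chars (a surrogate pair), so it cannot be turned into a String by a simple char cast. One way to build such a string with the standard library is sketched below; the class name CodePointStrings is hypothetical, and this is not necessarily how the private method is written.

```java
// Hypothetical illustration class; not part of Commons Text.
class CodePointStrings {
    // Character.toChars returns one char for BMP code points
    // and a surrogate pair for supplementary code points.
    static String codePointToString(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        System.out.println(codePointToString(0x61));             // a
        System.out.println(codePointToString(0x1F600).length()); // 2
    }
}
```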
    • convertCharsToIntegers

      private static Integer[] convertCharsToIntegers(Character[] chars)
      Converts characters to integers.
      Parameters:
      chars - array of characters.
      Returns:
      an equivalent array of integers.
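The conversion amounts to widening each char to its int value. A minimal sketch, assuming a non-null input array with non-null elements (the class name CharArrays is hypothetical):

```java
// Hypothetical illustration class; not part of Commons Text.
class CharArrays {
    // Widens each Character to the int value of its UTF-16 code unit.
    static Integer[] convertCharsToIntegers(Character[] chars) {
        Integer[] result = new Integer[chars.length];
        for (int i = 0; i < chars.length; i++) {
            result[i] = (int) chars[i].charValue();
        }
        return result;
    }

    public static void main(String[] args) {
        Integer[] ints = convertCharsToIntegers(new Character[] {'a', 'b'});
        System.out.println(ints[0] + " " + ints[1]); // 97 98
    }
}
```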
    • createConverter

      public static AlphabetConverter createConverter(Integer[] original, Integer[] encoding, Integer[] doNotEncode)
      Creates an alphabet converter, for converting from the original alphabet, to the encoded alphabet, while leaving the characters in doNotEncode as they are (if possible).

      Duplicate letters in either original or encoding will be ignored.

      Parameters:
      original - an array of ints representing the original alphabet in code points.
      encoding - an array of ints representing the alphabet to be used for encoding, in code points.
      doNotEncode - an array of ints representing the chars to be left unencoded, that is, kept as they are in the original alphabet; every char here must appear in both of the previous params.
      Returns:
      The AlphabetConverter.
      Throws:
      IllegalArgumentException - if an AlphabetConverter cannot be constructed.
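The two input rules stated above (duplicates are ignored, and every doNotEncode letter must appear in both alphabets) can be sketched with ordered sets. This is only an illustration of those rules under the assumption that a violation is reported with IllegalArgumentException; the class AlphabetChecks and its method names are hypothetical, and the library may reject inputs for further reasons not covered here.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration class; not part of Commons Text.
class AlphabetChecks {
    // Drops duplicate code points while keeping first-seen order.
    static Set<Integer> distinct(Integer[] codePoints) {
        return new LinkedHashSet<>(Arrays.asList(codePoints));
    }

    // Throws if a doNotEncode letter is missing from either alphabet.
    static void checkDoNotEncode(Integer[] original, Integer[] encoding, Integer[] doNotEncode) {
        Set<Integer> originalSet = distinct(original);
        Set<Integer> encodingSet = distinct(encoding);
        for (Integer cp : distinct(doNotEncode)) {
            if (!originalSet.contains(cp) || !encodingSet.contains(cp)) {
                throw new IllegalArgumentException(
                    "doNotEncode code point " + cp + " must appear in both alphabets");
            }
        }
    }

    public static void main(String[] args) {
        Integer[] original = {(int) 'a', (int) 'b', (int) 'c', (int) 'd'};
        Integer[] encoding = {(int) '0', (int) '1', (int) 'd'};
        Integer[] doNotEncode = {(int) 'd'};
        checkDoNotEncode(original, encoding, doNotEncode); // passes silently
        System.out.println(distinct(new Integer[] {97, 97, 98}).size()); // 2
    }
}
```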
    • createConverterFromChars

      public static AlphabetConverter createConverterFromChars(Character[] original, Character[] encoding, Character[] doNotEncode)
      Creates an alphabet converter, for converting from the original alphabet, to the encoded alphabet, while leaving the characters in doNotEncode as they are (if possible).

      Duplicate letters in either original or encoding will be ignored.

      Parameters:
      original - an array of chars representing the original alphabet.
      encoding - an array of chars representing the alphabet to be used for encoding.
      doNotEncode - an array of chars to be left unencoded, that is, kept as they are in the original alphabet; every char here must appear in both of the previous params.
      Returns:
      The AlphabetConverter.
      Throws:
      IllegalArgumentException - if an AlphabetConverter cannot be constructed.
    • createConverterFromMap

      public static AlphabetConverter createConverterFromMap(Map<Integer,String> originalToEncoded)
      Creates a new converter from a map.
      Parameters:
      originalToEncoded - a map returned from getOriginalToEncoded().
      Returns:
      The reconstructed AlphabetConverter.
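A converter can plausibly be rebuilt from this one map because the other two pieces of state listed under Field Details, the reverse map and the encoded letter length, are derivable from it. The sketch below shows that derivation under this assumption; the class MapInversion and its method names are hypothetical and this is not presented as the library's code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Commons Text.
class MapInversion {
    // Builds encoded string -> original letter from code point -> encoded string.
    static Map<String, String> invert(Map<Integer, String> originalToEncoded) {
        Map<String, String> encodedToOriginal = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> e : originalToEncoded.entrySet()) {
            String originalLetter = new String(Character.toChars(e.getKey()));
            encodedToOriginal.put(e.getValue(), originalLetter);
        }
        return encodedToOriginal;
    }

    // The fixed encoded length is the longest value; 'do not encode' letters have length 1.
    static int longestEncoding(Map<Integer, String> originalToEncoded) {
        int longest = 1;
        for (String encoded : originalToEncoded.values()) {
            longest = Math.max(longest, encoded.length());
        }
        return longest;
    }

    public static void main(String[] args) {
        Map<Integer, String> map = new LinkedHashMap<>();
        map.put((int) 'a', "00");
        map.put((int) 'd', "d");
        System.out.println(invert(map).get("00")); // a
        System.out.println(longestEncoding(map));  // 2
    }
}
```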
    • addSingleEncoding

      private void addSingleEncoding(int level, String currentEncoding, Collection<Integer> encoding, Iterator<Integer> originals, Map<Integer,String> doNotEncodeMap)
      Recursive method used when creating encoder/decoder.
      Parameters:
      level - at which point it should add a single encoding.
      currentEncoding - current encoding.
      encoding - letters encoding.
      originals - original values.
      doNotEncodeMap - map of values that should not be encoded.
    • decode

      public String decode(String encoded) throws UnsupportedEncodingException
      Decodes a given string.
      Parameters:
      encoded - a string that has been encoded using this AlphabetConverter.
      Returns:
      The decoded string, or null if the given string is null.
      Throws:
      UnsupportedEncodingException - if unexpected characters that cannot be handled are encountered.
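Decoding a mix of fixed-length groups and single unencoded chars can be sketched as follows, using the table from the class-level sample (a->00, b->01, c->0d, d->d). In that table no encoded group starts with the unencoded letter d, so a leading d can always be read as a single char. The class SampleTableDecoder is hypothetical, the table is hand-written, and an unchecked exception stands in for UnsupportedEncodingException for brevity; this is not presented as the library's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of Commons Text.
class SampleTableDecoder {
    static final int ENCODED_LETTER_LENGTH = 2;
    static final Map<String, String> ENCODED_TO_ORIGINAL = new HashMap<>();
    static {
        ENCODED_TO_ORIGINAL.put("00", "a");
        ENCODED_TO_ORIGINAL.put("01", "b");
        ENCODED_TO_ORIGINAL.put("0d", "c");
        ENCODED_TO_ORIGINAL.put("d", "d"); // 'do not encode' letter
    }

    static String decode(String encoded) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < encoded.length()) {
            String single = encoded.substring(i, i + 1);
            // A 'do not encode' letter maps to itself and occupies one char.
            if (single.equals(ENCODED_TO_ORIGINAL.get(single))) {
                sb.append(single);
                i++;
                continue;
            }
            // Otherwise consume one full fixed-length group.
            if (i + ENCODED_LETTER_LENGTH > encoded.length()) {
                throw new IllegalArgumentException("Truncated group at index " + i);
            }
            String group = encoded.substring(i, i + ENCODED_LETTER_LENGTH);
            String original = ENCODED_TO_ORIGINAL.get(group);
            if (original == null) {
                throw new IllegalArgumentException("Unknown group '" + group + "' at index " + i);
            }
            sb.append(original);
            i += ENCODED_LETTER_LENGTH;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("00010dd")); // abcd
    }
}
```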
    • encode

      public String encode(String original) throws UnsupportedEncodingException
      Encodes a given string.
      Parameters:
      original - the string to be encoded.
      Returns:
      The encoded string, or null if the given string is null.
      Throws:
      UnsupportedEncodingException - if chars that are not supported are encountered.
    • equals

      public boolean equals(Object obj)
      Overrides:
      equals in class Object
    • getEncodedCharLength

      public int getEncodedCharLength()
      Gets the number of characters from the encoded alphabet needed to represent each character of the original alphabet.
      Returns:
      The length of the encoded char.
    • getOriginalToEncoded

      public Map<Integer,String> getOriginalToEncoded()
      Gets the mapping from each integer code point of the source language to its encoded string. Use it to reconstruct a converter from a serialized map.
      Returns:
      The original map.
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • toString

      public String toString()
      Overrides:
      toString in class Object