Class DamerauLevenshteinDistance

java.lang.Object
org.apache.commons.text.similarity.DamerauLevenshteinDistance
All Implemented Interfaces:
BiFunction<CharSequence, CharSequence, Integer>, EditDistance<Integer>, ObjectSimilarityScore<CharSequence, Integer>, SimilarityScore<Integer>

public class DamerauLevenshteinDistance extends Object implements EditDistance<Integer>
An algorithm for measuring the difference between two character sequences using the Damerau-Levenshtein Distance.

This is the number of changes needed to change one sequence into another, where each change is a single character modification (deletion, insertion, substitution, or transposition of two adjacent characters).
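The definition above can be illustrated with a minimal, self-contained sketch of the optimal string alignment variant, which the unlimitedCompare entry on this page names. The class `OsaDistanceSketch` and its `distance` method are hypothetical names introduced for illustration; this is not the library's implementation.

```java
/** Hypothetical illustration class; not part of Apache Commons Text. */
final class OsaDistanceSketch {

    /** Optimal string alignment distance between two non-null sequences. */
    static int distance(CharSequence left, CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("Inputs must not be null");
        }
        int n = left.length();
        int m = right.length();
        // d[i][j] = distance between the first i chars of left and the first j chars of right
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) {
            d[i][0] = i; // delete i characters
        }
        for (int j = 0; j <= m; j++) {
            d[0][j] = j; // insert j characters
        }
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1), // deletion, insertion
                        d[i - 1][j - 1] + cost);                    // substitution or match
                // Transposition of two adjacent characters
                if (i > 1 && j > 1
                        && left.charAt(i - 1) == right.charAt(j - 2)
                        && left.charAt(i - 2) == right.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[n][m];
    }

    public static void main(String[] args) {
        System.out.println(distance("ab", "ba"));          // 1: one transposition
        System.out.println(distance("kitten", "sitting")); // 3
        System.out.println(distance("ca", "abc"));         // 3 under optimal string alignment
    }
}
```

The last call shows where optimal string alignment differs from the unrestricted Damerau-Levenshtein distance: because no substring may be edited more than once, "ca" to "abc" costs 3 rather than 2.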

Since:
1.15.0
  • Field Details

    • threshold

      private final Integer threshold
      The distance threshold, or null if distance calculations are not limited.
  • Constructor Details

    • DamerauLevenshteinDistance

      public DamerauLevenshteinDistance()
      Constructs a default instance that uses the unlimited version of the algorithm (no threshold).
    • DamerauLevenshteinDistance

      public DamerauLevenshteinDistance(Integer threshold)
      Constructs a new instance. If the threshold is not null, distance calculations are limited to that maximum distance. If the threshold is null, the unlimited version of the algorithm is used.
      Parameters:
      threshold - the maximum distance to compute, or null if distance calculations should not be limited. Must not be negative.
  • Method Details

    • clampDistance

      private static int clampDistance(int distance, int threshold)
      Utility function to ensure distance is valid according to threshold.
      Parameters:
      distance - The distance value.
      threshold - The threshold value.
      Returns:
      The distance value, or -1 if distance is greater than threshold.
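As a minimal sketch of the contract described above (the library method is private, so this is an assumption about its behavior based only on the documented return value), clamping might look like the following. The class name `ClampSketch` is hypothetical.

```java
/** Hypothetical illustration class; not part of Apache Commons Text. */
final class ClampSketch {

    /** Returns distance unchanged, or -1 if it is greater than threshold. */
    static int clampDistance(int distance, int threshold) {
        return distance > threshold ? -1 : distance;
    }

    public static void main(String[] args) {
        System.out.println(clampDistance(2, 3)); // 2
        System.out.println(clampDistance(3, 3)); // 3: equal to the threshold is still valid
        System.out.println(clampDistance(4, 3)); // -1
    }
}
```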
    • limitedCompare

      private static <E> int limitedCompare(SimilarityInput<E> left, SimilarityInput<E> right, int threshold)
      Finds the Damerau-Levenshtein distance between two inputs if it is less than or equal to the given threshold.
      Parameters:
      left - the first SimilarityInput, must not be null.
      right - the second SimilarityInput, must not be null.
      threshold - the target threshold, must not be negative.
      Returns:
      result distance, or -1 if distance exceeds threshold.
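A threshold-limited comparison can be sketched as follows, assuming the optimal string alignment recurrence and plain CharSequence inputs rather than SimilarityInput. The class `LimitedOsaSketch` and its method are hypothetical, and the library's private method may be organized differently. The sketch uses one certain shortcut: if the lengths differ by more than the threshold, at least that many insertions or deletions are needed, so the result is -1 without filling any table.

```java
/** Hypothetical illustration class; not part of Apache Commons Text. */
final class LimitedOsaSketch {

    /** Returns the distance if it is at most threshold, otherwise -1. */
    static int limitedDistance(CharSequence left, CharSequence right, int threshold) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("Inputs must not be null");
        }
        if (threshold < 0) {
            throw new IllegalArgumentException("Threshold must not be negative");
        }
        int n = left.length();
        int m = right.length();
        // The length difference is a lower bound on the distance.
        if (Math.abs(n - m) > threshold) {
            return -1;
        }
        // Three rolling rows: the transposition case looks two rows back.
        int[] prev2 = new int[m + 1];
        int[] prev = new int[m + 1];
        int[] curr = new int[m + 1];
        for (int j = 0; j <= m; j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= n; i++) {
            curr[0] = i;
            for (int j = 1; j <= m; j++) {
                int cost = left.charAt(i - 1) == right.charAt(j - 1) ? 0 : 1;
                int best = Math.min(Math.min(prev[j] + 1, curr[j - 1] + 1), prev[j - 1] + cost);
                if (i > 1 && j > 1
                        && left.charAt(i - 1) == right.charAt(j - 2)
                        && left.charAt(i - 2) == right.charAt(j - 1)) {
                    best = Math.min(best, prev2[j - 2] + 1);
                }
                curr[j] = best;
            }
            // Rotate the rows so that prev always holds the most recently completed row.
            int[] recycled = prev2;
            prev2 = prev;
            prev = curr;
            curr = recycled;
        }
        int distance = prev[m];
        return distance > threshold ? -1 : distance;
    }

    public static void main(String[] args) {
        System.out.println(limitedDistance("kitten", "sitting", 3)); // 3
        System.out.println(limitedDistance("kitten", "sitting", 2)); // -1
        System.out.println(limitedDistance("a", "abcd", 2));         // -1: lengths differ by 3
    }
}
```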
    • unlimitedCompare

      private static <E> int unlimitedCompare(SimilarityInput<E> left, SimilarityInput<E> right)
      Finds the Damerau-Levenshtein distance between two inputs using optimal string alignment.
      Parameters:
      left - the first SimilarityInput, must not be null.
      right - the second SimilarityInput, must not be null.
      Returns:
      result distance.
      Throws:
      IllegalArgumentException - if either input is null.
    • apply

      public Integer apply(CharSequence left, CharSequence right)
      Computes the Damerau-Levenshtein distance between two character sequences.

      A higher score indicates a greater distance.

      Specified by:
      apply in interface BiFunction<CharSequence, CharSequence, Integer>
      Specified by:
      apply in interface ObjectSimilarityScore<CharSequence, Integer>
      Specified by:
      apply in interface SimilarityScore<Integer>
      Parameters:
      left - the first input, must not be null.
      right - the second input, must not be null.
      Returns:
      result distance, or -1 if threshold is exceeded.
      Throws:
      IllegalArgumentException - if either input is null.
    • apply

      public <E> Integer apply(SimilarityInput<E> left, SimilarityInput<E> right)
      Computes the Damerau-Levenshtein distance between two inputs.

      A higher score indicates a greater distance.

      Type Parameters:
      E - The type of similarity score unit.
      Parameters:
      left - the first input, must not be null.
      right - the second input, must not be null.
      Returns:
      result distance, or -1 if threshold is exceeded.
      Throws:
      IllegalArgumentException - if either input is null.
      Since:
      1.13.0
    • getThreshold

      public Integer getThreshold()
      Gets the distance threshold.
      Returns:
      The distance threshold.
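Because apply returns -1 when the threshold is exceeded, callers that rank candidates need to treat -1 as "no result" rather than as the smallest distance. The sketch below illustrates that handling against the BiFunction<CharSequence, CharSequence, Integer> interface this class implements. `ClosestMatchSketch`, `closest`, and the stand-in `LENGTH_GAP_MAX_2` function are hypothetical names; the stand-in only mimics the -1 convention and is not an edit distance.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.BiFunction;

/** Hypothetical illustration class; not part of Apache Commons Text. */
final class ClosestMatchSketch {

    /** Stand-in scorer: length difference, or -1 when it exceeds 2 (mimics a threshold of 2). */
    static final BiFunction<CharSequence, CharSequence, Integer> LENGTH_GAP_MAX_2 = (a, b) -> {
        int gap = Math.abs(a.length() - b.length());
        return gap > 2 ? -1 : gap;
    };

    /** Returns the first candidate with the smallest non-negative distance, if any. */
    static Optional<String> closest(String query, List<String> candidates,
            BiFunction<CharSequence, CharSequence, Integer> distance) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String candidate : candidates) {
            int d = distance.apply(query, candidate);
            if (d < 0) {
                continue; // -1 means the threshold was exceeded, not "very close"
            }
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("caterpillar", "cart", "ca");
        System.out.println(closest("cat", words, LENGTH_GAP_MAX_2)); // Optional[cart]
        System.out.println(closest("cat", Arrays.asList("caterpillar"), LENGTH_GAP_MAX_2)); // Optional.empty
    }
}
```

Any function with the same signature, including an instance of the class documented here, could be passed as the third argument.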