Class CosineSimilarity

java.lang.Object
org.apache.commons.text.similarity.CosineSimilarity

public class CosineSimilarity extends Object
Measures the Cosine similarity of two vectors of an inner product space and compares the angle between them.

For further explanation about the Cosine Similarity, refer to https://en.wikipedia.org/wiki/Cosine_similarity.

Instances of this class are immutable and are safe for use by multiple concurrent threads.

Since:
1.0
  • Field Details

  • Constructor Details

    • CosineSimilarity

      public CosineSimilarity()
      Constructs a new instance.
  • Method Details

    • cosineSimilarity

      public Double cosineSimilarity(Map<CharSequence, Integer> leftVector, Map<CharSequence, Integer> rightVector)
      Calculates the cosine similarity for two given vectors.
      Parameters:
      leftVector - left vector.
      rightVector - right vector.
      Returns:
      cosine similarity between the two vectors.
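
      As an illustration of the calculation, the following is a minimal stand-alone sketch that uses only the JDK. The class CosineSketch and its methods norm and cosine are hypothetical names, not the library's source, and returning 0.0 when either vector has zero magnitude is an assumption of this sketch. It takes the dot product over the shared keys and divides it by the product of the two Euclidean norms.

```java
import java.util.HashMap;
import java.util.Map;

public class CosineSketch {

    // Hypothetical helper (not the library's source): Euclidean norm of a count vector.
    static double norm(Map<CharSequence, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int value : vector.values()) {
            sumOfSquares += (double) value * value;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Cosine similarity = dot(left, right) / (|left| * |right|).
    static double cosine(Map<CharSequence, Integer> left, Map<CharSequence, Integer> right) {
        double dot = 0.0;
        for (Map.Entry<CharSequence, Integer> entry : left.entrySet()) {
            Integer other = right.get(entry.getKey());
            if (other != null) { // only keys present in both vectors contribute
                dot += (double) entry.getValue() * other;
            }
        }
        double denominator = norm(left) * norm(right);
        // Assumption of this sketch: a zero-magnitude vector yields 0.0.
        return denominator == 0.0 ? 0.0 : dot / denominator;
    }

    public static void main(String[] args) {
        Map<CharSequence, Integer> left = new HashMap<>();
        left.put("the", 1);
        left.put("cat", 1);
        Map<CharSequence, Integer> right = new HashMap<>();
        right.put("the", 1);
        right.put("dog", 1);
        // One shared key out of two per vector, so the result is close to 0.5.
        System.out.println(cosine(left, right));
    }
}
```

      The example uses String keys. Other CharSequence implementations do not necessarily define equals and hashCode by content, so mixing key types in one map may produce no matches.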
    • dot

      private double dot(Map<CharSequence, Integer> leftVector, Map<CharSequence, Integer> rightVector, Set<CharSequence> intersection)
      Computes the dot product of two vectors. Only the keys in the intersection contribute; elements that appear in just one vector are ignored. If one vector is longer than the other, only the part of it that overlaps with the other vector is used to compute the dot product.
      Parameters:
      leftVector - left vector.
      rightVector - right vector.
      intersection - common elements.
      Returns:
      The dot product.
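
      Because this method is private, it cannot be called directly. The sketch below shows one way such a dot product over shared keys might look. DotSketch and its dot method are hypothetical names, and accumulating in a long before converting to double is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class DotSketch {

    // Hypothetical stand-in: sums left[k] * right[k] over the shared keys only.
    static double dot(Map<CharSequence, Integer> left,
                      Map<CharSequence, Integer> right,
                      Set<CharSequence> intersection) {
        long sum = 0; // long accumulator to reduce the risk of int overflow
        for (CharSequence key : intersection) {
            sum += (long) left.get(key) * right.get(key);
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<CharSequence, Integer> left = new HashMap<>();
        left.put("a", 2);
        left.put("b", 3);
        left.put("c", 5);
        Map<CharSequence, Integer> right = new HashMap<>();
        right.put("b", 4);
        right.put("c", 1);
        right.put("d", 7);
        Set<CharSequence> common = new HashSet<>();
        common.add("b");
        common.add("c");
        // "a" and "d" are ignored: 3*4 + 5*1 = 17
        System.out.println(dot(left, right, common)); // prints 17.0
    }
}
```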
    • getIntersection

      private Set<CharSequence> getIntersection(Map<CharSequence, Integer> leftVector, Map<CharSequence, Integer> rightVector)
      Returns a set with keys common to the two given maps.
      Parameters:
      leftVector - left vector map.
      rightVector - right vector map.
      Returns:
      a set of the keys common to both maps.
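
      A key intersection of this kind can be sketched with the standard collections API. IntersectionSketch and its intersection method are hypothetical names; the sketch copies one key set and keeps only the keys also present in the other map, which may differ from how the library does it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class IntersectionSketch {

    // Hypothetical stand-in: returns a new set holding the keys present in both maps.
    static Set<CharSequence> intersection(Map<CharSequence, Integer> left,
                                          Map<CharSequence, Integer> right) {
        Set<CharSequence> common = new HashSet<>(left.keySet()); // copy, so the map is not modified
        common.retainAll(right.keySet());
        return common;
    }

    public static void main(String[] args) {
        Map<CharSequence, Integer> left = new HashMap<>();
        left.put("a", 1);
        left.put("b", 1);
        left.put("c", 1);
        Map<CharSequence, Integer> right = new HashMap<>();
        right.put("b", 1);
        right.put("c", 1);
        right.put("d", 1);
        // The result contains "b" and "c"; iteration order is not guaranteed.
        System.out.println(intersection(left, right));
    }
}
```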