Class CosineSimilarity
java.lang.Object
org.apache.commons.text.similarity.CosineSimilarity
Measures the cosine similarity of two vectors of an inner product space, that is, the cosine of the angle between them.
For further explanation about the Cosine Similarity, refer to https://en.wikipedia.org/wiki/Cosine_similarity.
Instances of this class are immutable and are safe for use by multiple concurrent threads.
- Since:
- 1.0
Field Summary
Fields
Modifier and Type | Field | Description
(package private) static final CosineSimilarity | INSTANCE | The singleton instance.
Constructor Summary
Constructors
Constructor | Description
CosineSimilarity() | Constructs a new instance.
Method Summary
Modifier and Type | Method | Description
Double | cosineSimilarity(Map<CharSequence, Integer> leftVector, Map<CharSequence, Integer> rightVector) | Calculates the cosine similarity for two given vectors.
private double | dot(Map<CharSequence, Integer> leftVector, Map<CharSequence, Integer> rightVector, Set<CharSequence> intersection) | Computes the dot product of two vectors.
private Set<CharSequence> | getIntersection(Map<CharSequence, Integer> leftVector, Map<CharSequence, Integer> rightVector) | Returns a set with keys common to the two given maps.
Field Details
INSTANCE
The singleton instance.
Constructor Details
CosineSimilarity
public CosineSimilarity()
Constructs a new instance.
Method Details
cosineSimilarity
public Double cosineSimilarity(Map<CharSequence, Integer> leftVector, Map<CharSequence, Integer> rightVector)
Calculates the cosine similarity for two given vectors.
- Parameters:
- leftVector - left vector.
- rightVector - right vector.
- Returns:
- cosine similarity between the two vectors.
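The calculation described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the class CosineSketch and its methods are hypothetical names, and the choices to reject null arguments and to return 0.0 for an empty or all-zero vector are assumptions made for the example. It applies the standard definition, the dot product divided by the product of the two Euclidean norms.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the Commons Text implementation.
class CosineSketch {

    // cos(theta) = (A . B) / (|A| * |B|)
    static double cosine(Map<CharSequence, Integer> left, Map<CharSequence, Integer> right) {
        if (left == null || right == null) {
            // Assumption: null vectors are rejected outright.
            throw new IllegalArgumentException("Vectors must not be null");
        }
        // Only keys present in both maps contribute to the dot product.
        long dot = 0;
        for (Map.Entry<CharSequence, Integer> entry : left.entrySet()) {
            Integer other = right.get(entry.getKey());
            if (other != null) {
                long l = entry.getValue();
                long r = other;
                dot += l * r;
            }
        }
        double normLeft = norm(left);
        double normRight = norm(right);
        // Assumption: an empty or all-zero vector yields 0.0 instead of NaN.
        if (normLeft == 0.0 || normRight == 0.0) {
            return 0.0;
        }
        return dot / (normLeft * normRight);
    }

    // Euclidean length of a vector stored as key -> count.
    static double norm(Map<CharSequence, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int value : vector.values()) {
            sumOfSquares += (double) value * value;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        Map<CharSequence, Integer> left = new HashMap<>();
        left.put("apple", 1);
        left.put("pie", 1);
        Map<CharSequence, Integer> right = new HashMap<>();
        right.put("pie", 1);
        right.put("chart", 1);
        // One shared key out of two per vector: the result is close to 0.5.
        System.out.println(cosine(left, right));
    }
}
```

Because the counts are non-negative in typical term-frequency use, the result falls between 0 (no shared keys) and 1 (same direction).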
dot
private double dot(Map<CharSequence, Integer> leftVector, Map<CharSequence, Integer> rightVector, Set<CharSequence> intersection)
Computes the dot product of two vectors over their common keys only. Elements outside the intersection are ignored, so if one vector is longer than the other, only the part of it that overlaps with the other is used.
- Parameters:
- leftVector - left vector.
- rightVector - right vector.
- intersection - common elements.
- Returns:
- The dot product.
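Since dot is private, its behavior can only be illustrated with a stand-in. The sketch below assumes the intersection set contains only keys present in both maps; the class DotSketch is a hypothetical name, and accumulating in a long to avoid int overflow is a choice made for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the private dot method.
class DotSketch {

    static double dot(Map<CharSequence, Integer> left,
                      Map<CharSequence, Integer> right,
                      Set<CharSequence> intersection) {
        long sum = 0;
        // Keys outside the intersection are never visited, so they contribute nothing.
        for (CharSequence key : intersection) {
            long l = left.get(key);
            long r = right.get(key);
            sum += l * r;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<CharSequence, Integer> left = new HashMap<>();
        left.put("a", 2);
        left.put("b", 3);
        left.put("c", 1);
        Map<CharSequence, Integer> right = new HashMap<>();
        right.put("b", 4);
        right.put("c", 5);
        right.put("d", 7);
        Set<CharSequence> common = new HashSet<>();
        common.add("b");
        common.add("c");
        // 3 * 4 + 1 * 5 = 17; "a" and "d" are ignored.
        System.out.println(dot(left, right, common)); // prints 17.0
    }
}
```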
getIntersection
private Set<CharSequence> getIntersection(Map<CharSequence, Integer> leftVector, Map<CharSequence, Integer> rightVector)
Returns a set with keys common to the two given maps.
- Parameters:
- leftVector - left vector map.
- rightVector - right vector map.
- Returns:
- common strings.
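One common way to compute such a key intersection is to copy one key set and retain only the keys also found in the other. The sketch below shows that approach under the assumption that both maps are non-null; IntersectionSketch is a hypothetical name, and the private method may be written differently.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the private getIntersection method.
class IntersectionSketch {

    static Set<CharSequence> intersection(Map<CharSequence, Integer> left,
                                          Map<CharSequence, Integer> right) {
        // Copy first so that neither input map is modified.
        Set<CharSequence> common = new HashSet<>(left.keySet());
        common.retainAll(right.keySet());
        return common;
    }

    public static void main(String[] args) {
        Map<CharSequence, Integer> left = new HashMap<>();
        left.put("a", 1);
        left.put("b", 2);
        Map<CharSequence, Integer> right = new HashMap<>();
        right.put("b", 3);
        right.put("c", 4);
        System.out.println(intersection(left, right)); // prints [b]
    }
}
```

Key matching relies on equals and hashCode, which the CharSequence interface does not define. A String key and a StringBuilder key with the same characters are therefore not treated as common, so using one key type (typically String) in both maps is the safer choice.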