Description: An extension component that calculates a similarity score between two strings. A score of 0.0 means the two strings are completely dissimilar, and 1.0 means they are fully similar (or equal). Any value in between indicates how similar the two strings are.
All the blocks :
Instructions of extension :
feature : String
target : String
algorithm : property
### Algorithms
1. JaroSimilarity
2. JaroWinklerSimilarity
3. LevenshteinDistance
4. DiceCoefficient
Return score
A score of 0.0 means the two strings are completely dissimilar, and 1.0 means they are fully similar (or equal). Any value in between indicates how similar the two strings are.
Algorithms (property)
In computer science and statistics, the Jaro–Winkler similarity is a string metric measuring an edit distance between two sequences. It is a variant of the Jaro distance metric (1989, Matthew A. Jaro) proposed in 1990 by William E. Winkler.
The Jaro–Winkler distance uses a prefix scale p which gives more favourable ratings to strings that match from the beginning for a set prefi...
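To make the excerpt above more concrete, here is a minimal Python sketch of Jaro and Jaro–Winkler similarity. It is not the extension's source code (which is not shown in this post); the function names are made up for illustration, and the prefix scale `p = 0.1` with a prefix cap of 4 characters are commonly used values that I am assuming here, not settings confirmed for the extension.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Jaro similarity in [0.0, 1.0] (illustrative sketch)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Two equal characters count as a match only within this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Walk the matched characters in order and count those out of position.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2  # number of transpositions
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3.0


def jaro_winkler_similarity(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Boost the Jaro score for strings sharing a common prefix.

    p and max_prefix are assumed defaults, not values taken from the extension.
    """
    jaro = jaro_similarity(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return jaro + prefix * p * (1.0 - jaro)
```

For example, `jaro_similarity("MARTHA", "MARHTA")` gives about 0.944, and the shared prefix "MAR" lifts the Jaro–Winkler score to about 0.961.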
In information theory, linguistics, and computer science, the Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. It is named after the Soviet mathematician Vladimir Levenshtein, who considered this distance in 1965.
Leven...
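As a rough illustration of the definition above, the following Python sketch computes the Levenshtein distance with the usual dynamic-programming table, keeping only two rows. The extension returns a score between 0.0 and 1.0, but the post does not say how the raw distance is converted; dividing by the longer string's length, as `levenshtein_similarity` does below, is my assumption. Both function names are hypothetical.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as equal
    return 1.0 - levenshtein_distance(a, b) / longest
```

With this sketch, "kitten" and "sitting" are 3 edits apart, which gives a similarity of 1 - 3/7, or about 0.571.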
The Sørensen–Dice coefficient (see below for other names) is a statistic used to gauge the similarity of two samples. It was independently developed by the botanists Thorvald Sørensen and Lee Raymond Dice, who published in 1948 and 1945 respectively.
The index is known by several other names, especially Sørensen–Dice index, Sørensen index and Dice's coefficient. Other variations include the "similarity coefficient" or "index", such as Dice simila...
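For strings, the Sørensen–Dice coefficient is often applied to sets or multisets of character bigrams. The Python sketch below takes that approach; whether the extension also compares bigrams is an assumption on my part, and `bigrams` and `dice_coefficient` are hypothetical names used only for this example.

```python
from collections import Counter


def bigrams(text: str) -> Counter:
    """Multiset of adjacent character pairs (assumed tokenization)."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def dice_coefficient(a: str, b: str) -> float:
    """2 * |shared bigrams| / (|bigrams of a| + |bigrams of b|)."""
    if a == b:
        return 1.0
    pairs_a, pairs_b = bigrams(a), bigrams(b)
    total = sum(pairs_a.values()) + sum(pairs_b.values())
    if total == 0:
        return 0.0  # both strings too short to form any bigram
    shared = sum((pairs_a & pairs_b).values())  # multiset intersection
    return 2.0 * shared / total
```

For example, "night" and "nacht" share only the bigram "ht" out of four bigrams each, so the score is 2 * 1 / 8 = 0.25.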
Error return and source
DEMO BLOCKS :
Download AIX :
com.aemo.similarity.aix (13.7 KB)
Download AIA :
similarity.aia (15.9 KB)
Nice extension @MahmoudHussien