Tools for measuring similarity among documents and detecting reused passages. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity and dissimilarity functions; pairwise comparisons; minhash and locality-sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
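
To make two of the ideas named above concrete, here is a minimal sketch in Python of shingled word n-grams combined with a Jaccard similarity score. It is an illustration only, not the package's own code or API: the function names `shingles` and `jaccard_similarity`, the default of n = 3, and the whitespace tokenization are all assumptions chosen for the example.

```python
def shingles(text, n=3):
    """Return the set of overlapping word n-grams (shingles) in text.

    Assumption for this sketch: tokens are lowercased, whitespace-separated words.
    """
    words = text.lower().split()
    # A text shorter than n words yields an empty set.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"

# The documents differ in one word, which changes three of the seven shingles.
score = jaccard_similarity(shingles(doc_a), shingles(doc_b))
print(score)  # 4 shared shingles out of 10 distinct ones -> 0.4
```

Comparing sets of shingles rather than single words keeps some word order in the comparison, so passages that reuse the same phrasing score higher than passages that merely share vocabulary.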

License: MIT
Tags: rlang, cran
HomePage: https://github.com/ropensci/textreuse
Ranking: #350918 in MvnRepository
Used By: 1 artifact

Version     Vulnerabilities  Repository    Usages  Date
0.1.x
0.1.4-b15   -                BeDataDriven  -       May 29, 2022
0.1.4-b14   -                BeDataDriven  0       May 29, 2022
0.1.4-b13   -                BeDataDriven  0       May 29, 2022
0.1.4-b12   -                BeDataDriven  0       May 29, 2022
0.1.4-b11   -                BeDataDriven  0       May 29, 2022
0.1.4-b10   -                BeDataDriven  0       May 29, 2022
0.1.4-b9    -                BeDataDriven  0       May 29, 2022
0.1.4-b8    -                BeDataDriven  0       May 29, 2022
0.1.4-b7    -                BeDataDriven  0       May 29, 2022
0.1.4-b6    -                BeDataDriven  0       May 29, 2022
0.1.4-b4    -                BeDataDriven  0       May 29, 2022