It offers functions for splitting, parsing, and tokenizing big text data files and for creating a vocabulary from them. It also includes functions for building a document-term matrix and extracting information from it (term associations, most frequent terms), for calculating token statistics (collocations, look-up tables, string dissimilarities), and for working with sparse matrices. Lastly, it includes functions for Word Vector Representations (i.e. 'GloVe', 'fasttext') and ...

License: GPL 3.0
Categories: Word Embedding
Tags: nlp, embedding, text, rlang, cran, ai, word
HomePage: https://github.com/mlampros/textTinyR
Date: May 12, 2022
Files: pom (6 KB), jar (285 KB)
Repositories: BeDataDriven
Ranking: #726929 in MvnRepository, #14 in Word Embedding

Note: There is a new version for this artifact

New Version: 1.1.0-b2

Note: this artifact is located at BeDataDriven repository (https://nexus.bedatadriven.com/content/groups/public/)

Provided Dependencies (4)

Test Dependencies (1)

Category/License | Group / Artifact | Version | Updates
MIT | org.renjin.cran » testthat | 1.0.2-renjin-14 | 1.0.2-renjin-17

Licenses

License | URL
GPL-3 |

Developers

Name | Email | Dev Id | Roles | Organization
Lampros Mouselimis | mouselimislampros<at>gmail.com | | |