Content extraction simplified! Retrieve text, data and metadata from binary documents using Tika and similar toolkits

License: Apache 2.0
HomePage: https://opensextant.github.io/XText
Date: Jun 23, 2023
Files: pom (16 KB), jar (108 KB)
Repositories: Central
Ranking: #673739 in MvnRepository
Vulnerabilities from dependencies:
CVE-2024-47554
CVE-2024-26308
CVE-2024-25710
(1 more not listed)

Note: A newer version of this artifact is available: 3.7.1


Compile Dependencies (27)

| Category | License | Group » Artifact | Version | Updates | Vulnerabilities |
|---|---|---|---|---|---|
| Logging | EPL 1.0, LGPL 2.1 | ch.qos.logback » logback-classic | 1.4.4 | 1.5.12 | 1 |
| I18N Lib | | com.ibm.icu » icu4j | 71.1 | 76.1 | |
| | Apache 2.0 | com.pff » java-libpst | 0.9.3 | | |
| Mail Client | EDL 1.0, EPL 2.0, GPL | com.sun.mail » javax.mail | 1.6.2 | 2.0.1 | |
| Base64 | Apache 2.0 | commons-codec » commons-codec | 1.15 | 1.17.1 | |
| I/O | Apache 2.0 | commons-io » commons-io | 2.11.0 | 2.18.0 | 1 |
| Java Spec | EDL 1.0 | javax.activation » activation | 1.1 | 2.1.3 | |
| Date/Time | Apache 2.0 | joda-time » joda-time | 2.10.13 | 2.13.0 | |
| HTML Parser | Apache, EPL 1.0, LGPL | net.htmlparser.jericho » jericho-html | 3.4 | | |
| Core Utils | Apache 2.0 | org.apache.commons » commons-lang3 | 3.12.0 | 3.17.0 | |
| String Utils | Apache 2.0 | org.apache.commons » commons-text | 1.10.0 | 1.12.0 | |
| Compression | Apache 2.0 | org.apache.commons » commons-compress | 1.21 | 1.27.1 | 2 |
| HTTP Clients | Apache 2.0 | org.apache.httpcomponents » httpclient | 4.5.13 | 5.4.1 | |
| HTTP Clients | Apache 2.0 | org.apache.httpcomponents » httpcore | 4.4.16 | 5.3.1 | |
| | Apache 2.0 | org.apache.tika » tika-core | 2.8.0 | 3.0.0 | |
| | Apache 2.0 | org.apache.tika » tika-parser-html-module | 2.8.0 | 3.0.0 | |
| | Apache 2.0 | org.apache.tika » tika-parser-mail-commons | 2.8.0 | 3.0.0 | |
| | Apache 2.0 | org.apache.tika » tika-parser-mail-module | 2.8.0 | 3.0.0 | |
| | Apache 2.0 | org.apache.tika » tika-parser-html-commons | 2.8.0 | 2.9.2 | |
| | Apache 2.0 | org.apache.tika » tika-parser-microsoft-module | 2.8.0 | 3.0.0 | |
| | Apache 2.0 | org.apache.tika » tika-parser-miscoffice-module | 2.8.0 | 3.0.0 | |
| | Apache 2.0 | org.apache.tika » tika-parser-pdf-module | 2.8.0 | 3.0.0 | |
| | Apache 2.0 | org.apache.tika » tika-parser-image-module | 2.8.0 | 3.0.0 | |
| | BSD 2-clause | org.jodd » jodd-json | 5.1.5 | 6.0.3 | |
| | Apache 2.0 | org.opensextant » opensextant-xponents-core | 3.6.2 | 3.7.3 | |
| Logging | MIT | org.slf4j » slf4j-api | 2.0.6 | 2.0.16 | |
| XML Processing | Apache, W3C | xml-apis » xml-apis | 1.4.01 | 2.0.2 | |

Test Dependencies (2)

| Category | License | Group » Artifact | Version | Updates |
|---|---|---|---|---|
| CLI Parser | Apache 2.0 | commons-cli » commons-cli | 1.4 | 1.9.0 |
| Testing | EPL 2.0 | junit » junit | 4.13.1 | 5.11.3 |

Managed Dependencies (3)

Developers

| Name | Email | Dev Id | Roles | Organization |
|---|---|---|---|---|
| Marc Ubaldino | ubaldino<at>mitre.org | | Lead | MITRE |