Content extraction simplified! Retrieve text, data and metadata from binary documents using Tika and similar toolkits

License: Apache 2.0
HomePage: https://opensextant.github.io/XText
Date: Mar 08, 2024
Files: pom (16 KB), jar (111 KB)
Repositories: Central
Ranking: #679371 in MvnRepository

Note: A newer version of this artifact is available: 3.7.1


Compile Dependencies (28)

Category        License                Group » Artifact                                 Version  Updates
Logging         EPL 1.0, LGPL 2.1      ch.qos.logback » logback-classic                 1.4.14   1.5.12
I18N Lib        -                      com.ibm.icu » icu4j                              71.1     76.1
-               Apache 2.0             com.pff » java-libpst                            0.9.3    -
Mail Client     EDL 1.0, EPL 2.0, GPL  com.sun.mail » javax.mail                        1.6.2    2.0.1
Base64          Apache 2.0             commons-codec » commons-codec                    1.16.0   1.17.1
I/O             Apache 2.0             commons-io » commons-io                          2.15.0   2.18.0
Java Spec       EDL 1.0                javax.activation » activation                    1.1.1    2.1.3
Date/Time       Apache 2.0             joda-time » joda-time                            2.12.5   2.13.0
HTML Parser     Apache, EPL 1.0, LGPL  net.htmlparser.jericho » jericho-html            3.4      -
Core Utils      Apache 2.0             org.apache.commons » commons-lang3               3.12.0   3.17.0
String Utils    Apache 2.0             org.apache.commons » commons-text                1.10.0   1.12.0
Compression     Apache 2.0             org.apache.commons » commons-compress            1.26.0   1.27.1
HTTP Clients    Apache 2.0             org.apache.httpcomponents » httpclient           4.5.13   5.4.1
HTTP Clients    Apache 2.0             org.apache.httpcomponents » httpcore             4.4.16   5.3.1
-               Apache 2.0             org.apache.tika » tika-core                      2.9.1    3.0.0
-               Apache 2.0             org.apache.tika » tika-parser-html-module        2.9.1    3.0.0
-               Apache 2.0             org.apache.tika » tika-parser-ocr-module         2.9.1    3.0.0
-               Apache 2.0             org.apache.tika » tika-parser-mail-commons       2.9.1    3.0.0
-               Apache 2.0             org.apache.tika » tika-parser-mail-module        2.9.1    3.0.0
-               Apache 2.0             org.apache.tika » tika-parser-html-commons       2.9.1    2.9.2
-               Apache 2.0             org.apache.tika » tika-parser-microsoft-module   2.9.1    3.0.0
-               Apache 2.0             org.apache.tika » tika-parser-miscoffice-module  2.9.1    3.0.0
-               Apache 2.0             org.apache.tika » tika-parser-pdf-module         2.9.1    3.0.0
-               Apache 2.0             org.apache.tika » tika-parser-image-module       2.9.1    3.0.0
-               BSD 2-clause           org.jodd » jodd-json                             5.1.5    6.0.3
-               Apache 2.0             org.opensextant » opensextant-xponents-core      3.7.0    3.7.3
Logging         MIT                    org.slf4j » slf4j-api                            2.0.12   2.0.16
XML Processing  Apache, W3C            xml-apis » xml-apis                              1.4.01   2.0.2

Test Dependencies (2)

Category        License                Group » Artifact                                 Version  Updates
CLI Parser      Apache 2.0             commons-cli » commons-cli                        1.4      1.9.0
Testing         EPL 2.0                junit » junit                                    4.13.1   5.11.3

Managed Dependencies (3)

Category        License                Group » Artifact                                 Version  Updates
Logging         EPL 1.0, LGPL 2.1      ch.qos.logback » logback-classic                 1.4.14   1.5.12
Testing         EPL 2.0                junit » junit                                    4.13.1   5.11.3
Logging         MIT                    org.slf4j » slf4j-api                            2.0.12   2.0.16

Developers

Name           Email                  Dev Id  Roles  Organization
Marc Ubaldino  ubaldino<at>mitre.org  -       Lead   MITRE