jsoup
jsoup is a Java library that simplifies working with real-world HTML and XML. It offers an easy-to-use API for URL fetching, data parsing, extraction, and manipulation using DOM API methods, CSS, and XPath selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers.
Last Release on Mar 4, 2025
5. LangChain4j :: Document Transformer :: Jsoup (5 usages)
dev.langchain4j » langchain4j-document-transformer-jsoup (Apache license)
LangChain4j :: Document Transformer :: Jsoup
Last Release on Nov 21, 2024
jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do.
Last Release on Apr 21, 2022
DEPRECATED MODULE FOR REMOVAL: Use the W3CDom helper class provided by jsoup instead. Open HTML to PDF is a CSS 2.1 renderer written in Java. This artifact supports converting a jsoup HTML5 document into a DOM supported by Open HTML to PDF.
Last Release on Jul 23, 2019