Group: org.ccil.cowan.tagsoup, Artifact: tagsoup, Version: 1.2

TagSoup

TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML. TagSoup also includes a command-line processor that reads HTML files and can generate either clean HTML or well-formed XML that is a close approximation to XHTML.
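As a rough illustration of the "standard XML tools" point, the sketch below drives an ordinary SAX `DefaultHandler` from a TagSoup parser. It assumes the parser class is `org.ccil.cowan.tagsoup.Parser` (in the package listed on this page) and that it implements the standard `org.xml.sax.XMLReader` interface. The class name `LinkCollector`, the `collect` helper, and the sample HTML are hypothetical and exist only for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: collects the href value of every <a> element reported via SAX. */
final class LinkCollector extends DefaultHandler {
    private final List<String> links = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Check both names so the handler works with or without namespace processing.
        if ("a".equalsIgnoreCase(localName) || "a".equalsIgnoreCase(qName)) {
            String href = atts.getValue("href");
            if (href != null) {
                links.add(href);
            }
        }
    }

    List<String> getLinks() {
        return links;
    }

    /** Feeds an HTML string through any SAX XMLReader and returns the collected links. */
    static List<String> collect(XMLReader reader, String html) throws Exception {
        LinkCollector handler = new LinkCollector();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(html)));
        return handler.getLinks();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: tagsoup-1.2.jar is on the runtime classpath. The parser is
        // loaded reflectively so this file still compiles without the JAR.
        XMLReader reader = (XMLReader) Class.forName("org.ccil.cowan.tagsoup.Parser")
                .getDeclaredConstructor()
                .newInstance();

        // Deliberately malformed input: unquoted attribute, unclosed elements.
        String messy = "<p>Unclosed paragraph <a href=http://example.org/>a link<br>";
        System.out.println(collect(reader, messy));
    }
}
```

Because the handler depends only on the SAX interfaces, the same `LinkCollector` could be attached to any other `XMLReader`; only the line that instantiates the parser is specific to TagSoup.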

Artifact: JAR (88 KB)
Date: Jan 05, 2008
HomePage: http://home.ccil.org/~cowan/XML/tagsoup/

Licenses

License URL
Apache License 2.0 http://www.apache.org/licenses/LICENSE-2.0.txt

Depends on

Group Artifact Version

Used by

Name | Artifact | Group
Ofx4j | ofx4j (used by 9 versions) | net.sf.ofx4j
Camel :: Assembly :: Bundle | camel-bundle (used by 5 versions) | org.apache.camel
Camel :: TagSoup | camel-tagsoup (used by 30 versions) | org.apache.camel
Apache ServiceMix :: Bundles :: Drools | org.apache.servicemix.bundles.drools (used by 1 version) | org.apache.servicemix.bundles
Apache Sling Commons HTML Utilities | org.apache.sling.commons.html (used by 1 version) | org.apache.sling
Apache Tika Parsers | tika-parsers (used by 5 versions) | org.apache.tika
BaseX | basex (used by 1 version) | org.basex
ICal4j | ical4j (used by 3 versions) | org.mnode.ical4j
W3C CSS Validator | css-validator (used by 1 version) | org.w3c.css
Doxia HTML Module | doxia-module-html (used by 1 version) | com.randomnoun.maven.doxia
Randomnoun Common Classes | common-public (used by 1 version) | com.randomnoun.common
CMLXOM | cmlxom (used by 1 version) | org.xml-cml
UPortal Source | uportal-impl (used by 18 versions) | org.jasig.portal
LiveSense :: Service :: XSS Remove | org.liveSense.service.xssRemove (used by 2 versions) | com.github.livesense
Easyjasub Lib | easyjasub-lib (used by 5 versions) | com.github.riccardove.easyjasub
Jgenhtml | jgenhtml (used by 1 version) | com.googlecode.jgenhtml

Packages

org.ccil.cowan.tagsoup
org.ccil.cowan.tagsoup.jaxp