Web Crawlers

Open Source Web Crawler for Java
Last Release on Mar 26, 2018
crawler-commons is a set of reusable Java components that implement functionality common to any web crawler.
Last Release on Oct 10, 2014
A Java crawler for information collection
Last Release on Sep 7, 2017
Easy-to-use, lightweight web crawler
Last Release on Jul 4, 2020
Apache Nutch
Last Release on Apr 25, 2024
Crawls Ajax applications through dynamic analysis and reconstruction of the UI state changes. Crawljax is based on a method that dynamically builds a "state-flow graph" modeling the various navigation paths and states within an Ajax application.
Last Release on Dec 31, 2012

Relocated → com.crawljax » crawljax-core
BUbiNG is an open-source, fully distributed Java crawler.
Last Release on Apr 19, 2019
Charles is a smart web crawling library.
Last Release on Jan 29, 2017
Simple Java (1.6) crawler to crawl web pages on one and the same domain.
Last Release on Feb 8, 2014

10. Alida

alida » alida (License: Apache)

Crawling, scraping and indexing application.
Last Release on Aug 7, 2012