Apache Nutch is an extensible and scalable web crawler
Spark-Crawler: an Apache Nutch-like crawler that runs on Apache Spark
Viewers for statistics and dashboarding of Domain Search Engine data
An OCR Search Engine with Tesseract, Nutch, Solr, and PHP