A Python project to crawl and scrape the lesser-known deep web or, as some call it, the dark web. Just provide an onion link and get started.
A basic scraper written in Python with BeautifulSoup and Tor support to -
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
[sudo] apt-get install python3 python3-dev
You can install Tor from the project website: https://www.torproject.org/
Then install the dependencies listed in requirements.txt using pip3:
[sudo] pip3 install -r requirements.txt
TL;DR: We recommend installing TorScrapper inside a virtual environment on all platforms.
Python packages can be installed either globally (a.k.a. system-wide) or in user space. We do not recommend installing TorScrapper system-wide.
Instead, we recommend that you install TorScrapper within a so-called "virtual environment" (virtualenv). Virtualenvs keep your packages from conflicting with already-installed Python system packages (which could break some of your system tools and scripts), while still letting you install packages normally with pip (without sudo and the like).
To get started with virtual environments, see virtualenv installation instructions. To install it globally (having it globally installed actually helps here), it should be a matter of running:
[sudo] pip install virtualenv
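As an alternative that needs no extra install, Python's standard-library venv module can create an equivalent environment. The following is a minimal sketch under that assumption, not the project's documented method; the helper name create_env and the directory name .venv are hypothetical.

```python
import venv
from pathlib import Path


def create_env(env_dir=".venv", with_pip=True):
    """Create a virtual environment at env_dir and return its absolute path.

    Uses the stdlib venv module (an assumption; the README itself
    refers to the separate virtualenv package).
    """
    env_path = Path(env_dir).resolve()
    # with_pip=True bootstraps pip inside the environment so that
    # requirements.txt can be installed there without sudo.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

After calling create_env(), the environment can be activated on Linux with "source .venv/bin/activate", and "pip3 install -r requirements.txt" then installs into it rather than system-wide.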
Before you run TorScrapper, make sure the following steps are done:
Run the Tor service
sudo service tor start
Set a password for Tor
tor --hash-password "my_password"
Enter the password in /Modules/Scrape.py
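from stem import Signal
from stem.control import Controller

with Controller.from_port(port=9051) as controller:
    # authenticate() expects the plaintext password; the hash belongs in torrc
    controller.authenticate("my_password")
    # Ask Tor to switch to a new circuit (new identity)
    controller.signal(Signal.NEWNYM)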
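Open /etc/tor/torrc and uncomment the line ControlPort 9051. For password authentication, also set HashedControlPassword to the hash printed by tor --hash-password.
Read more about torrc here: Torrc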
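To confirm the checklist above before starting a crawl, a quick pre-flight check can test whether anything is listening on the ControlPort. This is a minimal sketch using only the standard library; the function name control_port_open is hypothetical, and it verifies reachability only, not that the password is accepted.

```python
import socket


def control_port_open(host="127.0.0.1", port=9051, timeout=2.0):
    """Return True if a TCP connection to the Tor ControlPort succeeds."""
    try:
        # create_connection raises OSError if nothing is listening
        # on the port or the attempt times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If control_port_open() returns False, the Tor service is probably not running or ControlPort 9051 is still commented out in torrc.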
A step-by-step series of examples showing what you have to do to get this project running:
[nano]/[vim]/[gedit]/[Your choice of editor] onions.txt
[sudo] python3 TorScrapper.py
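The README does not spell out the layout of onions.txt. Assuming one onion link per line, the sketch below shows how such a file could be read and filtered before crawling. The function load_onion_links, the "#" comment convention, and the default http:// scheme are all assumptions for illustration, not TorScrapper's actual behavior.

```python
from urllib.parse import urlparse


def load_onion_links(path="onions.txt"):
    """Return the .onion URLs found in path, assuming one link per line."""
    links = []
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            # Skip blank lines and '#' comments (an assumed convention).
            if not line or line.startswith("#"):
                continue
            # Add a scheme when missing so urlparse can find the hostname.
            url = line if "://" in line else f"http://{line}"
            host = urlparse(url).hostname or ""
            # Keep only hosts under the .onion top-level domain.
            if host.endswith(".onion"):
                links.append(url)
    return links
```

Filtering up front keeps clearnet addresses and stray text out of the crawl list, so only onion hosts are requested through Tor.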
If you have a new idea that you think is worth implementing, open a new issue with the title [FEATURE_REQUEST]. If the idea is worth implementing, congratulations, you are now a contributor.