🕷️ An easy-to-use spider written in Golang. (Previously named GOPA.)
GOPA, A Spider Written in Go.
First of all, get it. There are two options: download the pre-built package or compile it yourself.
Go to the Releases page and download the right package for your platform.
Note: the Darwin package is for Mac.
Requirements
Supported platform
Run make build to build Gopa. For example:
#apt install golang-go
#brew install golang
mkdir -p ~/go/src/github.com/infinitbyte/
cd ~/go/src/github.com/infinitbyte/
git clone https://github.com/infinitbyte/gopa.git
cd gopa
make
After a few minutes, you should have:
gopa, the main program, a single binary.
gopa.yml, the main configuration file for Gopa.
Note: the Elasticsearch version should be >= v5.3.
In gopa.yml, update the Elasticsearch settings:
elasticsearch:
- name: default
  enabled: true
  endpoint: http://localhost:9200
  index_prefix: gopa-
  basic_auth:
    username: elastic
    password: changeme
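Before starting Gopa, it can help to confirm that the configured endpoint answers and meets the v5.3 minimum. The following is a minimal sketch, not part of Gopa: the function names `es_version_from_json` and `version_at_least` are hypothetical helpers, and the sketch assumes the Elasticsearch root endpoint returns JSON containing a `"number"` field under `version`, as standard Elasticsearch releases do. Only major and minor components are compared.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the Elasticsearch version; not shipped with Gopa.

# Read the JSON from the Elasticsearch root endpoint on stdin and
# print the value of its "number" field (for example 5.6.3).
es_version_from_json() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed if version $1 (e.g. 5.6.3) is at least version $2 (e.g. 5.3).
# Only the major and minor components are compared; a missing minor counts as 0.
version_at_least() {
  local have_major have_minor want_major want_minor _rest
  IFS=. read -r have_major have_minor _rest <<< "$1"
  IFS=. read -r want_major want_minor _rest <<< "$2"
  have_minor=${have_minor:-0}
  want_minor=${want_minor:-0}
  if [ "$have_major" -ne "$want_major" ]; then
    [ "$have_major" -gt "$want_major" ]
  else
    [ "$have_minor" -ge "$want_minor" ]
  fi
}
```

With the sample settings above, one possible use is to run `curl -s -u elastic:changeme http://localhost:9200 | es_version_from_json`, then pass the result to `version_at_least "$version" 5.3` and stop if it fails.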
Besides Elasticsearch, Gopa has no other dependencies. Simply run ./gopa to start the program.
Gopa can be run as a daemon (note: only available on Linux and Mac). Example:
➜ gopa git:(master) ✗ ./bin/gopa --daemon
  ________ ________ __________  _____
 /  _____/ \_____  \\______   \/  _  \
/   \  ___  /   |   \|     ___/  /_\  \
\    \_\  \/    |    \    |  /    |    \
 \______  /\_______  /____|  \____|__  /
        \/         \/                \/
[gopa] 0.10.0_SNAPSHOT
///last commit: 99616a2, Fri Oct 20 14:04:54 2017 +0200, medcl, update version to 0.10.0 ///
[10-21 16:01:09] [INF] [instance.go:23] workspace: data/gopa/nodes/0
[gopa] started.
Run ./gopa -h to get the full list of command-line options. Example:
➜ gopa git:(master) ✗ ./bin/gopa -h
  ________ ________ __________  _____
 /  _____/ \_____  \\______   \/  _  \
/   \  ___  /   |   \|     ___/  /_\  \
\    \_\  \/    |    \    |  /    |    \
 \______  /\_______  /____|  \____|__  /
        \/         \/                \/
[gopa] 0.10.0_SNAPSHOT
///last commit: 99616a2, Fri Oct 20 14:04:54 2017 +0200, medcl, update version to 0.10.0 ///
Usage of ./bin/gopa:
  -config string
        the location of config file (default "gopa.yml")
  -cpuprofile string
        write cpu profile to this file
  -daemon
        run in background as daemon
  -debug
        run in debug mode, gopa will quit with panic error
  -log string
        the log level,options:trace,debug,info,warn,error (default "info")
  -log_path string
        the log path (default "log")
  -memprofile string
        write memory profile to this file
  -pidfile string
        pidfile path (only for daemon)
  -pprof string
        enable and setup pprof/expvar service, eg: localhost:6060 , the endpoint will be: http://localhost:6060/debug/pprof/ and http://localhost:6060/debug/vars
It is safe to press Ctrl+C to stop a running Gopa. Gopa will handle the rest and save a checkpoint,
so you can restore the job later. The world is still in your hands.
If you are running Gopa as a daemon, you can stop it like this:
kill -QUIT `pgrep gopa`
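Because pgrep can match more than one process, a pidfile is a more targeted way to stop a specific instance. The sketch below is an illustration rather than part of Gopa: `stop_by_pidfile` is a hypothetical helper, and it assumes the daemon was started with the -pidfile option listed in the usage output, pointing at a path of your choosing (gopa.pid here is only an example name). It sends the same QUIT signal as above, then polls until the process has exited.

```shell
#!/usr/bin/env bash
# Hypothetical helper; assumes the daemon was started with -pidfile <path>.

# Send SIGQUIT to the process recorded in a pidfile and wait for it to exit.
# $1 = pidfile path, $2 = timeout in seconds (default 30).
stop_by_pidfile() {
  local pidfile=$1 timeout=${2:-30} pid
  if [ ! -r "$pidfile" ]; then
    echo "pidfile not readable: $pidfile" >&2
    return 1
  fi
  pid=$(cat "$pidfile")
  # Refuse anything that is not a plain process id.
  case $pid in
    ''|*[!0-9]*) echo "invalid pid in $pidfile" >&2; return 1 ;;
  esac
  if ! kill -QUIT "$pid" 2>/dev/null; then
    echo "process $pid is not running" >&2
    return 1
  fi
  # Poll once per second until the process is gone or the timeout expires.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$timeout" -le 0 ]; then
      echo "timed out waiting for process $pid" >&2
      return 1
    fi
    sleep 1
    timeout=$((timeout - 1))
  done
}
```

For example, after starting with `./gopa -daemon -pidfile gopa.pid`, calling `stop_by_pidfile gopa.pid 60` would give Gopa up to a minute to save its checkpoint before the helper reports a timeout.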
http://127.0.0.1:9000/
http://127.0.0.1:9000/admin/
Do you use GOPA and want to be listed here? Contact me.
Released under the Apache License, Version 2.0.