Marple

Summary

Collect links to profiles by username through 10+ search engines (see the full list below).

Features:

multiple engines
proxy support
CSV file export
plugins
- pdf metadata extraction
- social media info extraction

Quick Start

./marple.py soxoj

Results:

https://t.me/soxoj
Contact @soxoj - Telegram

https://github.com/soxoj
soxoj - GitHub

https://coder.social/soxoj
soxoj - Coder Social

https://gitmemory.com/soxoj
soxoj

...

PDF files
https://codeby.net/attachments/v-0-0-1-social-osint-fundamentals-pdf.45770
Social OSINT fundamentals - Codeby.net
/Creator: Google

...

Links: total collected 111 / unique with username in URL 97 / reliable 38 / documents 3

Advanced usage:

./marple.py soxoj --plugins metadata

./marple.py smirnov --engines google baidu -v

Installation

You need Python 3 and pip, plus the dependencies listed in requirements.txt:

pip3 install -r requirements.txt

Some search engines require API keys (see the Requirements column in Supported sources). Export the keys as environment variables, for example:

export YANDEX_KEY=key

Options

You can set the 'junk threshold' with the -t or --threshold option (default 300) to make result filtering stricter or looser.

The junk score is calculated from the length of the link URL and the symbols adjacent to the username within the URL.
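As a rough illustration of the idea, here is a minimal sketch of such a score. It is not Marple's actual implementation: the function names, the penalty values, and the rule that a score at or above the threshold means "junk" are all assumptions made for this example.

```python
# Hypothetical penalties; the real weights used by Marple are not documented here.
ADJACENT_PENALTY = 100   # added for each letter/digit glued to the username
MISSING_PENALTY = 1000   # added when the username is not in the URL at all


def junk_score(url: str, username: str) -> int:
    """Return URL length plus penalties for symbols next to the username."""
    score = len(url)
    lowered = url.lower()
    name = username.lower()

    pos = lowered.find(name)
    if pos == -1:
        return score + MISSING_PENALTY

    end = pos + len(name)
    before = lowered[pos - 1] if pos > 0 else ""
    after = lowered[end] if end < len(lowered) else ""

    # A letter or digit right next to the username suggests it is only
    # a fragment of a longer word, so the link is less likely a profile.
    for ch in (before, after):
        if ch.isalnum():
            score += ADJACENT_PENALTY
    return score


def is_junk(url: str, username: str, threshold: int = 300) -> bool:
    """Assumed rule: results scoring at or above the threshold are junk."""
    return junk_score(url, username) >= threshold
```

Under these assumptions a short profile link such as https://github.com/soxoj scores far below the default threshold, while a long URL in which the username is embedded in another word scores higher.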

You can also increase the number of results requested from search engines with the --results-count option (default 1000). Currently this limit applies only to Google.

Other options:

  -h, --help            show this help message and exit
  -t THRESHOLD, --threshold THRESHOLD
                        Threshold to discard junk search results
  --results-count RESULTS_COUNT
                        Count of results parsed from each search engine
  --no-url-filter       Disable filtering results by usernames in URLs

  --engines {baidu,dogpile,google,bing,ask,aol,torch,yandex,naver,paginated,yahoo,startpage,duckduckgo,qwant}
                        Engines to run (you can choose more than one)

  --plugins {socid_extractor,metadata,maigret} [{socid_extractor,metadata,maigret} ...]
                        Additional plugins to analyze links

  -v, --verbose         Display junk score for each result
  -d, --debug           Display all the results from sources and debug messages
  -l, --list            Display only list of all the URLs
  --proxy PROXY         Proxy string (e.g. https://user:PASSWORD@HOST:8080)
  --csv CSV             Save results to the CSV file
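
The --no-url-filter option implies that results are filtered by username in the URL by default. A minimal sketch of that kind of filter, combined with de-duplication, might look like the following. The function name and the case-insensitive substring match are assumptions for illustration, not Marple's actual code.

```python
def filter_by_username(urls, username):
    """Keep unique URLs that contain the username (case-insensitive)."""
    name = username.lower()
    seen = set()
    kept = []
    for url in urls:
        # Skip exact duplicates and URLs that never mention the username.
        if url in seen or name not in url.lower():
            continue
        seen.add(url)
        kept.append(url)
    return kept


links = [
    "https://t.me/soxoj",
    "https://example.com/about",
    "https://t.me/soxoj",
    "https://GitHub.com/Soxoj",
]
filtered = filter_by_username(links, "soxoj")
```

Input order is preserved, so the filtered list can be compared against the raw list when debugging.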

Supported sources

Name	Method	Requirements
Google	scraping	None, works out of the box; frequent captcha
DuckDuckGo	scraping	None, works out of the box
Yandex	XML API	Register and get YANDEX_USER/YANDEX_KEY tokens
Naver	SerpApi	Register and get SERPAPI_KEY token
Baidu	SerpApi	Register and get SERPAPI_KEY token
Aol	scraping	None, scrapes with pagination
Ask	scraping	None, scrapes with pagination
Bing	scraping	None, scrapes with pagination
Startpage	scraping	None, scrapes with pagination
Yahoo	scraping	None, scrapes with pagination
Mojeek	scraping	None, scrapes with pagination
Dogpile	scraping	None, scrapes with pagination
Torch	scraping	Tor proxies (socks5://localhost:9050 by default), scrapes with pagination
Qwant	Qwant API	Check whether search is available in the country of your exit IP; scrapes with pagination
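
Before starting a run, it can help to check that the tokens listed in the table are present in the environment. The sketch below uses only the variable names from the table (YANDEX_USER, YANDEX_KEY, SERPAPI_KEY). The mapping and the helper name are hypothetical and not part of Marple.

```python
import os

# Environment variables each engine needs, taken from the table above.
# Engines not listed here are assumed to need no keys.
REQUIRED_ENV = {
    "yandex": ("YANDEX_USER", "YANDEX_KEY"),
    "naver": ("SERPAPI_KEY",),
    "baidu": ("SERPAPI_KEY",),
}


def missing_keys(engines, env=None):
    """Return {engine: [missing variable names]} for the selected engines."""
    env = os.environ if env is None else env
    missing = {}
    for engine in engines:
        absent = [var for var in REQUIRED_ENV.get(engine, ()) if not env.get(var)]
        if absent:
            missing[engine] = absent
    return missing


if __name__ == "__main__":
    for engine, names in missing_keys(["google", "yandex", "baidu"]).items():
        print(f"{engine}: set {', '.join(names)} before running")
```

Passing env explicitly makes the check easy to test without touching the real environment.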

Development & testing

$ python3 -m pytest tests

TODO

Proxy support
Engine selection through arguments
Exact search filter
Engine-specific filters
'Username in title' check

Mentions and articles

Sector035 - Week in OSINT #2021-50

OS2INT - MARPLE: IDENTIFYING AND EXTRACTING SOCIAL MEDIA USER LINKS

Open Source Agenda is not affiliated with "Marple" Project. README Source: soxoj/marple

Repository

soxoj/marple

License

MIT
