Datajournalism Resources

datajournalism-resources

A compilation of links to datajournalism & OSINT tools, guides and resources I find useful to keep at hand. PRs welcomed!

by r3mlab | License: CC-BY-NC 4.0

Legend:

  • :globe_with_meridians: = online tool/service/database
  • :computer: = software
  • :book: = guide/tutorial
  • :pencil: = list of tools/resources
  • :snake: = Python module
  • ? = paid or paid-only tool/service

Contents

APIs

  • Postman :computer: - API development environment offering useful tools for crafting and debugging API requests.
  • ProgrammableWeb :pencil: - A good API directory.
  • Public APIs :pencil: - A categorized list of APIs.

Archival

Breached Data

  • Breach Data Search Engines Comparison :pencil: (IntelTechniques)
  • CardPwn :computer: - Find out if a credit card number appears in a breach.
  • Dehashed :globe_with_meridians:? - Find cleartext & hashed passwords from data breaches (paid, $4/week, $11/mo).
  • GhostProject :globe_with_meridians: - Check if an email appears in a breach. Shows the first 3 characters of the password for free.
  • h8mail :computer: - Find passwords through different breach and reconnaissance services. Can also search the BreachedCompilation torrent.
  • Have I Been Pwned? :globe_with_meridians: - Check if an email appears in a breach, set up alerts.
  • pwndb.py :computer: - Command-line tool for searching leaked credentials using the Onion service with the same name.
  • WhatBreach :computer: - Search for breached emails and their corresponding database.

Companies

  • CompaniesHouse Short Guide :book: (Bellingcat) - A guide about the UK online company registry.
  • DocumentCloud Search :globe_with_meridians: - Search public documents uploaded to DocumentCloud, a publishing platform used by many journalists and media outlets.
  • ICIJ's Offshore Leaks Database :globe_with_meridians: - Data on offshore companies, foundations and trusts from the Panama Papers, the Offshore Leaks, the Bahamas Leaks and the Paradise Papers investigations.
  • List of company registers :pencil: (Wikipedia) - A list of company registers, by country.
  • OCCRP Data :globe_with_meridians: - Fantastic search tool & resources made available by OCCRP. Public records, leaks, scraped business registers, and more.
  • OCCRP Investigative Dashboard :pencil: - Collection of the most useful public data sources for investigative reporting. Many business registries listed.
  • OpenCorporates :globe_with_meridians: - A very comprehensive companies database. Has an API.
  • Open Ownership Register :globe_with_meridians: - Explore beneficial ownership data. Aggregates many datasets.

Data Analysis & Manipulation

See also: Visualization

  • csvkit :computer: - A suite of command-line tools for converting to and working with CSV files.
  • OpenRefine :computer: - Clean & transform messy data.
  • pandas :snake: - Powerful Python data analysis library. Best used in a Jupyter notebook.

Email

See also: Breached Data

  • emailrep.io :globe_with_meridians: - Public email reputation search & API. Can find social media profiles.
  • Infoga :computer: - Gather email account information (IP, hostname, country, etc.) from different public sources.
  • theHarvester :computer: - Python command-line tool that queries several search engines for email addresses from a particular domain.
  • The most complete guide to finding anyone's email :book: (Blurbiz)
  • Trumail :globe_with_meridians: - Free email verification API.

Lists of tools & resources

Location, Maps, Satellite Imagery

Interpretation

Mapping services & software

Tools & techniques

User generated content

See also: Social Networks

  • EchoSec :globe_with_meridians:? - Search and analyze social media data based on location. ($499/mo)
  • GeoCreepy :computer: - Geolocation information gathering through social networking platforms (discontinued).
  • Kamerka :computer: - Create an interactive map of cameras, printers, tweets and photos based on your coordinates.
  • OpenStreetMap :globe_with_meridians: - User generated locations & maps. Use taginfo and/or overpass-turbo.eu to search a location by key/value tags (see OSM's Wiki).
  • Mapillary :globe_with_meridians: - Interactive map of crowdsourced geotagged photographs.
  • OpenStreetCam :globe_with_meridians: - Map of crowdsourced street-level photographs.
  • Social networks (see category)
  • Surveillance under Surveillance :globe_with_meridians: - User-contributed map of cameras and guards.
  • Tourism & review websites: Foursquare, TripAdvisor, Yelp, etc. :globe_with_meridians:
  • Vkontakte :globe_with_meridians: - Use near:<coordinates> in a search.
  • Wikimapia :globe_with_meridians: - User-generated locations & descriptions. Has an API.
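
As a rough illustration of the key/value tag search mentioned in the OpenStreetMap entry, here is a minimal sketch that builds an Overpass QL query for nodes carrying a given tag near a point. The helper names, the `man_made=surveillance` tag, the coordinates, the 500 m radius and the choice of the `overpass-api.de` endpoint are all assumptions made for the example; the script only builds the query and URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumption: one public Overpass API instance; other instances exist.
OVERPASS_ENDPOINT = "https://overpass-api.de/api/interpreter"


def build_overpass_query(key, value, lat, lon, radius_m=500):
    """Return an Overpass QL query for nodes tagged key=value near a point.

    Hypothetical helper written for this example; it is not part of any library.
    """
    return (
        "[out:json][timeout:25];"
        f'node["{key}"="{value}"](around:{radius_m},{lat},{lon});'
        "out body;"
    )


def build_overpass_url(query):
    """Return a GET URL that passes the query in the `data` parameter."""
    return f"{OVERPASS_ENDPOINT}?{urlencode({'data': query})}"


if __name__ == "__main__":
    # Example values only: surveillance cameras within 500 m of central Paris.
    query = build_overpass_query("man_made", "surveillance", 48.8566, 2.3522)
    print(query)
    print(build_overpass_url(query))
```

The printed query can also be pasted into overpass-turbo.eu to preview the results on a map before fetching them programmatically.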

Military/Weapons

Multi-purpose tools

  • Buscador :computer: - A very handy VM with plenty of pre-installed & pre-configured OSINT tools.
  • DataSploit :computer: - A collection of Python scripts that automate open source intelligence searches about domain names, email addresses, IP addresses and usernames.
  • IntelligenceX Tools :globe_with_meridians: - Various search, email and domain tools.
  • Maltego CE :computer: - Interactive data mining & mapping tool.
  • Spiderfoot :computer: - Open source intelligence automation tool. Gathers intelligence about a given target, which may be an IP address, domain name, hostname, network subnet, ASN, e-mail address or person's name.

News

  • AllYouCanRead :pencil: - Database of news outlets by country.
  • NewsLookup :globe_with_meridians: - News search engine with useful filters.
  • NewsNow :globe_with_meridians: - News search engine with useful filters.
  • NewspaperMap :globe_with_meridians: - Newspapers world map with feeds and automatic translation.

Phone numbers

Pictures, Photos, Videos

Pictures Metadata

  • Bing Images :globe_with_meridians: - Can search part of an image by resizing on the fly.
  • CitizenEvidence :globe_with_meridians: - Google Images reverse search on YouTube thumbnails.
  • EagleEye :computer: - Find Instagram, FB and Twitter profiles using image recognition and reverse image search.
  • Google Images :globe_with_meridians:
  • Search by Image :computer: - Browser extension to quickly reverse-search an image on 20+ search engines.
  • TinEye :globe_with_meridians:
  • Yandex Images :globe_with_meridians:
  • How to Conduct Comprehensive Video Collection (Bellingcat) :book:
  • PimEyes :globe_with_meridians: - Face-recognition matching search engine.
  • SearchFace.ru :globe_with_meridians: - Face recognition search engine for the Russian VK social network. See this guide from Bellingcat for a tutorial.
  • SocialMapper :globe_with_meridians: - Social Media Mapping Tool that correlates profiles via facial recognition. Supports LinkedIn, Facebook, Twitter, Instagram, VKontakte, Weibo, Douban.

Verification & Analysis

Social Networks

All/General

  • EagleEye :computer: - Find Instagram, FB and Twitter profiles using image recognition and reverse image search.
  • HashAtIt :globe_with_meridians: - Hashtag search across Twitter, Instagram, Pinterest, Facebook and YouTube.
  • Sherlock :computer: - Search for a username across 135 social media sites.
  • SocialMapper :globe_with_meridians: - Social Media Mapping Tool that correlates profiles via facial recognition. Supports LinkedIn, Facebook, Twitter, Instagram, VKontakte, Weibo, Douban.
  • WhatsMyName :computer: - Search for usernames on 180+ web sites.

Discord

  • dis.cool :globe_with_meridians: - Discord search engine.

Facebook

  • fb-search :globe_with_meridians: - Simple Graph query crafter. Made after Facebook's sudden closure of Graph Search.
  • FFFF Finds Facebook Friends :computer: - Builds a relationship graph of a target user. Partially reconstructs hidden friend lists. :fire:

Github

  • gitrob :computer: - Find potentially sensitive files pushed to public repositories on GitHub. Requires a GitHub access token.
  • Zen :computer: - Find emails of GitHub users.

Instagram

  • instaloader :computer: - Download pictures (or videos) along with their captions and other metadata from Instagram.
  • instagram-scraper :computer: - Scrape a user's photos and videos.
  • searchmybio :globe_with_meridians: - Search Instagram users' biographies.

Linkedin

Reddit

  • Reddit Comment Search :globe_with_meridians: - Search through the comments of a particular Reddit user.
  • Reddit Insight :globe_with_meridians: - Collect info on a Reddit profile, list all posts & comments.
  • Reddit Investigator :globe_with_meridians: - Collect info on a Reddit profile.
  • Reddit Search :globe_with_meridians: - Reddit search engine with filters.
  • ReSavr :globe_with_meridians: - Search deleted comments.

Snapchat

  • Snapdex :globe_with_meridians: - Searchable database of Snapchat usernames.
  • Snap Map :globe_with_meridians: - Official Snapchat map.

Telegram

  • Buzz.im :globe_with_meridians: - Search in open Telegram messages.
  • Lyzem :globe_with_meridians: - Telegram search engine.
  • Telegago :globe_with_meridians: - Google Custom Search Engine for Telegram users & content. Can discover private groups.
  • tlgrm.eu :globe_with_meridians: - Search for Telegram channels.
  • tgstat.ru :globe_with_meridians: - Telegram analytics & search tool.

Twitter

  • DMI-TCAT :computer: - PHP web interface to retrieve and analyze tweets.
  • SocialBearing :globe_with_meridians: - Statistics on keywords, hashtags, users.
  • SpoonBill :globe_with_meridians: - Track changes in Twitter profiles & bios. Requires a Twitter account.
  • tinfoleak :computer: - Very complete open-source tool for Twitter intelligence analysis. Needs API credentials.
  • twarc :computer::snake: - A command line tool and Python library for archiving Twitter in JSON format.
  • Tweetdeck :globe_with_meridians:
  • Tweetdeck Location Search Tutorial :book:
  • Tweet Map :globe_with_meridians: - Explore the world and find geo-tagged tweets.
  • Tweets Analyzer :computer: - Twitter profile analyzer with tweet activity charts, locations, most used hashtags, etc. Can save tweets to JSON. Requires a Twitter API key.
  • tweetsmapper :computer: - Generates a Leaflet map for a given user or from an existing collection of tweets. Can retrieve full timelines.
  • TWINT (Twitter Intelligence Tool) :computer: - Advanced Twitter scraping tool, no API key needed. Can export to text, CSV, JSON, SQLite, Elasticsearch. Can detect emails, phone numbers, profiles.
  • Who Tweeted It First? :globe_with_meridians: - Find out who was the first person who tweeted a link, video, quote or any piece of text.

VKontakte

  • SnRadar :globe_with_meridians: - Search VKontakte content by location.

Youtube

  • Unlisted Videos :globe_with_meridians: - Search & submit unlisted YouTube videos. No registration required.

Text & Documents

Documents metadata

  • Apache Tika :computer: - Extract metadata and text from over a thousand different file types.
  • FOCA :globe_with_meridians::computer: - Find metadata and hidden information in Microsoft Office, Open Office, or PDF files.
  • ICIJ Extract :computer: - A command line tool for parallelized, distributed content-extraction.

Indexing & searching

  • Aleph :computer: - A toolkit for data search, management and analysis in investigative reporting.
  • Blacklight :computer: - Open source Solr user interface discovery platform.
  • Datashare :computer: - Index & search documents on your computer, automatically detect people, organizations and locations with NLP.
  • DumpsterDiver :computer: - Analyze big volumes of various file types in search of secrets, credentials, etc.
  • ICIJ Extract :computer: - A command line tool for parallelized, distributed content-extraction.
  • searchbox :computer: - A simple out-of-the-box web interface to search through thousands of unstructured documents using Solr.

OCR

  • NewOCR.com :globe_with_meridians: - Recognizes several languages. Can resize images & has shortcuts to Google & Bing Translate.
  • Tesseract :computer: - Open-source OCR engine.

PDF

  • PDF Text Extraction with PyPDF2, Tika & PDF Miner. :computer:
  • tabula :computer: - Tool for liberating data tables trapped inside PDF files.

Text Processing & Analysis

  • topia :snake: - Python module to determine important terms within a given piece of content.
  • TXM :computer: - Lexicometry and text statistical analysis for large bodies of text.

Transportation

Containers & Shipments

  • BIC Code Register :globe_with_meridians: - Business Identifier Codes lookup. The website also has other search tools and useful information on container markings.
  • Prefix List :globe_with_meridians: - Find the owner of a container from its prefix.
  • track-trace :globe_with_meridians: - Track parcels/shipments, air cargo, containers and post.

Planes

Ships

Visualization

Graphs

  • Data Visualisation Catalogue :book: - Find which visualisation is right for what you want to show. Plenty of tips & resources.
  • DataWrapper :globe_with_meridians:? - Easy to use graph & map tool. Free plan available.
  • Google Fusion Tables - Create maps & charts from data. Scheduled to shut down in Dec. 2019.
  • Matplotlib :snake: - Python 2D plotting library. Best used with pandas in a Jupyter notebook.
  • RawGraph :globe_with_meridians::computer: - Generate static graphs through a very user-friendly interface. Can be run locally.

Maps

  • ArcGIS :computer:? - Mapping & analysis software (proprietary, paid, 21-day trial).
  • Folium :snake: - Python library to create Leaflet.js maps. Can be used in a Jupyter Notebook to map data from pandas.
  • Geopy :snake: - Python geocoding library. Supports OSM Nominatim, Google, Bing, GeoNames & many more.
  • Google:
  • Humanitarian Data Exchange :globe_with_meridians: - Useful resources of shapefiles, especially for administrative boundaries.
  • KML Interactive Sampler :globe_with_meridians: - Lots of KML templates.
  • QGIS :computer: - Free & open-source alternative to ArcGIS.

Mindmaps & Network graphs

Timelines

  • Tik Tok :computer: - Javascript tool to easily create simple, mobile-friendly, vertical timelines. Open-source.
  • TimelineJS :computer:

Weather

  • timeanddate.com :globe_with_meridians: - Weather history.
  • Ventusky :globe_with_meridians: - Live & past wind, rain and temperature maps.
  • Wolfram Alpha :globe_with_meridians: - Weather history. Example query: "What was the weather in New York on January 1st 2017?"
  • Wunderground History :globe_with_meridians: - Weather history.

Websites

See also: Archival

Dark Web & Onion services

Scraping

Misc

  • awesome-selfhosted :pencil: - A list of Free Software network services and web applications which can be hosted locally.
  • grayhatwarfare :globe_with_meridians: - Search the contents of open Amazon S3 buckets.
  • Shodan :globe_with_meridians: - Internet of Things search engine.
  • World License Plates :globe_with_meridians: - Pictures of license plates from all around the world.

License

This list is under the Creative Commons Attribution-NonCommercial 4.0 International Public License.

README source: r3mlab/datajournalism-resources