🕸️ Scrape Facebook group post permalinks
...with puppeteer and MutationObserver
This is a one-night (now three-night) hack that I used to scrape 8K+ permalink ids from a secret Facebook group we use to share photos with family. It became an annoyance that there was no way to search posts by date, and manually scrolling back over 2-3 years is not an option.
See api.js, which I have yet to document here but which is pretty straightforward. You just need the numeric group id and an access token. (See also the Graph API Explorer.)
The API also supports specifying date ranges as UNIX timestamps (e.g. ?since=1420070400&until=1430070400), so there's no need to paginate through the whole feed to get to dates years ago.
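As a rough illustration of the date-range query, here is a minimal sketch that turns two dates into UNIX timestamps and builds a feed URL. The helper names (toUnixSeconds, buildFeedUrl), the unversioned graph.facebook.com host and the /feed edge are assumptions made for this example; api.js may build its requests differently.

```javascript
// Convert a Date (or anything the Date constructor accepts) to UNIX seconds.
function toUnixSeconds(date) {
  return Math.floor(new Date(date).getTime() / 1000);
}

// Build a Graph API feed URL restricted to a date range.
// Assumption: unversioned host and the /feed edge; both names here are
// hypothetical helpers, not taken from api.js.
function buildFeedUrl(groupId, accessToken, since, until) {
  const url = new URL(`https://graph.facebook.com/${groupId}/feed`);
  url.searchParams.set('since', String(toUnixSeconds(since)));
  url.searchParams.set('until', String(toUnixSeconds(until)));
  url.searchParams.set('access_token', accessToken);
  return url.toString();
}

// Example: all posts from January 2015 (group id and token are placeholders).
const exampleUrl = buildFeedUrl(
  '123',
  'TOKEN',
  '2015-01-01T00:00:00Z',
  '2015-02-01T00:00:00Z'
);
console.log(exampleUrl);
```

Passing the result to any HTTP client should return only the posts inside that window, which avoids walking the feed page by page.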
PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true yarn
Setting PUPPETEER_SKIP_CHROMIUM_DOWNLOAD skips downloading Chromium, since we'll use your default Chrome anyway.
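With no bundled Chromium, puppeteer has to be told where Chrome lives, which is what its executablePath launch option is for. The sketch below picks a path per platform. The function name, the CHROME_PATH override variable and the listed install locations are assumptions (typical defaults, not guaranteed on your machine), and index.js may locate Chrome differently.

```javascript
// Typical Chrome install locations per platform (assumed defaults;
// adjust to match the local installation).
const DEFAULT_CHROME_PATHS = {
  darwin: '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
  linux: '/usr/bin/google-chrome',
  win32: 'C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe',
};

// Build launch options for puppeteer.launch().
// CHROME_PATH is a hypothetical override variable used only in this sketch.
function chromeLaunchOptions(platform = process.platform, env = process.env) {
  const executablePath = env.CHROME_PATH || DEFAULT_CHROME_PATHS[platform];
  if (!executablePath) {
    throw new Error(`No known Chrome path for platform "${platform}"`);
  }
  return { executablePath };
}

// Usage (requires puppeteer to be installed):
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch(chromeLaunchOptions());
```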
node index.js <groupid> | tee permalinks.csv
<ISO date>, <facebook post URL>
(one per line). This script: