A selector-based html snapshot tool using Puppeteer or PhantomJS that sources sitemap.xml, sitemap-index, robots.txt, or arbitrary input
Breaking Changes:
Other updates:
Drops support for Node 14 and updates dependencies.
Added puppeteer as the default browser.
Added `browser` option to allow selection of "phantomjs".
Added `debug` option to run puppeteer headed with devtools.
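As a rough illustration of the two new options, the sketch below assembles an options object of the kind that could be handed to the library. Only `browser` (with the value "phantomjs") and `debug` come from these notes. The `input`, `source`, `outputDir`, and `selector` entries, their values, the boolean type of `debug`, and the helper name `buildSnapshotOptions` are assumptions made for illustration, so the readme remains the reference for exact shapes.

```javascript
// Hypothetical helper: assembles an options object for html-snapshots.
// Only `browser` and `debug` are taken from the release notes; every other
// key and value here is an illustrative assumption.
function buildSnapshotOptions({ usePhantom = false, debug = false } = {}) {
  const options = {
    input: "sitemap",                          // assumed input type
    source: "https://example.com/sitemap.xml", // placeholder URL
    outputDir: "./snapshots",                  // placeholder path
    selector: "#content"                       // placeholder selector
  };
  if (usePhantom) {
    // Leave `browser` unset to keep the puppeteer default.
    options.browser = "phantomjs";
  }
  if (debug) {
    // Assumed to be a boolean; runs puppeteer headed with devtools.
    options.debug = true;
  }
  return options;
}

console.log(buildSnapshotOptions({ debug: true }));
```

Leaving `browser` out of the object is what selects puppeteer in this sketch, which matches the note that puppeteer is now the default.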
Dropped `request` in favor of the `got` library.
`robots` input types now search for `Sitemap` directives and favor those over any other information in the robots.txt file. If `Sitemap` directives are found, those alone are used to drive crawling of the site. If no `Sitemap` directives are found, it will fall back to `Allow` directives as in previous versions.
Supports Node 14+ (dropped 10 & 12). Select dependency update, copyright 2022.
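The robots.txt rule described in the notes above (use `Sitemap` directives alone when any are present, otherwise fall back to `Allow` directives) can be sketched as a small standalone function. This is not the library's implementation; the function name `pickRobotsSources`, the return shape, and the simplistic comment stripping are all assumptions for illustration.

```javascript
// Sketch of the selection rule only: prefer Sitemap directives,
// fall back to Allow directives when no Sitemap directive exists.
// Hypothetical helper, not part of html-snapshots.
function pickRobotsSources(robotsTxt) {
  const sitemaps = [];
  const allows = [];
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    // Naive comment stripping: drops everything after the first "#".
    const line = rawLine.replace(/#.*$/, "").trim();
    const match = /^(sitemap|allow)\s*:\s*(.+)$/i.exec(line);
    if (!match) continue;
    const directive = match[1].toLowerCase();
    const value = match[2].trim();
    (directive === "sitemap" ? sitemaps : allows).push(value);
  }
  return sitemaps.length > 0
    ? { kind: "sitemap", values: sitemaps }
    : { kind: "allow", values: allows };
}

const example = [
  "User-agent: *",
  "Allow: /",
  "Disallow: /admin",
  "Sitemap: https://example.com/sitemap.xml"
].join("\n");

console.log(pickRobotsSources(example));
```

With the sample input, the `Allow: /` line is ignored because a `Sitemap` directive is present. Remove the last line and the function returns the `Allow` paths instead.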
This is a release with major and breaking changes.
The readme has the scoop, but here's the TL;DR:
The `run` method now returns a Promise.
If a callback is passed to the `run` method, the `run` Promise will also resolve. The callback argument is deprecated and may be removed at some future date.
A Promise failure handler will receive an Error instance that contains all of the errors that have occurred. The Error instance also has additional properties that contain useful information:
`completed` - An array of the file output paths that actually completed and were written to storage.
`notCompleted` - An array of the file output paths that did not complete and were not written to storage.
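To show how a failure handler might read those properties, here is a minimal sketch. It does not call the library: `fakeRun` is a hypothetical stand-in that rejects the way the notes describe, and `summarizeFailure`, the file paths, and the error message are likewise invented for illustration. What the real Promise resolves with on success is not stated here, so the success branch just logs whatever it receives.

```javascript
// Hypothetical stand-in for the library's run(); it is NOT the real API.
// It rejects with an Error carrying `completed` and `notCompleted` arrays,
// as the release notes describe.
function fakeRun() {
  const err = new Error("snapshot for /about timed out");
  err.completed = ["snapshots/index.html"];
  err.notCompleted = ["snapshots/about/index.html"];
  return Promise.reject(err);
}

// Turns a rejection into a plain summary object, tolerating errors
// that lack the extra properties.
function summarizeFailure(err) {
  return {
    message: err.message,
    completed: err.completed || [],
    notCompleted: err.notCompleted || []
  };
}

fakeRun()
  .then(result => console.log("all snapshots written:", result))
  .catch(err => {
    const summary = summarizeFailure(err);
    console.error(`failed: ${summary.message}`);
    console.error(
      `written: ${summary.completed.length}, missing: ${summary.notCompleted.length}`
    );
  });
```

Reading `notCompleted` in the handler gives a natural list of paths to retry, while `completed` identifies output that is already safely on disk.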