Distributed crawler powered by Headless Chrome
- Pass `previousUrl` to the `onSuccess` argument.
- Pass `options`, `depth` and `previousUrl` to errors.
- Support `customCrawl` for `HCCrawler.connect()` and `HCCrawler.launch()`'s options.
- Emit the `newpage` event.
- Make the `requestfinished` event's argument as described in the API reference.
- Support `cookies` for `crawler.queue()`'s options.
- Make `onSuccess` pass `cookies` in the response.
- Support `viewport` and `skipRequestedRedirect` for `crawler.queue()`'s options.
- Emit the `requestdisallowed` event.
- Make `onSuccess` pass `redirectChain` in the response.
- Support `waitFor` for `crawler.queue()`'s options.
- Support `slowMo` for `HCCrawler.connect()`'s options.
- Support the `timeout` option per request.
- Emit the `newpage` event.
- Support `deniedDomains` and `depthPriority` for `crawler.queue()`'s options.
- Allow the `allowedDomains` option to accept a list of regular expressions.
- Support `followSitemapXml` for `crawler.queue()`'s options.
- Emit the `requestretried` event.
- Use the `cache` option not only for remembering already requested URLs but also as a request queue for distributed environments.
- Move the `onSuccess`, `onError` and `maxDepth` options from `HCCrawler.connect()` and `HCCrawler.launch()` to `crawler.queue()`.