Scrapy+Splash for JavaScript integration
Removed official support for Python 2.7, 3.4, 3.5 and 3.6, and added official support for Python 3.9, 3.10 and 3.11.
Deprecated SplashJsonResponse.body_as_unicode(), to be replaced by SplashJsonResponse.text.
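A minimal migration sketch for code that must run against both old and new response objects. The helper name get_response_text and the two stand-in classes are hypothetical and exist only for illustration; in most code it should be enough to replace response.body_as_unicode() with response.text directly.

```python
def get_response_text(response):
    """Return the decoded body, preferring .text over body_as_unicode()."""
    text = getattr(response, "text", None)
    if text is not None:
        return text
    # Fallback for response objects that only offer the deprecated method.
    return response.body_as_unicode()


# Hypothetical stand-ins, not real scrapy-splash classes.
class LegacyResponse:
    def body_as_unicode(self):
        return "legacy body"


class ModernResponse:
    text = "modern body"


legacy_text = get_response_text(LegacyResponse())
modern_text = get_response_text(ModernResponse())
```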
Removed calls to the obsolete to_native_str function, which was removed in Scrapy 2.8.
Security bug fix:
If you use HttpAuthMiddleware (i.e. the http_user and http_pass spider attributes) for Splash authentication, any non-Splash request will expose your credentials to the request target. This includes robots.txt requests sent by Scrapy when the ROBOTSTXT_OBEY setting is set to True.
Use the new SPLASH_USER and SPLASH_PASS settings instead to set your Splash authentication credentials safely.
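A minimal sketch of what the corresponding settings.py fragment might look like. SPLASH_USER and SPLASH_PASS are the settings named above and SPLASH_URL is the existing scrapy-splash setting for the Splash endpoint; the URL value and the environment variable names SPLASH_USERNAME and SPLASH_PASSWORD are assumptions chosen for illustration.

```python
import os

# Address of the Splash instance (assumed local address; adjust as needed).
SPLASH_URL = "http://localhost:8050"

# Splash credentials, read here from hypothetical environment variables
# so that they are not hard-coded in the project.
SPLASH_USER = os.environ.get("SPLASH_USERNAME", "")
SPLASH_PASS = os.environ.get("SPLASH_PASSWORD", "")
```

With credentials set this way, the http_user and http_pass spider attributes should no longer be needed for Splash authentication.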
Responses now expose the HTTP status code and headers from Splash as response.splash_response_status and response.splash_response_headers (#158)
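As a rough illustration, the two attributes could be read in a callback as sketched below. The helper summarize_splash_response and the stand-in object are hypothetical; a real callback would receive a scrapy-splash response, whose headers would be a Scrapy headers object rather than the plain dict used here.

```python
from types import SimpleNamespace


def summarize_splash_response(response):
    """Collect the Splash-level status and headers next to response.status."""
    return {
        "status": response.status,
        # Fall back to None on responses that lack the Splash attributes.
        "splash_status": getattr(response, "splash_response_status", None),
        "splash_headers": getattr(response, "splash_response_headers", None),
    }


# Stand-in object for demonstration only, not a real scrapy-splash response.
fake_response = SimpleNamespace(
    status=404,
    splash_response_status=200,
    splash_response_headers={"Content-Type": "application/json"},
)
summary = summarize_splash_response(fake_response)
```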
The meta argument passed to the scrapy_splash.request.SplashRequest constructor is no longer modified (#164)
Website responses with 400 or 498 as HTTP status code are no longer handled as the equivalent Splash responses (#158)
Cookies are no longer sent to Splash itself (#156)
scrapy_splash.utils.dict_hash now also works with obj=None (225793b)
Our test suite now includes integration tests (#156) and tests can be run in parallel (6fb8c41)
There’s a new ‘Getting help’ section in the README.rst file (#161, #162), the documentation about SPLASH_SLOT_POLICY has been improved (#157) and a typo has been fixed (#121)
Made some internal improvements (ee5000d, 25de545, 2aaa79d)