Minimalist Requests wrapper to work within the rate limits of any number of services simultaneously. Parallel processing friendly.
If you know Python, you know Requests. Requests is love. Requests is life. Depending on your use cases, you may come across scenarios where you need to use Requests a lot. Services you consume may have rate-limiting policies in place or you may just happen to be in a good mood and feel like being a good Netizen. This is where requests-respectful can come in handy.
requests-respectful:
Typical requests call
import requests
response = requests.get("http://github.com", params={"foo": "bar"})
Magic requests-respectful call - requests verb methods are proxied!
from requests_respectful import RespectfulRequester
rr = RespectfulRequester()
# This can be done elsewhere but the realm needs to be registered!
rr.register_realm("Github", max_requests=100, timespan=60)
response = rr.get("http://github.com", params={"foo": "bar"}, realms=["Github"], wait=True)
Conservative requests-respectful call - pass a lambda with a requests method call
import requests
from requests_respectful import RespectfulRequester
rr = RespectfulRequester()
# This can be done elsewhere but the realm needs to be registered!
rr.register_realm("Github", max_requests=100, timespan=60)
request_func = lambda: requests.get("http://github.com", params={"foo": "bar"})
response = rr.request(request_func, realms=["Github"], wait=True)
pip install requests-respectful
{
"redis": {
"host": "localhost",
"port": 6379,
"database": 0
},
"safety_threshold": 10,
"requests_module_name": "requests"
}
The redis key provides the host, port and database of the Redis instance.
The library auto-detects the presence of a YAML file named requests-respectful.config.yml at the root of your project and will attempt to load configuration values from it.
Example:
requests-respectful.config.yml
redis:
  host: 0.0.0.0
  port: 6379
  database: 5
safety_threshold: 25
If you don't like having an extra file lying around, the library can also be configured at runtime using the configure() class method.
RespectfulRequester.configure(
redis={"host": "0.0.0.0", "port": 6379, "database": 5},
safety_threshold=25
)
In both cases, the resulting active configuration would be:
RespectfulRequester._config()
Out[1]: {
"redis": {
"host": "0.0.0.0",
"port": 6379,
"database": 5
},
"safety_threshold": 25,
"requests_module_name": "requests"
}
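One way to picture how user-supplied values end up on top of the defaults is a one-level-deep dictionary merge. This is a minimal sketch of that idea only, not the library's own loading code; `DEFAULT_CONFIG` and `merge_config` are hypothetical names introduced for illustration.

```python
import copy

# Default values as listed earlier in this document.
DEFAULT_CONFIG = {
    "redis": {"host": "localhost", "port": 6379, "database": 0},
    "safety_threshold": 10,
    "requests_module_name": "requests",
}


def merge_config(defaults, overrides):
    """Return a new dict: defaults updated with overrides, one nested level deep.

    Hypothetical helper, not part of requests-respectful.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key].update(value)  # merge nested sections such as "redis"
        else:
            merged[key] = value
    return merged


active = merge_config(
    DEFAULT_CONFIG,
    {"redis": {"host": "0.0.0.0", "port": 6379, "database": 5}, "safety_threshold": 25},
)
print(active)
```

Keys that are not overridden, such as requests_module_name, keep their default value, which matches the active configuration shown above.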
In your quest to use requests-respectful, you should only ever have to bother with one class: RespectfulRequester. Instantiate this class and you can perform all important operations.
Before each example, it is assumed that the following code has already been executed.
from requests_respectful import RespectfulRequester
rr = RespectfulRequester()
Realms are simply named containers that are provided with a maximum requesting rate. You are responsible for managing (i.e. CRUD) your realms.
Realms track the HTTP requests that are performed under them and will raise a catchable rate limit exception if you are over their allowed requesting rate.
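To make the idea of a realm concrete, here is a minimal in-memory sketch of a named container that counts requests in a sliding window and raises once the maximum rate is reached. It is an illustration under assumptions, not the library's implementation: the real library keeps its state in Redis, and `InMemoryRealm` and `RateLimitedError` are hypothetical names.

```python
import time
from collections import deque


class RateLimitedError(Exception):
    """Stand-in for the library's rate limit exception (hypothetical)."""


class InMemoryRealm:
    """Hypothetical sliding-window counter for a single process."""

    def __init__(self, name, max_requests, timespan, clock=time.monotonic):
        self.name = name
        self.max_requests = max_requests
        self.timespan = timespan
        self._clock = clock
        self._timestamps = deque()  # times of requests still inside the window

    def _expire(self):
        # Drop timestamps that have fallen out of the window.
        cutoff = self._clock() - self.timespan
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()

    def record_request(self):
        """Count one request, or raise if the realm is at its maximum rate."""
        self._expire()
        if len(self._timestamps) >= self.max_requests:
            raise RateLimitedError(f"realm {self.name!r} is rate-limited")
        self._timestamps.append(self._clock())


realm = InMemoryRealm("Google", max_requests=2, timespan=1)
realm.record_request()
realm.record_request()
try:
    realm.record_request()  # third request inside the same second
except RateLimitedError as error:
    print(error)
```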
rr.fetch_registered_realms()
This returns a list of currently registered realm names.
rr.register_realm("Google", max_requests=10, timespan=1)
rr.register_realm("Github", max_requests=100, timespan=60)
rr.register_realm("Twitter", max_requests=150, timespan=300)
# OR
realm_tuples = [
["Google", 10, 1],
["Github", 100, 60],
["Twitter", 150, 300]
]
rr.register_realms(realm_tuples)
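Either of these registers 3 realms: Google (10 requests per second), Github (100 requests per 60 seconds) and Twitter (150 requests per 300 seconds).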
rr.update_realm("Google", max_requests=25, timespan=5)
This updates the maximum requesting rate of Google to 25 requests per 5 seconds.
rr.realm_max_requests("Google")
This would return 25.
rr.realm_timespan("Google")
This would return 5.
rr.unregister_realm("Google")
This would unregister the Google realm, preventing further queries from executing on it.
rr.unregister_realms(["Google", "Github", "Twitter"])
This would unregister all 3 realms in one operation, preventing further queries from executing on them.
The library supports proxying calls to the 7 Requests HTTP verb methods (DELETE, GET, HEAD, OPTIONS, PATCH, POST, PUT). Each proxied call is passed through to the corresponding Requests method, so go crazy with your params, body, headers, auth etc. kwargs. The only major difference is that a realms kwarg is expected. A wait boolean kwarg can also be provided (the behavior is explained later).
These are all valid calls:
rr.get("http://httpbin.org", realms=["HTTPBin"])
rr.post('http://httpbin.org/post', data = {'key':'value'}, realms=["HTTPBin"], wait=True)
rr.put('http://httpbin.org/put', data = {'key':'value'}, realms=["HTTPBin"])
rr.delete('http://httpbin.org/delete', realms=["HTTPBin"])
If not rate-limited, these would return your usual requests.Response object.
If you are a purist and prefer not using fancy proxying, you are also allowed to create a lambda of your Requests call and pass it to the request() instance method.
request_func = lambda: requests.post('http://httpbin.org/post', data = {'key':'value'})
rr.request(request_func, realms=["HTTPBin"], wait=True)
If not rate-limited, this would return your usual requests.Response object.
Starting in 0.2.0, you can have a single request count against multiple realms. The kwarg has been changed from realm to realms and works as you would expect it to.
rr.get("http://httpbin.org", realms=["HTTPBin", "HTTPBinUser123", "HTTPBinServer3"])
The kwarg realm has been deprecated on requesting instance methods. It will still work with a warning until 0.3.0.
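One plausible reading of "a single request counts against multiple realms" is that the request goes ahead only when every listed realm still has room, and is then counted once in each. The sketch below illustrates that reading with plain dictionaries; `try_count_request`, `usage` and `limits` are hypothetical names, and the library's actual bookkeeping may differ.

```python
def try_count_request(usage, limits, realms):
    """Count one request against every realm, or against none if any is full.

    usage:  dict of realm name -> requests already counted in the current window
    limits: dict of realm name -> maximum requests allowed in that window
    Returns (allowed, names_of_blocking_realms).
    """
    blocked = [name for name in realms if usage.get(name, 0) >= limits[name]]
    if blocked:
        return False, blocked
    for name in realms:
        usage[name] = usage.get(name, 0) + 1
    return True, []


limits = {"HTTPBin": 100, "HTTPBinUser123": 2}
usage = {"HTTPBin": 40, "HTTPBinUser123": 2}

# Only the service-wide realm is involved, and it has room.
print(try_count_request(usage, limits, ["HTTPBin"]))
# The per-user realm is already full, so nothing is counted.
print(try_count_request(usage, limits, ["HTTPBin", "HTTPBinUser123"]))
```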
Executing these calls will either return a requests.Response object with the results of the HTTP call or raise a RequestsRespectfulRateLimitedError exception. This means that you'll likely want to catch and handle that exception.
from requests_respectful import RequestsRespectfulRateLimitedError
try:
    response = rr.get("http://httpbin.org", realms=["HTTPBin"])
except RequestsRespectfulRateLimitedError:
    pass  # Possibly requeue that call or wait.
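The "requeue that call" option can be sketched with a simple queue of zero-argument callables: anything that gets rate-limited goes back in the queue for a later pass. So that the sketch runs on its own, `RateLimitedError` is a stand-in for RequestsRespectfulRateLimitedError, and `drain` is a hypothetical helper rather than part of the library.

```python
from collections import deque


class RateLimitedError(Exception):
    """Stand-in for RequestsRespectfulRateLimitedError so the sketch runs alone."""


def drain(pending, rate_limit_error=RateLimitedError, max_passes=3):
    """Run queued zero-argument callables; requeue the ones that get rate-limited."""
    results = []
    for _ in range(max_passes):
        retry_later = deque()
        while pending:
            call = pending.popleft()
            try:
                results.append(call())
            except rate_limit_error:
                retry_later.append(call)  # keep it for the next pass
        pending.extend(retry_later)
        if not pending:
            break
    return results


attempts = {"count": 0}


def flaky_call():
    # Simulates a call that is rate-limited on its first attempt only.
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise RateLimitedError()
    return "ok"


queue = deque([flaky_call, lambda: "other"])
print(drain(queue))
```

With the real library, each queued callable would wrap a call such as rr.get(...), and you would pass RequestsRespectfulRateLimitedError as rate_limit_error. A real worker would also pause or reschedule between passes instead of retrying immediately.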
Both ways of requesting accept a wait kwarg that defaults to False. If switched on and the realm is currently rate-limited, the process will block, wait until it is safe to send requests again and then perform the request. Waiting is perfectly fine for scripts or smaller operations but is discouraged for large, multi-realm, parallel tasks (e.g. background tasks like Celery workers).
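The blocking behavior described above can be approximated by a loop that keeps retrying until the rate limit clears. This is a rough sketch of the concept, not the library's code; `call_when_allowed` and `RateLimitedError` are hypothetical names, and the sleep function is injectable only so the example finishes instantly.

```python
import time


class RateLimitedError(Exception):
    """Stand-in for the library's rate limit exception (hypothetical)."""


def call_when_allowed(call, poll_interval=1.0, sleep=time.sleep):
    """Block until `call` stops raising RateLimitedError, then return its result."""
    while True:
        try:
            return call()
        except RateLimitedError:
            sleep(poll_interval)  # the current thread or worker is stuck here


state = {"failures_left": 2}
naps = []  # records each pause instead of really sleeping


def sometimes_limited():
    if state["failures_left"] > 0:
        state["failures_left"] -= 1
        raise RateLimitedError()
    return "response"


print(call_when_allowed(sometimes_limited, poll_interval=0.5, sleep=naps.append))
print(naps)
```

The loop shows why waiting is a poor fit for worker pools: while it sleeps, the worker cannot pick up tasks for other realms that are not rate-limited.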
Yes
Yes
Yes - Redis calls aren't mocked and google.com gets a few friendly calls
Run them with python -m pytest tests --spec
Yes. The use of Redis allows requests-respectful to go multi-thread, multi-process and even multi-machine while still respecting the maximum requesting rates of registered realms. Operations like Redis' SETEX are key in designing and working with rate-limiting systems. If you are doing Python development, there is a decent chance you already work with Redis, as it is a common choice for Celery's broker and result backend and one of the two major caching options in Web development. If not, you can always keep things clean and run it in a Docker container or even build it from source. Redis has kept a consistent record over the years of being lightweight, solid software.
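To see why an operation like SETEX suits rate limiting, consider writing one expiring key per request: the number of keys still alive equals the number of requests in the current window, and any process that can reach the store sees the same count. The sketch below mimics that with an in-memory stand-in so it runs without a Redis server. `ExpiringKeyStore`, `count_live` and `record_request` are hypothetical, and this is not a description of how the library lays out its keys.

```python
import time
import uuid


class ExpiringKeyStore:
    """In-memory stand-in that mimics SETEX-style keys with a time to live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expiry = {}  # key -> absolute expiry time

    def setex(self, key, ttl_seconds, value=""):
        # The value is ignored in this sketch; only the key's lifetime matters.
        self._expiry[key] = self._clock() + ttl_seconds

    def count_live(self, prefix):
        # Forget expired keys, then count the survivors under the prefix.
        now = self._clock()
        self._expiry = {k: t for k, t in self._expiry.items() if t > now}
        return sum(1 for k in self._expiry if k.startswith(prefix))


def record_request(store, realm, timespan):
    """Write one expiring key per request; live keys equal requests in the window."""
    store.setex(f"{realm}:{uuid.uuid4()}", timespan)


store = ExpiringKeyStore()
for _ in range(3):
    record_request(store, "Github", timespan=60)
print(store.count_live("Github:"))
```

With a real Redis instance the expiry is handled by the server, so no client has to clean up old entries.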
request()...time.sleep(interval). This one allows you to send as many requests as you want, as fast as you want, as long as you are under the maximum requesting rate of your realm.
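The difference between the two approaches shows up in simple arithmetic. The following idealized comparison ignores network latency and assumes a realm of 100 requests per 60 seconds; both function names are hypothetical and exist only for this illustration.

```python
def fixed_interval_duration(request_count, max_requests, timespan):
    """Seconds needed when sleeping a fixed interval between consecutive requests."""
    interval = timespan / max_requests
    return max(request_count - 1, 0) * interval


def windowed_duration(request_count, max_requests, timespan):
    """Idealized seconds needed when bursts are allowed up to the window budget."""
    if request_count <= 0:
        return 0
    full_windows_to_wait = (request_count - 1) // max_requests
    return full_windows_to_wait * timespan


for count in (50, 250):
    print(
        count,
        "requests:",
        fixed_interval_duration(count, max_requests=100, timespan=60),
        "s with a fixed interval,",
        windowed_duration(count, max_requests=100, timespan=60),
        "s with a request budget",
    )
```

A burst that fits inside the budget costs no waiting at all under the budget approach, while the fixed-interval loop pays the sleep on every request.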