PyBites Module of the Week - Requests-cache for Repeated API Calls

Posted by Bob on Tue 14 March 2017 in Modules • 2 min read

Today, a quick article on a nice caching module for working with APIs: Requests-cache.

I stumbled upon this excellent article by RealPython when looking for a way to limit API requests. I needed this while playing with the GitHub API to check for changes to forks of our Challenges repo (you can also see this in the repo, under Graphs > Network, but I was just playing around).

This is not a script that would typically need caching, because I would probably run it once a week, and then it would make just a couple of requests (at this time: ~100 forks / 30 results per call). However, while coding this up, I did not want to call the API over and over again:

"For unauthenticated requests, the rate limit allows you to make up to 60 requests per hour." (GitHub API documentation)

It was also a good exercise to test this module out for a future use case where this does matter.
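As a rough back-of-the-envelope check of the numbers above, here is a small sketch. It only counts the paginated fork listing (the script may make other calls too), and the helper names are made up for illustration:

```python
import math

FORKS = 100            # approximate fork count quoted above
RESULTS_PER_CALL = 30  # results per call, as quoted above
HOURLY_LIMIT = 60      # unauthenticated limit from the GitHub docs quote


def calls_needed(items, per_call):
    """Number of paginated calls needed to list `items` results."""
    return math.ceil(items / per_call)


def runs_per_hour(calls_per_run, limit=HOURLY_LIMIT):
    """How many full runs fit into the hourly limit."""
    return limit // calls_per_run


calls = calls_needed(FORKS, RESULTS_PER_CALL)
print(calls)                 # 4
print(runs_per_hour(calls))  # 15
```

Fifteen runs an hour sounds like a lot, but it goes quickly when you rerun a script after every small change.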

Using requests_cache

First I thought: let's write the output to a file. However, that adds more code. Maybe use a decorator to sleep between requests? However, that slows down my coding/testing. As usual, somebody had already invented the wheel.
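For comparison, the sleep-decorator idea could look roughly like the sketch below. This is not code from the original script; `throttle` is a hypothetical name, and it simply waits until a minimum interval has passed since the previous call:

```python
import functools
import time


def throttle(seconds):
    """Make sure at least `seconds` pass between consecutive calls."""
    def decorator(func):
        last_call = None  # time the previous call finished, None before the first call

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal last_call
            if last_call is not None:
                wait = seconds - (time.monotonic() - last_call)
                if wait > 0:
                    time.sleep(wait)
            result = func(*args, **kwargs)
            last_call = time.monotonic()
            return result

        return wrapper
    return decorator


@throttle(0.05)  # tiny delay for the demo; a real API call would need much more
def ping():
    return "pong"
```

It works, but every test run pays the waiting time, which is exactly the drawback mentioned above.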

Enter Requests-cache. It has an easy, friendly interface (the argument values below are placeholders):

import requests_cache

requests_cache.install_cache('cache_filename', backend='backend', expire_after=expiration_in_seconds)
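To make the placeholders concrete, here is a minimal sketch. The cache name, backend choice and one-hour expiry are assumptions for illustration, and `enable_cache` and `fetch` are hypothetical helpers, not part of the module:

```python
# Concrete stand-ins for the placeholders above; all three values are assumptions.
CACHE_SETTINGS = {
    "cache_name": "github_cache",  # with the sqlite backend this becomes github_cache.sqlite
    "backend": "sqlite",
    "expire_after": 60 * 60,       # seconds; one hour, matching the rate-limit window
}


def enable_cache():
    """Patch requests globally so every requests.get() goes through the cache."""
    import requests_cache  # third-party: pip install requests-cache

    requests_cache.install_cache(**CACHE_SETTINGS)


def fetch(url):
    """Fetch a URL and report whether the response came from the cache."""
    import requests  # third-party: pip install requests

    response = requests.get(url)
    # requests-cache marks cached responses with from_cache; default to False otherwise.
    return response.status_code, getattr(response, "from_cache", False)
```

After `enable_cache()`, calling `fetch(url)` twice for the same URL should report `False` for the first call and `True` for the second, as long as the cached entry has not expired.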

Verify with curl

  • Starting API rate limit (I had already made some calls):

    $ curl -i https://api.github.com/users/whatever 2>/dev/null |grep 'X-RateLimit-Remaining:'
    X-RateLimit-Remaining: 42
  • First run: the results get cached and the DB is created. Cost = 6 calls (1x curl, 5x by the script)

    $ python commits.py 2>&1 > /dev/null
    $ lt cache.sqlite
    -rw-r--r--  1 bbelderb  staff   516K Mar 14 08:03 cache.sqlite
    $ curl -i https://api.github.com/users/whatever 2>/dev/null |grep 'X-RateLimit-Remaining:'
    X-RateLimit-Remaining: 36
  • Second run = cached, cost down to 1 call (= curl)

    $ python commits.py 2>&1 > /dev/null
    $ curl -i https://api.github.com/users/whatever 2>/dev/null |grep 'X-RateLimit-Remaining:'
    X-RateLimit-Remaining: 35
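The bookkeeping in the steps above can also be sketched in Python. The helper names are hypothetical, and the header is matched case-insensitively because it may come back in lowercase depending on the HTTP version:

```python
import re

# Matches the rate-limit header line in raw `curl -i` output, ignoring case.
_REMAINING = re.compile(r"^x-ratelimit-remaining:\s*(\d+)", re.IGNORECASE | re.MULTILINE)


def remaining_calls(raw_headers):
    """Return the remaining-call count from raw response headers, or None if absent."""
    match = _REMAINING.search(raw_headers)
    return int(match.group(1)) if match else None


def cost(before, after):
    """Number of API calls spent between two readings."""
    return before - after


# Values taken from the three curl readings above.
start = remaining_calls("HTTP/1.1 200 OK\r\nX-RateLimit-Remaining: 42\r\n")
print(cost(start, 36))  # 6 -> first run: 1x curl + 5x script
print(cost(36, 35))     # 1 -> second run: only the curl call
```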

Keep in mind

Two noteworthy things that came up in the comments on the article mentioned above:

  • Check the documentation of the API you are working with. Maybe it already provides a way to use caching. In the case of the GitHub API this would be conditional requests:

    Making a conditional request and receiving a 304 response does not count against your Rate Limit, so we encourage you to use it whenever possible.

    Something to try on the next iteration ...

  • You might want to define a fixed output directory for the cache file instead of the default (the current directory), so you do not end up with multiple cache files when running the script from different folders.
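Both notes can be sketched in a few lines. This is a rough illustration and not something I have run against the real script: the default folder name and the helper names are assumptions, and the conditional request uses the standard `If-None-Match` header with a previously stored ETag:

```python
from pathlib import Path


def cache_path(name="github_cache", directory=None):
    """Return an absolute cache location that does not depend on the current directory."""
    # Hypothetical default: a hidden folder in the home directory.
    base = Path(directory) if directory else Path.home() / ".pybites_cache"
    base.mkdir(parents=True, exist_ok=True)
    return str(base / name)


def conditional_headers(etag=None):
    """Headers for a conditional GET; empty when no ETag has been stored yet."""
    return {"If-None-Match": etag} if etag else {}


def fetch_if_changed(url, etag=None):
    """Return (data, etag); data is None when the server answers 304 Not Modified."""
    import requests  # third-party: pip install requests

    response = requests.get(url, headers=conditional_headers(etag))
    if response.status_code == 304:
        return None, etag  # nothing changed since the stored ETag
    response.raise_for_status()
    return response.json(), response.headers.get("ETag")
```

The result of `cache_path()` can then be passed as the first argument to `requests_cache.install_cache`, so the SQLite file always ends up in the same place.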

More info

See the module's documentation for more info.

Have you used this module? And/or what do you use for caching API requests?

Keep Calm and Code in Python!

-- Bob
