Parsing Twitter Geo Data and Mocking API Calls by Example

Bob, Sat 17 June 2017, Testing

100days, API, data, geo, mock, pickle, testing, tweepy, twitter, unittest

"Is this Bob or Julian?!" ... yeah tweeting from our shared @pybites Twitter account can be confusing! So I made a little script to parse the location of our tweets. Then I extended it to make it testable. I wrote a decorator to cache a couple of API outputs to be used with the unittest.mock patch decorator I learned about. A simple script turned into a good learning exercise.

Practice leads to new discoveries

That's the cool thing: even a relatively easy exercise like parsing some Twitter data can grow into something more interesting when you extend your goals, in this case: "how to unittest an API?". I will do a dedicated article on mocking when I learn some more, but for now I wanted to share how I went about testing the Twitter API.

1. whotweeted.py

First of all, the script: whotweeted.py uses tweepy to get the tweet metadata from the Twitter API and parses the country code (try tweet.place.country_code ...).

If Spain it's me, if Australia it's Julian:

$ python whotweeted.py https://twitter.com/pybites/status/875677559970770944
Bob tweeted it out

$ python whotweeted.py https://twitter.com/pybites/status/875639674244444160
Julian tweeted it out

It raises some exceptions if we input or retrieve bad data. It makes the program longer but more robust:

$ python whotweeted.py https://twitter.com/KirkDBorne/status/876176282542891008
Not a pybites tweet

$ python whotweeted.py https://twitter.com/pybites/status/844092059988508673
Location not set on tweet

$ python whotweeted.py https://twitter.com/pybites/status/844092059988508abc
Problem getting tweet:
[{'code': 144, 'message': 'No status found with that ID.'}]

Note that tweet location is not enabled by default; you have to turn it on (see here).
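To make the country-code logic concrete, here is a minimal sketch of how the mapping could look. It is not the actual whotweeted.py: the names who_tweeted, fake_tweet and COUNTRY_TO_NAME are made up for illustration, and the fake tweet is a plain SimpleNamespace that only mimics the user.screen_name and place.country_code attributes of a tweepy status object.

```python
from types import SimpleNamespace

# Assumption: Spain means Bob, Australia means Julian, as described above.
COUNTRY_TO_NAME = {"ES": "Bob", "AU": "Julian"}


def who_tweeted(tweet, account="pybites"):
    """Return the name that matches the tweet's country code.

    Raises ValueError for tweets from another account, tweets without
    a location, and country codes we do not know about.
    """
    if tweet.user.screen_name.lower() != account:
        raise ValueError(f"Not a {account} tweet")
    if tweet.place is None:
        raise ValueError("Location not set on tweet")
    code = tweet.place.country_code
    if code not in COUNTRY_TO_NAME:
        raise ValueError(f"Unexpected country code: {code}")
    return COUNTRY_TO_NAME[code]


def fake_tweet(screen_name, country_code=None):
    """Hypothetical helper: build an object shaped like a tweet for the demo."""
    place = SimpleNamespace(country_code=country_code) if country_code else None
    user = SimpleNamespace(screen_name=screen_name)
    return SimpleNamespace(user=user, place=place)


print(who_tweeted(fake_tweet("pybites", "ES")), "tweeted it out")
```

Keeping the mapping in a function that takes a tweet object (rather than a URL) is what makes it easy to feed in canned data later on.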

2. Use mocking to test API calls

This is cool, but how can we test our assumptions? We don't want to call the API each time we run our unit tests. Enter mocking:

In short, mocking is creating objects that simulate the behaviour of real objects.

I learned about the unittest.mock patch decorator which I use like this:

@patch.object(tweepy.API, 'get_status', return_value=get_tweet('AU'))
...
test
...

@patch.object(tweepy.API, 'get_status', return_value=get_tweet('ES'))
...
another test
...

Test script is here.

This replaces the get_status method of the tweepy.API object with a mock. As return_value I load one of Julian's or my tweets that I pickled to a data directory. I'm not sure whether a library like Faker could have simplified this. Because I wanted the full tweepy response object, I added a cache decorator to whotweeted that caches (pickles) the response data (TODO: put this code in a separate setup script).
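A pickling cache decorator along these lines could look roughly as follows. This is a sketch under assumptions, not the decorator from whotweeted: pickle_cache and fetch_tweet are hypothetical names, the cache key is simply the first argument, and a temporary directory stands in for the data directory.

```python
import functools
import pickle
import tempfile
from pathlib import Path


def pickle_cache(cache_dir):
    """Cache a function's return value on disk, one pickle file per key."""
    cache_dir = Path(cache_dir)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(key, *args, **kwargs):
            cache_file = cache_dir / f"{key}.pkl"
            # Serve the pickled response if we have seen this key before.
            if cache_file.exists():
                with cache_file.open("rb") as f:
                    return pickle.load(f)
            # Otherwise do the real call and store its result for next time.
            result = func(key, *args, **kwargs)
            cache_dir.mkdir(parents=True, exist_ok=True)
            with cache_file.open("wb") as f:
                pickle.dump(result, f)
            return result
        return wrapper
    return decorator


calls = []  # records how often the "API" is really hit
data_dir = tempfile.mkdtemp()


@pickle_cache(data_dir)
def fetch_tweet(status_id):
    """Hypothetical stand-in for a real API call."""
    calls.append(status_id)
    return {"id": status_id, "country_code": "ES"}


first = fetch_tweet(875677559970770944)
second = fetch_tweet(875677559970770944)  # read from the pickle file
print(first == second, len(calls))
```

The pickled files can then be loaded by a small helper in the test script and handed to the mock as return_value.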

Not only is the test script much faster (no internet dependency or latency), it also avoids repeated calls to the API (not sure about Twitter, but some APIs have pretty strict quotas).
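To show that a patched test really stays off the network, here is a small self-contained sketch. It uses a hypothetical FakeAPI class in place of tweepy.API so it runs without tweepy or credentials; country_of and canned_tweet are likewise made-up names. The unpatched get_status raises, so the tests can only pass if the mock is in place.

```python
import unittest
from types import SimpleNamespace
from unittest.mock import patch


class FakeAPI:
    """Hypothetical stand-in for tweepy.API; the real method would hit the network."""

    def get_status(self, status_id):
        raise RuntimeError("real API call attempted")


def country_of(api, status_id):
    """Code under test: fetch a tweet and return its country code."""
    return api.get_status(status_id).place.country_code


def canned_tweet(country_code):
    """Hypothetical helper: a minimal object shaped like a tweet."""
    return SimpleNamespace(place=SimpleNamespace(country_code=country_code))


class CountryTest(unittest.TestCase):

    @patch.object(FakeAPI, "get_status", return_value=canned_tweet("AU"))
    def test_australia(self, mock_get_status):
        self.assertEqual(country_of(FakeAPI(), 1), "AU")
        # The mock was called exactly once and the real method never ran.
        mock_get_status.assert_called_once_with(1)

    @patch.object(FakeAPI, "get_status", return_value=canned_tweet("ES"))
    def test_spain(self, mock_get_status):
        self.assertEqual(country_of(FakeAPI(), 2), "ES")
        mock_get_status.assert_called_once_with(2)


# Run the tests programmatically so the snippet also works when pasted into a REPL.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountryTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The patch decorator passes the mock in as an extra test argument, which is handy for checking how the code under test called it.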

To learn more about mocking in Python, check out the mock object library or, if you use pytest, see pytest-mock. I have to practice some more with this; I will do a follow-up article on mocking at some point ...


Keep Calm and Code in Python!

-- Bob
