One of the biggest jumps you make in your Python learning is when you start dealing with external data.
With this post we wanted to demonstrate a few ways you can work with the more common data formats. Why? Because it’s a big deal when you’re starting out! Furthermore, unless you do it often enough it’s easy to forget how, so bookmark this baby and reference it!
The links below are to articles and scripts we’ve actually written as well as to external resources we’ve found helpful.
If you’re going to play with CSV files, DictReader is your friend. It converts each row into a dictionary keyed by the column names.
Reading the contents of a CSV file:

    for entry in csv.DictReader(f, fieldnames=FIELDS):
        yield entry
Opening and reading the CSV using a with statement:

    def read_csv(cf=CSV_FILE):
        with open(cf, 'r') as csvfile:
            return list(csv.DictReader(csvfile))
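To try the two snippets above without a file on disk, here is a minimal sketch. The sample data and the function names `read_rows` and `read_headerless` are made up for illustration, and `io.StringIO` stands in for an open file.

```python
import csv
import io

# Hypothetical sample data; io.StringIO behaves like an open text file.
WITH_HEADER = "title,year\nAlien,1979\nHeat,1995\n"
NO_HEADER = "Alien,1979\nHeat,1995\n"


def read_rows(csvfile):
    """Return every row as a dict, taking the keys from the header line."""
    return [dict(row) for row in csv.DictReader(csvfile)]


def read_headerless(csvfile, fieldnames):
    """Same idea for a file without a header line: supply the keys yourself."""
    return [dict(row) for row in csv.DictReader(csvfile, fieldnames=fieldnames)]


rows = read_rows(io.StringIO(WITH_HEADER))
same_rows = read_headerless(io.StringIO(NO_HEADER), ["title", "year"])
print(rows[0]["title"], rows[0]["year"])
```

Note that every value comes back as a string, so `year` needs an explicit `int()` if you want to do arithmetic with it.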
JSON is a must these days, especially if you want to work with APIs.
Simple read of JSON data pulled down by requests:

    data = json.loads(r.text)
One of our first articles used a with statement to load in JSON data:

    def load_json(json_file):
        with open(json_file) as f:
            return json.loads(f.read())
Our Challenge 07 review used yield to return the JSON data:

    def get_tweets(input_file):
        with open(input_file) as f:
            for line in f.readlines():
                yield json.loads(line)
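The generator above assumes one JSON document per line. A small variation, sketched here with made-up sample data and a hypothetical `iter_json_lines` name, iterates over the file object directly instead of calling `readlines()` (so the whole file is never held in memory) and skips blank lines:

```python
import io
import json


def iter_json_lines(lines):
    """Yield one parsed object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines instead of failing on them
            yield json.loads(line)


# Hypothetical data: two JSON documents separated by a blank line.
sample = io.StringIO('{"id": 1, "text": "hello"}\n\n{"id": 2, "text": "world"}\n')
tweets = list(iter_json_lines(sample))
print(len(tweets))
```

Any iterable of strings works as input, so the same function accepts an open file, a list of lines, or a `StringIO` as shown.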
Using the .json() method on a requests response:

    data = requests.get(API_URL.format(city, API_KEY)).json()
- You can use dump to write to a file, as per this Stack Overflow question.
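As a quick reference for the four functions involved, here is a minimal round-trip sketch. The `movie` record and the temporary file path are made up for illustration.

```python
import json
import os
import tempfile

movie = {"title": "Alien", "year": 1979}  # hypothetical sample record

# Strings: dumps() serialises to a str, loads() parses a str.
text = json.dumps(movie)
parsed = json.loads(text)

# Files: dump() writes to an open file object, load() reads one back.
path = os.path.join(tempfile.mkdtemp(), "movie.json")
with open(path, "w") as f:
    json.dump(movie, f, indent=2)

with open(path) as f:
    round_tripped = json.load(f)

print(round_tripped)
```

The trailing "s" is the thing to remember: `dumps` and `loads` work on strings, `dump` and `load` work on file objects.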
We’ve learned to love SQLite recently and have found ourselves using it all the time. It’s worth picking up as it’s such an easy and great way of getting a persistent DB!
A recent use was converting a CSV of movies to an SQLite database.
We enjoyed this thorough sqlite Python tutorial by Sebastian Raschka too.
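To give a feel for the CSV-to-SQLite idea, here is a minimal sketch using only the standard library. It is not the script mentioned above: the table layout and sample rows are assumptions, and it uses an in-memory database so nothing is written to disk.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a movies CSV file.
MOVIES_CSV = "title,year\nAlien,1979\nHeat,1995\n"

conn = sqlite3.connect(":memory:")  # pass a filename here for a persistent DB
conn.execute("CREATE TABLE movies (title TEXT, year INTEGER)")

reader = csv.DictReader(io.StringIO(MOVIES_CSV))
records = ((row["title"], int(row["year"])) for row in reader)

with conn:  # commits on success, rolls back if an exception is raised
    conn.executemany("INSERT INTO movies (title, year) VALUES (?, ?)", records)

result = conn.execute("SELECT title, year FROM movies ORDER BY year").fetchall()
print(result)
conn.close()
```

The `?` placeholders let sqlite3 handle quoting for you, which is safer than building the SQL string by hand.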
XML! The data format of choice for RSS feeds. Can be a bit troublesome at times but always worth the effort.
Example of using xml.etree.ElementTree to parse the Safari RSS feed:
Code Link - Worth checking out the full code but the gist of it is…
    for item in doc.iterfind('channel/item'):
        ...
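If you want to see that loop do something without fetching a real feed, here is a self-contained sketch. The RSS snippet is a made-up minimal document, not the Safari feed, and it pulls out just the title and link of each item.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal RSS document for illustration only.
RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title><link>https://example.com/1</link></item>
    <item><title>Second post</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""

root = ET.fromstring(RSS)  # root is the <rss> element

# iterfind() takes a path relative to the element it is called on.
items = [
    (item.findtext("title"), item.findtext("link"))
    for item in root.iterfind("channel/item")
]
print(items)
```

For a feed saved on disk, `ET.parse(filename)` gives you a tree object with the same `iterfind` method.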
Using feedparser to pull specific XML tags and add to a list:

    feed = feedparser.parse(FEED_FILE)
    for entry in feed['entries']:
        Game = (entry['title'], entry['link'])
        games_list.append(Game)
We’ve had numerous challenges over the past few months where the solutions involved these data formats. Here are a few of the noteworthy ones:
This was definitely a great challenge. Check out the multiple community contributions for some examples of using sqlite and XML in functional scripts written by your fellow Pythonistas.
Learn By Doing
Now that you have the info, as we said in our Learn By Doing article, open up a vim session and get coding!
One awesome, shameless plug of a way to do this would be to come up with a solution for Code Challenge 19. Playing with an API means you’ll more than likely need to use quite a few of these formats.
We’d love to hear if you have any Pythonic tips on using these formats too so leave a comment!
And as always, Keep Calm and Code in Python!
-- Julian and Bob
See an error in this post? Please submit a pull request on Github.