OpenStreetMaps, Overpass API and Python

By on 17 April 2024

OpenStreetMaps (OSM) is known for being an open source project that allows people to browse the world map and to plan routes. However it is more than that. Among others it provides a read-only API that allows users to query for very diverse map data: Overpass API

Data structure of OSM

To understand the structure of the queries, we first need to understand how OSM saves its data. OSM uses basic data structures: node, way, and relation. Each element can have multiple tags, which consist of a key-value pair and they help specify the features of each element.

Nodes also save a latitude and longitude, indicating where the node is located on the world. Ways contain a list of the nodes contained in the way. Finally, relations are groups of nodes, ways and relations defining a logical relationship for this elements, for example being part of the same street, neighbourhood, city, postal code area, etc.

If we go to https://www.openstreetmap.org/, right-click on some point of the map and then click on Query Features, we get a list of the elements of the area. By clicking on one of them, we can explore it’s characteristics. As an example, we can explore the node 1918872432, which we can see on the left picture. Looking at the data we can see that it is located at longitude 52.5024783 and latitude 13.4155630. On the tags we can see it has the name Oranienplatz and among others it contains a bench and a bin. The key highway contains the value bus_stop, which tells us it is a bus stop.

If we explore the way 1228261444 (right picture) we can see a list of all the nodes it contains. In addition on the key highway we can see it’s of the type footway, telling us it’s a way intended for pedestrians. In addition we can see that it is a lit way.

Many keywords like highway are normalized. As an example, we can find a list of standardized values for highway here.

Overpass Queries

Querying a single element

We can query the previous information using Overpass QL. For the query to get the information about the way 1228261444 we use way(1228261444). The default output format is in xml, so if we want to have it in json, we need to add [out:json] in front of it. Other possible formats are csv, for which we need to define the list of columns for it to work. We mark the end of the query with an out, so the complete query is:

[out:json];
way(1228261444);
out;

To execute the query we need to run it against an instance, thus the complete URL looks like this:

https://overpass-api.de/api/interpreter?data=[out:json];way(1228261444);out;

We can copy the URL directly on the browser, or we can use wget to see the result:

wget -O my_file_name.osm "https://overpass-api.de/api/interpreter?data=[out:json];way(1228261444);out;"

To query a single node or relation works in a parallel way.

Output format

{
  "version": 0.6,
  "generator": "Overpass API 0.7.61.5 4133829e",
  "osm3s": {
    "timestamp_osm_base": "2024-03-06T20:54:14Z",
    "copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
  },
  "elements": [
{
  "type": "way",
  "id": 1228261444,
  "nodes": [
    9010521154,
    9010521155,
    11390038869,
    11390038870,
    9394918823,
    9394918824,
    4613855449,
    11390038871,
    11390038872,
    11390038873,
    9394918822
  ],
  "tags": {
    "footway": "sidewalk",
    "highway": "footway",
    "lit": "yes",
    "surface": "paving_stones"
  }
}
  ]
}

Here we can see two main blocks: the first with metadata like the version or the timestamp, and the second one with a list of the elements found with the query. Even if we queried for one specific element of type way, we still get it as a set formatted as list with length 1.

Querying many elements

If we don’t want the information of one singe way, but the information of all ways in an area, we need to change the query a little bit:

[out:json];
way(52.49505089381484,13.401961459729343,52.509912981033516,13.411502705004548);
out;

Here, instead of requesting one single way, we put 4 coordinates in the parenthesis: boundaries in the south, west, north and east.

This returns a set of all the ways contained in the bounding box defined by the given coordinates.

It is not always needed to know the coordinates to get information for an area. With the query

[out:json];
area[name="Zürich"];
out;

we get all areas called Zürich. In this example we can see that we can use special characters without problem. Most interestingly are both square parentheses behind area [name=”Zürich”]. This is a filter where we only get the areas that contain a tag with key name and value Zürich.

We can use more than one filter, for example to get only the element representing the city Zürich, in contrast to the canton Zürich. In that case we can just place them one after another like this:

[out:json];
area[name=”Zürich”][place=”city”];
out;

We can filter by elements containing a certain tag independently from it’s value.

[out:json];
area[name=”Zürich”][place];
out;

This returns a set of all areas containing the tag place, which in this case contains the same element as above.

Subqueries

We can use the area we got in the previous query to get a set of certain elements inside said area. For example, if we want to get all public toilets in Zürich, we can use the following query:

[out:json];
area[name="Zürich"][place="city"];
node[amenity="toilets"](area);
out;

First we get the area corresponding to the city Zürich, and then we get all nodes whose tag key amenity contains the value toilets. By passing the area between the round parenthesis we limit the set of nodes to the previous area.

The tag amenity describes a large number of important facilities from cafés, public bookcases, benches, vending machines, ATMs, etc.

Other useful tag names to find interesting places are tourism and historic.

TIL: there is also a value for public baking ovens, baking_oven. Apparently there are 2 of those which Zürich, which looks to me like a good reason to visit that city again. Seriously, how awesome is that?

Using overpass in Python

There are two ways to execute overpass queries with python that we will explore. One possibility is using the library and the other the library overpass-api-python-wrapper.

Requests

Example:

import requests

overpass_url = 'https://overpass-api.de/api/interpreter'
overpass_query = '''
[out:json];
area[name="Zürich"][place="city"];
node[amenity="toilets"](area);
out;
'''

response = requests.get(overpass_url, params={'data': overpass_query})

This way we send to the overpass interpreter the query in overpass_query and save the result in the variable response. If no error happened, response.text contains the data as string so we can decode it with:

decoded_data =json.loads(response.text)

If we had requested the information in xml or csv format, we would decode it accordingly.

Overpass-api-python-wrapper

The overpass-api-python-wrapper allows us to execute queries in a more concise and elegant way. First we need to initialize an overpass object.

import overpass 
api = overpass.API(endpoint="https://overpass.myserver/interpreter", timeout=25)

All parameters are optional and the default values are the ones shown here.

query = 'area[name="Zürich"][place="city"];node[amenity="toilets"](area);' 
response = api.get(query, responseformat='json')

By default, response contains a dictionary representing the returned json object, however we can define the output format with the parameter responseformat. With this query, the response is

{
    "features": [
        {
            "geometry": {
                "coordinates": [
                    8.534562,
                    47.350221
                ],
                "type": "Point"
            },
            "id": 75743585,
            "properties": {
                "amenity": "toilets",
                "capacity": "4",
                "check_date:wheelchair": "2022-06-06",
                "fee": "no",
                "name": "Landiwiese",
                "opening_hours": "24/7",
                "operator": "Z\u00fcriWC",
                "unisex": "yes",
                "wheelchair": "no"
            },
            "type": "Feature"
        },
etc
}

If we use xml as format, response will be a string, and with csv it will be a list.

Other Useful tools:

We can find a complete manual on Overpass QL here, where we can read about more awesome possibilities that Overpass has to offer to us.

You can also try out queries with overpass-turbo. However it will only display the returned nodes on the map, but you can also get the returned response as text in the tab Data.

Finally, you can find another cool tutorial for using overpass with python here. In this case, the author shows us new possibilities with overpass, including how to plot the queried data in plots.

I hope you could awake your curiosity about the potential that OpenStreetMaps and Overpass offers. For me in meant not only to learn about a powerful tool, but while exploring it and playing with it it allowed me to learn more about the possibilities our cities and towns have to offer to us. Did you know that there are public showers, including in some train stations? Or that there are public ovens?

What I finally have to say is: keep calm and keep building cool stuff!

Want a career as a Python Developer but not sure where to start?