How to Use Pdb to Debug Your Code

Bob, Tue 24 October 2017, Modules

The larger part of our coding time is spent reading and debugging code that is already written. For this, pdb is an indispensable module in your Python toolbox. In this article I show you the most common options and some practical examples.

How to invoke the debugger?

You can run pdb as a script on your program (python -m pdb followed by the script name), which puts you right at the start:

$ python -m pdb
> /Users/bbelderb/code/<module>()
-> from urllib.request import urlopen

More commonly you want to break into the debugger from a running program. To do this use this one-liner at the location where you want to start debugging:

import pdb; pdb.set_trace()

Note that Python 3.7 improves on this by adding a new built-in function called breakpoint(); see PEP 553.
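As a minimal sketch of how breakpoint() behaves (the average function is a made-up example, not code from this article): by default the call does the same as the pdb.set_trace() one-liner, and setting the PYTHONBREAKPOINT environment variable to 0 turns it into a no-op. That setting is what lets the snippet below run to completion without pausing.

```python
import os

# Hypothetical example function, only here to host a breakpoint() call.
def average(numbers):
    breakpoint()  # Python 3.7+; by default this calls pdb.set_trace()
    return sum(numbers) / len(numbers)

# PYTHONBREAKPOINT=0 disables breakpoint() entirely (see PEP 553).
# Remove this line to actually drop into the debugger.
os.environ["PYTHONBREAKPOINT"] = "0"

result = average([1, 2, 3])
print(result)
```

The nice part of this design is that you can leave breakpoint() calls in place while experimenting and switch them off from the environment instead of editing the code.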

Common switches

There are many!

You will probably use only a few, though, and pdb lets you conveniently use their one-letter shortcuts.

The difference between next (n) and step (s) is that step stops inside a called function, while next executes called functions at (nearly) full speed, only stopping at the next line in the current function.
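To see the difference without typing at a prompt, here is a rough sketch that drives pdb with scripted commands. The helper, caller and run_scripted names are invented for this illustration, and the exact transcript text may vary between Python versions.

```python
import io
import pdb

def helper():
    return 42

def caller():
    value = helper()
    return value

def run_scripted(commands):
    """Run caller() under pdb, feeding it commands as if typed at the prompt."""
    output = io.StringIO()
    debugger = pdb.Pdb(stdin=io.StringIO(commands), stdout=output,
                       nosigint=True, readrc=False)
    debugger.runcall(caller)
    return output.getvalue()

# pdb is expected to stop first on the line "value = helper()".
next_transcript = run_scripted("n\nc\n")  # n: run helper() without stopping in it
step_transcript = run_scripted("s\nc\n")  # s: stop at the call into helper()

print(next_transcript)
print(step_transcript)
```

In the step transcript pdb announces that it entered helper() with a --Call-- marker, while the next transcript moves straight on to the return line of caller().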

Single-letter variables are bad for code readability, and the clash with common pdb shortcuts is another reason to avoid them.
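If you do end up with a variable called n, typing n at the prompt runs next instead of showing its value. A small sketch of two ways around that, using pdb's p command and the ! prefix (the increment function is a hypothetical example, and the commands are scripted so the snippet runs unattended):

```python
import io
import pdb

# Hypothetical function whose parameter name collides with pdb's n(ext).
def increment(n):
    result = n + 1
    return result

# "p n" prints the variable, "!n" evaluates n as Python, "c" continues.
commands = "p n\n!n\nc\n"
output = io.StringIO()
debugger = pdb.Pdb(stdin=io.StringIO(commands), stdout=output,
                   nosigint=True, readrc=False)
debugger.runcall(increment, 5)

transcript = output.getvalue()
print(transcript)
```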

Hello World example

OK, enough theory, let's write some code:

def sum(val1, val2):
    val2 = 0
    newval = val1 + val2
    return newval

values = range(1, 11)
total = 0

import pdb; pdb.set_trace()

for val in values:
    val = sum(val, 1)
    total += val

assert total == 65

Yes, it's silly and the bug is obvious, but the goal is to show pdb.

As you can see, I already set the breakpoint. When I run this code it drops into the debugger and shows me the next line to be executed:

> /Users/bbelderb/code/<module>()
-> for val in values:

I can print variables:

(Pdb) values
range(1, 11)

Stepping through the for loop:

> /Users/bbelderb/code/<module>()
-> for val in values:
(Pdb) n
> /Users/bbelderb/code/<module>()
-> val = sum(val, 1)
(Pdb) val
(Pdb) n
> /Users/bbelderb/code/<module>()
-> total += val
(Pdb) val
(Pdb) n
> /Users/bbelderb/code/<module>()
-> for val in values:
(Pdb) n
> /Users/bbelderb/code/<module>()
-> val = sum(val, 1)
(Pdb) val
(Pdb) l
9   import pdb; pdb.set_trace()
11      for val in values:
12          val = sum(val, 1)
13  ->      total += val
15      assert total == 65

You could move the breakpoint into the function, but as there is so little code I just use s(tep):

> /Users/bbelderb/code/<module>()
-> for val in values:
(Pdb) n
> /Users/bbelderb/code/<module>()
-> val = sum(val, 1)
(Pdb) s
> /Users/bbelderb/code/
-> def sum(val1, val2):
(Pdb) s
> /Users/bbelderb/code/
-> val2 = 0
(Pdb) s
> /Users/bbelderb/code/
-> newval = val1 + val2
(Pdb) val1
(Pdb) val2
(Pdb) n
> /Users/bbelderb/code/
-> return newval
(Pdb) newval

It is obvious that val2 gets explicitly set to 0, but if it were less obvious, inspecting the variables might be all you need.
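For completeness, one possible fix, assuming the intent was simply to add the two arguments: drop the line that overwrites val2. The function is renamed to add_values here (my choice, not the article's) to avoid shadowing the built-in sum.

```python
def add_values(val1, val2):
    # The stray "val2 = 0" reset is gone, so val2 is actually used.
    return val1 + val2

values = range(1, 11)
total = 0

for val in values:
    total += add_values(val, 1)

# 1..10 sums to 55, plus 1 for each of the 10 values.
assert total == 65
print(total)
```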

Real World example

Here is another example I found on Stack Overflow (SO). I shortened the code a bit to keep it simple:

from urllib.request import urlopen
from xml.etree.ElementTree import parse

def getbuses():
    u = urlopen('')  # URL left empty, as in the original listing
    data = u.read()
    f = open('rt22.xml', 'wb')
    f.write(data)
    f.close()
    doc = parse('rt22.xml')
    running_buses = {}
    for bus in doc.findall('bus'):
        idbus = int(bus.findtext('id'))
        lat = float(bus.findtext('lat'))
        lon = float(bus.findtext('lon'))
        direction = str(bus.findtext('d'))
        running_buses[idbus] = {lat, lon, direction}
    return running_buses

def print_routes(running_buses):
    print('Running buses on route 22:\n')
    for b, (lat, lon, direction) in running_buses.items():
        print('Bus number: {}'.format(b))
        print('- Latitude: {}'.format(lat))
        print('- Longitude: {}'.format(lon))
        print('- Direction: {}'.format(direction))

def main():
    running_buses = getbuses()
    print_routes(running_buses)

if __name__ == '__main__':
    main()

This leads to weird results:

Bus number: 1906
- Latitude: 41.9041748046875
- Longitude: -87.63142395019531
- Direction: North Bound
Bus number: 1932
- Latitude: 41.968283335367836
- Longitude: South Bound
- Direction: -87.66738806830512
Bus number: 1910
- Latitude: -87.67295837402344
- Longitude: 42.01838684082031
- Direction: North Bound

This is again a pretty simple use case for pdb:

> /Users/bbelderb/code/
-> for bus in doc.findall('bus'):
(Pdb) n
> /Users/bbelderb/code/
-> running_buses[idbus] = {lat, lon, direction}
(Pdb) idbus
(Pdb) lat
(Pdb) lon
(Pdb) direction
'North Bound'
(Pdb) n

The variables seem correct, but if I print the data structure I see they appear in a different order:

> /Users/bbelderb/code/
-> for bus in doc.findall('bus'):
(Pdb) running_buses
{1906: {'North Bound', 41.90836715698242, -87.63148498535156}}

As explained in the SO thread, this is because the code uses a set instead of a tuple; a set does not preserve order.
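A minimal sketch of the fix, using the values from the pdb session above: storing a tuple keeps latitude, longitude and direction in the order they were written, so the unpacking in the print loop lines up again. The bus_id, bus_lat, bus_lon and bus_direction names are just for this illustration.

```python
# Values taken from the pdb session shown earlier.
lat, lon, direction = 41.90836715698242, -87.63148498535156, 'North Bound'

running_buses = {}
running_buses[1906] = (lat, lon, direction)  # tuple instead of set

for bus_id, (bus_lat, bus_lon, bus_direction) in running_buses.items():
    print('Bus number: {}'.format(bus_id))
    print('- Latitude: {}'.format(bus_lat))
    print('- Longitude: {}'.format(bus_lon))
    print('- Direction: {}'.format(bus_direction))
```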

I realize this example does not show much pdb magic, so maybe we will follow up with A. a code challenge where you use pdb in your next debugging exercise, or B. a video recorded while we hunt down a nasty bug ourselves. To be continued ...

Conclusion and resources

As you can see, this is an essential skill for any developer. Print statements and unittest can get you far, but the moment inevitably comes when you have to catch bugs in the act.

For more info, check out the docs or Doug Hellmann's PyMOTW series, which has very extensive coverage of pdb.

But that might be a lot of reading. You can also get a concise overview by watching Clayton Parker's PyCon talk: So you think you can PDB? It shows a lot of good examples and it piqued my interest to try out pdb++.

Keep Calm and Code in Python!

-- Bob

