I was working on one of the exercises on the Pybites platform (Bite 29) and ran into behavior I didn’t understand. As part of my solution, I needed to check whether each of a set of inputs was alphanumeric. All but one test passed, and I couldn’t tell why that one failed, so I looked into it and would like to share what I found.
Note – if you haven’t completed the Bite, the solution might be partially spoiled below so maybe you want to do the exercise before reading on.
Membership testing is a core feature of Python, but sometimes what we expect isn’t what we get, and it’s important to understand why that is. Let’s take a look at the membership test operations in the context of characters and strings.
The membership test operations (as the name implies) test whether one object is a member of another, and evaluate to either True or False. For example, consider the expression x in s. It evaluates to True if x is present in s, otherwise it evaluates to False.
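To make that concrete, here is a small sketch of the in and not in operators on two built-in container types. The variable names and sample data are made up for illustration and are not part of the Bite.

```python
# Lists: `in` compares the left operand against each element.
numbers = [1, 2, 3]
has_two = 2 in numbers         # True
has_five = 5 in numbers        # False
lacks_five = 5 not in numbers  # True: `not in` is the negated test

# Dicts: `in` looks at the keys only, never the values.
ages = {'ann': 31, 'bob': 27}  # hypothetical sample data
has_key = 'ann' in ages        # True
has_value = 31 in ages         # False

print(has_two, has_five, lacks_five, has_key, has_value)
# prints: True False True True False
```

What counts as "present" depends on the type on the right-hand side, which is exactly where strings become surprising.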
I was testing whether characters char were alphanumeric by checking for their membership in a string of all the alphanumeric characters alpha (so, a-z + A-Z + 0-9) and encountered the issue that – although I didn’t intend for it to – an empty string’s membership was evaluating to True.
>>> char = ''
>>> alpha = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
>>> alpha
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
>>> char in alpha
True
This is because, when both operands are strings, the membership test checks whether char is a substring of alpha, and according to the documentation an empty string is always considered to be a substring of any other string.
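A short sketch of what substring semantics means in practice, using the same alpha string as above. The result variable names are only for illustration. Note that the empty string is not the only unintended match: any contiguous run of characters from alpha passes too.

```python
alpha = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'

# For strings, `in` asks "is the left operand a substring of the right?"
single_char = 'a' in alpha      # True: 'a' occurs in alpha
multi_char = 'abc' in alpha     # True: 'abc' is a contiguous run in alpha
not_contiguous = 'ac' in alpha  # False: 'a' and 'c' are not adjacent in alpha
empty = '' in alpha             # True: '' is a substring of every string
empty_in_empty = '' in ''       # True, even when the right side is empty

# The language reference gives y.find(x) != -1 as an equivalent test for
# strings; searching for '' finds a match at index 0.
find_result = alpha.find('')    # 0
```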
Instead, what I intended to test was the membership of an element char in a collection of individual elements considered alphanumeric, and for that, Python has a number of solutions. In my case, though, I wanted to use the set of characters given in the exercise. The solution* is to list-ify that string of characters, then test for the membership of char in that list of elements, like so:
>>> char = ''
>>> alpha = list('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789')
>>> alpha
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B',
'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P',
'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '0', '1', '2', '3',
'4', '5', '6', '7', '8', '9']
>>> char in alpha
False
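*Editor’s note: Sets are more performant than lists for lookups because they are hash-based: a set lookup takes roughly constant time on average, while a list lookup compares against the elements one by one. If you’re interested in reading more about this, you can start here.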
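As a rough illustration of the editor’s note, here is a minimal sketch of the same check with a set instead of a list. The helper name is_alnum_char is hypothetical and this is not the Bite’s reference solution. It also assumes the exercise’s character string matches string.ascii_letters + string.digits, which is true of the alpha string shown above.

```python
import string

# Same 62 characters as the alpha string above, built from stdlib constants.
alpha = string.ascii_letters + string.digits
alpha_set = set(alpha)  # one element per character


def is_alnum_char(char):
    """Return True only if char equals one of the single characters in alpha_set."""
    # Set membership compares whole elements, so '' and 'ab' are not members.
    return char in alpha_set


print(is_alnum_char('a'), is_alnum_char(''), is_alnum_char('ab'))
# prints: True False False
```

For 62 characters the speed difference is negligible. The practical gain is that both the list and the set compare whole elements rather than searching for substrings.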
I still have much to learn in Python, but the Pybites platform has already been a great educational resource for me, and I encourage anyone who is interested in developing their Python skills to try it out!