Exploring the Modern Python Command-Line Interface

By on 27 April 2020

The goal here is simple: help the new Python developer with some of
the history and terminology around command-line interfaces (CLIs)
and explore how we write these useful programs in Python.

In the Beginning…

First, a Unix perspective on command-line interface design.

Unix is a computer operating system and is the ancestor of Linux
and macOS (and many other operating systems as well). Before
graphical user interfaces, the user interacted with the computer
via a command-line prompt (think of today’s bash
environment). The primary language for developing these programs
under Unix is C, which has amazing power for both good and
evil.

 "C gets sh*t done."

     - a handsome and yet strangely anonymous C programmer

So it behooves us to at least understand the basics of a C program.

Assuming you didn’t read that, the basic architecture of a C program
is a function called main whose signature looks like:

   int main(int argc, char **argv)
   {
   ...
   }

This shouldn’t look too strange to a Python programmer. C functions
have a return type first, a function name, and then the typed
arguments inside the parentheses. Lastly, the body of the function
resides between the curly braces. The function name main is
how the runtime linker (the program that constructs and runs
programs) decides where to start executing your program. If you
write a C program and it doesn’t include a function named
main, it will not do anything. Sad.

The function argument variables argc and argv together describe
a list of strings which were typed by the user on the command-line
when the program was invoked. In typical terse Unix naming
tradition, argc means argument count and argv means argument
vector. Vector sounds cooler than list and argl would have
sounded like a strangled cry for help. We are Unix system
programmers and we do not cry for “help”. We make other people cry
for help.

Moving On

$ ./myprog foo bar -x baz

If myprog is implemented in C, argc will have the value 5 and
argv will be an array of pointers to characters with five entries
(don’t worry if that sounds super-technical, it’s a list of five
strings). The first entry in the vector, argv[0], will be the
name of the program. The rest of argv will contain the arguments:

   argv[0] == "./myprog"
   argv[1] == "foo"
   argv[2] == "bar"
   argv[3] == "-x"
   argv[4] == "baz"

   /* Note: not valid C */
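
If you would rather poke at this from Python, here is a minimal sketch of the same split. It assumes that shlex.split from the standard library is a close enough stand-in for the word splitting a POSIX shell does before your program ever sees its arguments; the names command_line, argv, and argc are just illustrative.

```python
import shlex

# The command line from the example above, as the user would type it.
command_line = "./myprog foo bar -x baz"

# shlex.split approximates the word splitting a POSIX shell performs
# before it hands the arguments to the program.
argv = shlex.split(command_line)
argc = len(argv)

print(argc)
for index, value in enumerate(argv):
    print(f"argv[{index}] == {value!r}")
```

The count comes out as five because the program name itself occupies the first slot.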

In C, we have many choices to handle the strings in argv. We could
loop over the array argv manually and interpret each of the
strings according to the needs of the program. This is relatively
easy, but leads to programs with wildly different interfaces as
different programmers have different ideas about what is “good”.

#include <stdio.h>

/* A simple C program that prints the contents of argv */

int main(int argc, char **argv) {
    int i;

    for(i=0; i<argc; i++)
      printf("%s\n", argv[i]);
}


Early Attempts to Standardize the Command-Line

The next weapon in the command-line arsenal is a C library
function called getopt, standardized by POSIX. This function allows the
programmer to parse switches (arguments preceded by a dash, like
-x) and optionally pair follow-on arguments with their
switches. Think about command invocations like /bin/ls -alSh:
getopt is the function originally used to parse that argument
string. Using getopt makes parsing the command-line pretty easy and
improves the user experience (UX).

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define OPTSTR "b:f:"

extern char *optarg;

int main(int argc, char **argv) {
    int opt;
    char *bar = NULL;
    char *foo = NULL;

    while((opt=getopt(argc, argv, OPTSTR)) != EOF)
       switch(opt) {
          case 'b':
              bar = optarg;
              break;
          case 'f':
              foo = optarg;
              break;
          case 'h':
          default:
              fprintf(stderr, "Huh? try again.\n");
              exit(-1);
              /* NOTREACHED */
       }
    printf("%s\n", foo ? foo : "Empty foo");
    printf("%s\n", bar ? bar : "Empty bar");
}

On a personal note, I wish Python had switches but that will
never happen.
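
For comparison, Python’s standard library ships a getopt module modeled on the C function. The sketch below is one way the C example above might be translated; the helper name parse is made up for illustration, and an if/elif chain stands in for the switch statement.

```python
import getopt

OPTSTR = "b:f:"  # same option string as the C example: -b and -f each take a value


def parse(args):
    """Return (foo, bar) parsed from a list of argument strings."""
    foo = None
    bar = None
    # getopt.getopt returns (list of (option, value) pairs, leftover args)
    opts, _remaining = getopt.getopt(args, OPTSTR)
    for opt, value in opts:
        if opt == "-b":
            bar = value
        elif opt == "-f":
            foo = value
    return foo, bar


foo, bar = parse(["-f", "spam", "-b", "eggs"])
print(foo or "Empty foo")
print(bar or "Empty bar")
```

An unrecognized switch raises getopt.GetoptError, which plays roughly the role of the default case in the C version.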

The GNU Generation

The GNU project came along and introduced longer format
arguments for their implementations of traditional Unix
command-line tools, things like --file-format foo. Of course we
real Unix programmers hated that because it was too much to type,
but like the dinosaurs we are, we lost because the users liked
the longer options. I never wrote any code using the GNU-style
option parsing, so no code example.

GNU-style tools still accepted short names like -f foo, which
had to be supported too. All of this choice resulted in more
work for the programmer who just wanted to know what the user
was asking for and get on with it. But the user got an even more
consistent UX: long and short format options and automatically
generated help that often kept the user from attempting to read
infamously difficult-to-parse manual pages (see ps for
a particularly egregious example).
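
To make the long-and-short pairing concrete without dragging out the C version, here is a hedged sketch using Python’s standard getopt module, which also understands long option names. The function name parse_format and the pairing of -f with --file-format are assumptions for illustration only.

```python
import getopt


def parse_format(args):
    """Return the file format given as -f VALUE or --file-format VALUE."""
    file_format = None
    # "f:" declares the short option; "file-format=" declares the long one.
    # The trailing ":" and "=" mean the option requires a value.
    opts, _remaining = getopt.getopt(args, "f:", ["file-format="])
    for opt, value in opts:
        if opt in ("-f", "--file-format"):
            file_format = value
    return file_format


print(parse_format(["--file-format", "foo"]))
print(parse_format(["--file-format=foo"]))
print(parse_format(["-f", "foo"]))
```

All three spellings land on the same value, which is exactly the extra bookkeeping the programmer inherited.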

But We’re Talking About Python?

You have now been exposed to enough (too much?) command-line
history to have some context about how to approach CLIs written
with our favorite language. Python gives us a similar number of
choices for command-line parsing: do it yourself, a
batteries-included option, and a plethora of third-party
options. Which one you choose depends on your particular
circumstances and needs.

First, Do It Yourself

We can get our program’s arguments from the sys module.

import sys

if __name__ == '__main__':
   for value in sys.argv:
       print(value)

You can see the C heritage in this short program. There’s a reference
to main and argv. The name argc is missing since the Python list
class incorporates the concept of length (or count) internally. If
you are writing a quick throw-away script, this is definitely your
go-to move.
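
Going one step past printing, a do-it-yourself parser might look something like the sketch below. The convention it uses (-x takes the next argument as its value, everything else is positional) is invented for this example, as is the name parse_by_hand; your own throw-away script will have its own rules, which is precisely how interfaces end up wildly different.

```python
import sys


def parse_by_hand(argv):
    """Split argv[1:] into switch values and positional arguments.

    Hypothetical convention for this sketch: '-x' takes the following
    argument as its value; everything else is positional.
    """
    switches = {}
    positional = []
    args = iter(argv[1:])  # skip argv[0], the program name
    for arg in args:
        if arg == "-x":
            # next() with a default avoids StopIteration when -x comes last
            switches["x"] = next(args, None)
        else:
            positional.append(arg)
    return switches, positional


if __name__ == "__main__":
    print(parse_by_hand(sys.argv))
```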

Batteries Included

There have been several implementations of argument parsing modules
in the Python standard library: getopt, optparse, and
most recently argparse. Argparse allows the programmer to
provide the user with a consistent and helpful UX, but like its
GNU antecedents it takes a lot of work and ‘boilerplate code’
on the part of the programmer to make it “good”.

from argparse import ArgumentParser

if __name__ == '__main__':

   argparser = ArgumentParser(description='My Cool Program')
   argparser.add_argument('--foo', '-f', help='A user supplied foo')
   argparser.add_argument('--bar', '-b', help='A user supplied bar')

   results = argparser.parse_args()
   print(results.foo, results.bar)

The payoff for the user is automatically generated help available
when the user invokes the program with --help. But what about the
advantage of batteries included? Sometimes the circumstances
of your project dictate that you have limited or no access to
third-party libraries, and you have to “make do” with the Python
standard library.
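
To see that generated help without leaving the interpreter, a small variation on the program above can help. This is only a sketch: the build_parser helper and the prog name myprog are assumptions added here so the output does not depend on how the script is launched.

```python
from argparse import ArgumentParser


def build_parser():
    """Build the same parser as above; prog is set so help text is stable."""
    parser = ArgumentParser(prog='myprog', description='My Cool Program')
    parser.add_argument('--foo', '-f', help='A user supplied foo')
    parser.add_argument('--bar', '-b', help='A user supplied bar')
    return parser


parser = build_parser()

# parse_args accepts an explicit list, handy for experiments and tests.
results = parser.parse_args(['--foo', 'spam', '-b', 'eggs'])
print(results.foo, results.bar)

# format_help() returns the text that --help would print.
print(parser.format_help())
```

Options the user leaves out come back as None unless you supply a default.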

A Modern Approach to CLIs

And then there was Click. The Click framework uses a
decorator approach to building command-line parsing. All of
a sudden it’s fun and easy to write a rich command-line
interface. Much of the complexity melts away under the cool and
futuristic use of decorators and users marvel at the automatic
support for keyword completion as well as contextual help. All
while writing less code than previous solutions. Any time you can
write less code and still get things done is a “win”. And we all
want “wins”.

import click

@click.command()
@click.option('-f', '--foo', default='foo', help='User supplied foo.')
@click.option('-b', '--bar', default='bar', help='User supplied bar.')
def echo(foo, bar):
    """My Cool Program

    It does stuff. Here is the documentation for it.
    """
    print(foo, bar)

if __name__ == '__main__':
    echo()

You can see some of the same boilerplate code in the @click.option
decorator as you saw with argparse. But the “work” of creating and
managing the argument parser has been abstracted away. Now our function
echo is called magically with the command-line arguments parsed and
the values assigned to the function arguments.

Adding arguments to a Click interface is as easy as adding another
decorator to the stack and adding the new argument to the function
definition.
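
As a rough illustration of that, the sketch below adds a third option to the echo command. The --count option is made up for this example, and the code assumes Click is installed; CliRunner from click.testing is used to invoke the command in-process so the result is easy to inspect.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option('-f', '--foo', default='foo', help='User supplied foo.')
@click.option('-b', '--bar', default='bar', help='User supplied bar.')
@click.option('-c', '--count', default=1, type=int, help='Times to repeat.')
def echo(foo, bar, count):
    """My Cool Program"""
    for _ in range(count):
        click.echo(f'{foo} {bar}')


# CliRunner invokes the command in-process with a supplied argument list.
runner = CliRunner()
result = runner.invoke(echo, ['--foo', 'spam', '--count', '2'])
print(result.output)
```

One new decorator, one new function parameter, and the option shows up in the --help output as well.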

But Wait, There’s More!

Built on top of Click, Typer is an even newer CLI
framework which combines the functionality of Click with modern
Python type hinting. One of the drawbacks of using Click is
the stack of decorators that have to be added to a function. CLI
arguments have to be specified in two places: the decorator and the
function argument list. Typer DRYs out CLI specifications,
resulting in code that’s easier to read and maintain.

import typer

app = typer.Typer()

@app.command()
def echo(foo: str = 'foo', bar: str = 'bar'):
    """My Cool Program

    It does stuff. Here is the documentation for it.
    """
    print(foo, bar)

if __name__ == '__main__':
    app()

Time to Start Writing Some Code

Which one of these approaches is right? It depends on your use
case. Are you writing a quick and dirty script that only you will
use? Use sys.argv directly and drive on. Do you need more
robust command-line parsing? Maybe argparse is enough. Maybe you
have lots of subcommands and complicated options and your team is
going to use it daily? Now you should definitely consider Click
or Typer. Part of the fun of being a programmer is hacking out
alternate implementations to see which one suits you best.

Finally, there are many third-party packages for parsing
command-line arguments in Python. I’ve only presented the ones I like
or have used. It is entirely fine and expected for you to like and/or use
different packages. My advice is to start with these and see where you
end up.

Go write something cool.

Erik

(Cover photo by Dylan McLeod on Unsplash)
