Generating Beautiful Code Snippets with Carbon and Selenium

Bob, Tue 26 February 2019, Tools

automation, BeautifulSoup, carbon, collections, pprint, random, requests, Selenium, tips, urllib

Did you notice our Python tips lately? They look a lot sexier, don't they? That's thanks to Carbon, which lets you create beautiful images of your source code. As much as I love its interface though, what if we could automate the process of generating the image?

That's what we did, and posting new tips to Twitter is now a breeze. In this article I will show you how, using a bit of BeautifulSoup and Selenium. Enjoy!

Getting Ready

If you want to follow along, the final code is here.

First make a virtual environment and pip install the requirements. Also make sure you have the chromedriver in your PATH.

$ mkdir carbon && cd $_
$ alias pvenv
alias pvenv='/Library/Frameworks/Python.framework/Versions/3.7/bin/python3.7 -m venv venv && source venv/bin/activate'
$ pvenv
(venv) $ python -V
Python 3.7.0
(venv) $ pip install -r requirements.txt
Successfully installed beautifulsoup4-4.7.1 bs4-0.0.1 certifi-2018.11.29 chardet-3.0.4 idna-2.8 requests-2.21.0 selenium-3.141.0 soupsieve-1.8 urllib3-1.24.1
(venv) $ ls ~/bin/chromedriver

Introducing our Platform Tips

At the time of this writing we have 92 tips on our platform:

[Image: pybites tips page]

Let's inspect the tip html we need to parse with BeautifulSoup:

[Image: html of each tip]

Each tip is wrapped in a tr where the tip text is in a blockquote and the code in pre tags.

We also want to know if a tip has already been shared, which we can tell by inspecting the Twitter link. In this example the href contains pybites/status, which means it has.

[Image: check if there is a twitter link href]
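
The check on the Twitter link can be sketched in a few lines. This is only an illustration: the helper name has_been_tweeted and both example links are made up, and the extra guard for a missing href is an assumption about what the page might contain.

```python
PYBITES_HAS_TWEETED = 'pybites/status'


def has_been_tweeted(share_link):
    """Return True if the share link points at an existing @pybites tweet.

    A missing href (None or '') counts as "not tweeted yet".
    """
    return bool(share_link) and PYBITES_HAS_TWEETED in share_link


# Hypothetical links, only for illustration
print(has_been_tweeted('https://twitter.com/pybites/status/1234567890'))   # True
print(has_been_tweeted('https://twitter.com/intent/tweet?text=some+tip'))  # False
print(has_been_tweeted(None))                                              # False
```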

By the way, I didn't realize it at the time of coding this, but we did make an API GET endpoint some time ago. If there is an API, it's preferable to use that to retrieve data. However, knowing BeautifulSoup might come in handy too :)

I will do a follow-up post on how to convert this into a Tips API ...

Bootstrapping the script

Before scraping the tips page, let's define the overall structure of the script:

from collections import namedtuple
from random import choice
import sys
from time import sleep
import urllib.parse

from bs4 import BeautifulSoup
import requests
from selenium import webdriver

PYBITES_HAS_TWEETED = 'pybites/status'
CARBON = '{code}'
TWEET_BTN_CLASS = 'jsx-2739697134'
TWEET = '''{tip} {src}

🐍 Check out more @pybites tips at 💡

(image built with @carbon_app)

{img}
'''

Tip = namedtuple('Tip', 'tip code src')

def retrieve_tips():
    """Grab and parse all tips from TIPS_PAGE,
    returning a dict of keys: tip IDs and values: Tip namedtuples
    """

def get_carbon_image(tip):
    """Visit Carbon with the code, click the Tweet button
    and grab and return the Twitter picture url
    """

if __name__ == '__main__':
    tips = retrieve_tips()

    # use the tip ID passed on the command line, else pick a random one
    if len(sys.argv) == 2:
        tip_id = int(sys.argv[1])
    else:
        tip_id = choice(list(tips.keys()))

    tip = tips.get(tip_id)
    if tip is None:
        print(f'Could not retrieve tip ID {tip_id}')
        sys.exit(1)

    src = tip.src and f' - see {tip.src}' or ''
    img = get_carbon_image(tip)

    tweet = TWEET.format(tip=tip.tip, src=src, img=img)
    print(tweet)
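
To make the selection logic easier to try out without the rest of the script, here is a small sketch. The helper pick_tip_id and the two sample tips are made up for illustration; the real tips come from retrieve_tips.

```python
from collections import namedtuple
from random import choice

Tip = namedtuple('Tip', 'tip code src')

# Made-up sample data, only for illustration
tips = {
    7: Tip('Use enumerate to get index and item', 'for i, x in enumerate(xs): ...', ''),
    9: Tip('Swap two variables in one line', 'a, b = b, a', ''),
}


def pick_tip_id(argv, tips):
    """Return the tip ID given on the command line, else a random one."""
    if len(argv) == 2:
        return int(argv[1])
    return choice(list(tips))


print(pick_tip_id(['script.py', '7'], tips))     # 7
print(pick_tip_id(['script.py'], tips) in tips)  # True
print(tips.get(42))                              # None, an unknown ID is not in the dict
```

Using dict.get for the lookup means an unknown ID yields None instead of raising a KeyError, which is what the tip is None check relies on.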

OK step by step:

Parsing the tips

At this point the code will at best throw an AttributeError, because retrieve_tips does not return a tips dict yet. So let's write retrieve_tips to build and return one:

def retrieve_tips():
    """Grab and parse all tips from TIPS_PAGE,
    returning a dict of keys: tip IDs and values: Tip namedtuples
    """

First we need to retrieve the page with requests:

    html = requests.get(TIPS_PAGE)

We then instantiate a BeautifulSoup object, passing it the response text and a parser:

    soup = BeautifulSoup(html.text, 'html.parser')

As we saw, all the tips are in a table, each one in a table row (tr), so let's get all of them:

    trs = soup.find_all("tr")

Next let's use a data structure to store the tips. At first I used a list but later I wanted to index by tip ID, so a dict turned out to be more appropriate:

    tips = {}

Next let's loop through the rows, creating Tip namedtuples and adding them to our tips dict:

    for tr in trs:
        tds = tr.find_all("td")
        id_ = int(tds[0].text.strip().rstrip('.'))
        tip_html = tds[1]

        links = tip_html.find_all("a", class_="left")
        share_link = links[0].attrs.get('href')

        pre = tip_html.find("pre")
        code = pre and pre.text or ''

        # skip if already tweeted or if there is no code in the tip
        if PYBITES_HAS_TWEETED in share_link or not code:
            continue

        tip = tip_html.find("blockquote").text
        src = len(links) > 1 and links[1].attrs.get('href') or ''

        tips[id_] = Tip(tip, code, src)

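Step by step:

  1. Grab the cells (td) of the row: the first holds the tip ID (we strip the trailing dot and convert it to an int), the second the tip html.
  2. Collect the links with class left; the first one is the share link.
  3. Take the code from the pre tag, if there is one.
  4. Skip the tip if it has already been tweeted (PYBITES_HAS_TWEETED is in the share link) or if it has no code.
  5. Get the tip text from the blockquote and, if there is a second link, use it as the source.
  6. Store a Tip namedtuple in the tips dict, keyed by tip ID.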

Lastly we return the tips dict:

    return tips

Let's see if this works using pprint:

[Image: adding a pprint]

Running this it outputs:

[Image: getting a tips dict]
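
As a rough idea of what such a pprint call looks like, here is a minimal sketch with a single made-up tip standing in for the dict that retrieve_tips returns:

```python
from collections import namedtuple
from pprint import pformat, pprint

Tip = namedtuple('Tip', 'tip code src')

# One made-up tip, only for illustration
tips = {
    1: Tip(tip='Swap two variables in one line', code='a, b = b, a', src=''),
}

# pprint prints the dict with one key per line once it no longer fits the width
pprint(tips)

# pformat returns the same text as a string, handy for logging
text = pformat(tips)
```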

Great. Let's make a carbon image next.

Beautiful images of your source code

Meet carbon:

[Image: carbon home]

It allows you to add code, choose a language and configure other settings, then generate the image and/or tweet it out. It's really nice!

While playing with the interface I found that clicking the Tweet button generates a shareable picture hosted on Twitter. For example, clicking the Tweet button gives me this popup:

[Image: clicking tweet button]

And that Twitter link shows the generated code snippet image:

[Image: resulting carbon image]

We can automate this using Selenium to click the Tweet button, capturing the generated image link. This is why I defined the TWEET_BTN_CLASS constant which is the class set on this button.

Use Selenium to create tip code image

Let's write the second function get_carbon_image:

def get_carbon_image(tip):
    """Visit Carbon with the code, click the Tweet button
    and grab and return the Twitter picture url
    """

First we need to encode (replace special characters) the tip code snippet. quote_plus (from urllib.parse) also replaces spaces by plus signs, as required for quoting HTML form values when building up a query string to go into a URL (see docs).

    code = urllib.parse.quote_plus(tip.code)
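
To see what quote_plus does to a snippet, here is a short sketch. The URL template is a placeholder on example.com, not the real Carbon URL, and the snippet is made up.

```python
from urllib.parse import quote_plus, unquote_plus

# Placeholder template for illustration only, not the real Carbon URL
URL_TEMPLATE = 'https://example.com/?code={code}'

snippet = "print('hello world')"
encoded = quote_plus(snippet)
print(encoded)  # print%28%27hello+world%27%29

# The encoded value can now be dropped safely into a query string
url = URL_TEMPLATE.format(code=encoded)
print(url)  # https://example.com/?code=print%28%27hello+world%27%29

# unquote_plus reverses the encoding
assert unquote_plus(encoded) == snippet
```

Newlines in multi-line snippets are encoded as well (as %0A), so the whole tip code fits in a single URL parameter.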

With that done we define the full url:

    url = CARBON.format(code=code)

We then start the Chromedriver and load the url. Unlike last time I am not going to use headless mode here, because I'd actually like to see what Selenium is doing:

    driver = webdriver.Chrome()
    driver.get(url)

Here we locate the button with the aforementioned TWEET_BTN_CLASS (jsx-2739697134) and click on it:

    driver.find_element_by_class_name(TWEET_BTN_CLASS).click()


Trial and error taught me that this might take a bit so I use sleep:

    sleep(5)  # this might take a bit
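
A fixed sleep is the simplest option. If the wait ever turns out to be too short or needlessly long, one alternative is to poll until the popup is there. The helper below is a hypothetical sketch, not part of the script; Selenium also ships its own WebDriverWait for this purpose.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Call `condition` repeatedly until it returns something truthy.

    Returns that truthy value, or raises TimeoutError once `timeout`
    seconds have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f'condition not met within {timeout} seconds')
        time.sleep(interval)


# With a running driver this could replace the sleep (assumption, untested here):
# wait_until(lambda: len(driver.window_handles) > 1)
```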


Retrieve the image from popup

And here is the tricky part. The Tweet button opened a popup but the driver is still on the main browser page window (see seconds 15 and 41 of the demo below).

You can toggle windows though using driver.switch_to.window:

    window_handles = driver.window_handles
    driver.switch_to.window(window_handles[1])
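
Picking the popup by position works as long as there are exactly two windows. A slightly more defensive variant is to look for the handle that is not the main window. The helper find_popup_handle is hypothetical, and the handle strings below are made up; real handles are opaque strings provided by the driver.

```python
def find_popup_handle(window_handles, main_handle):
    """Return the first window handle that is not the main window, or None."""
    for handle in window_handles:
        if handle != main_handle:
            return handle
    return None


# Made-up handles, only for illustration
print(find_popup_handle(['main-window', 'popup-window'], 'main-window'))  # popup-window
print(find_popup_handle(['main-window'], 'main-window'))                  # None

# With a running driver (assumption, untested here):
# popup = find_popup_handle(driver.window_handles, driver.current_window_handle)
# driver.switch_to.window(popup)
```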

Now I am on the Twitter popup window and can target the element with ID status to grab the image URL from it:

    status = driver.find_element_by_id('status')
    img = status.text.split(' ')[-1]
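
The last step simply takes the final token of the tweet text. Here is that idea as a standalone sketch; the function name and the sample status text are made up, since the exact text Carbon pre-fills is not shown here.

```python
def extract_image_url(status_text):
    """Return the last whitespace-separated token of the text, or '' if empty."""
    tokens = status_text.split()
    return tokens[-1] if tokens else ''


# Made-up status text, only for illustration
sample = 'Created with Carbon pic.twitter.com/abc123\n'
print(extract_image_url(sample))  # pic.twitter.com/abc123
print(repr(extract_image_url('')))  # ''
```

Calling split() without an argument splits on any whitespace, so a trailing newline does not end up in the result, whereas split(' ') would keep it attached to the last token.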

Finally I quit the driver (this closes the browser) and return the image string:

    driver.quit()
    return img

See it in action

Here you can see this automation script in action, generating an image from a random tip as well as from a specific tip ID:

And voilà: two new tips I could post to our Twitter (here and here).

Note that after manually tweeting it out as @pybites, we set the obtained tweet URL on the tip (in the DB) so it's not selected upon next run (the if PYBITES_HAS_TWEETED in share_link check above).

Room for improvement

Here are some things we can do to take it to the next level:

  1. Auto-post the tweet to Twitter (we already have the code for this).
  2. Make a Tips API (pending article):
    • allow GET to retrieve a tip and POST to receive new ones,
    • do a PUT request on the tip with the tweet link after running this script and/or auto-posting to Twitter (1.)

Feel free to PR any of this here.

I hope you enjoyed this and it inspired you to build your own automation scripts. To learn how to run Selenium in headless mode on Heroku, check out our article from last week.

Feel free to share more Python tips on our platform.

Question: what would you like us to write about more? You can drop us an email or brainstorm with us and our amazing community on our Slack. We do accept guest posts!

Keep Calm and Code in Python!

-- Bob
