Introduction
As a Python developer I want to stay up to date with trends and useful tips and tricks.
Of course there are great newsletters like Pycoders, but those are already hitting my inbox.
In this article, let’s look at Planet Python, an aggregation site that indexes a lot of good Python blog feeds. Keeping an eye on that resource will be useful.
We’ll build a tool that parses its RSS feed and sends a daily email with new articles.
We will use SendGrid for the emailing and GitHub Actions to run it automatically.
Planet Python parse and email script
Here is the script I came up with:
from datetime import datetime, timedelta, UTC
from typing import NamedTuple

import feedparser
from dateutil.parser import parse
from decouple import config
from sendgrid import SendGridAPIClient
from sendgrid.helpers.mail import Mail

SENDGRID_API_KEY = config("SENDGRID_API_KEY")
FROM_EMAIL = config("FROM_EMAIL")
TO_EMAIL = config("TO_EMAIL")
PLANET_PYTHON_FEED = "https://planetpython.org/rss20.xml"
ONE_DAY = 1


class Article(NamedTuple):
    title: str
    link: str
    publish_date: str


def fetch_articles() -> list[Article]:
    feed = feedparser.parse(PLANET_PYTHON_FEED)
    return [
        Article(entry.title, entry.link, entry.published)
        for entry in feed.entries
    ]


def filter_recent_articles(
    articles: list[Article], days: int = ONE_DAY
) -> list[Article]:
    recent_articles = []
    now = datetime.now(UTC)
    for article in articles:
        publish_date = parse(article.publish_date)
        if now - publish_date <= timedelta(days):
            recent_articles.append(article)
    return recent_articles


def send_email(
    from_email: str, to_email: str, subject: str, content: str
) -> None:
    message = Mail(
        from_email=from_email,
        to_emails=to_email,
        subject=subject,
        html_content=content,
    )
    try:
        sg = SendGridAPIClient(SENDGRID_API_KEY)
        response = sg.send(message)
        print(f"Email sent with status code: {response.status_code}")
    except Exception as e:
        print(e)


def main() -> None:
    articles = fetch_articles()
    recent_articles = filter_recent_articles(articles)
    if len(recent_articles) == 0:
        print("No new articles found")
        return
    subject = "New Planet Python articles"

    def _create_link(article: Article) -> str:
        return f"<a href='{article.link}'>{article.title}</a>"

    body = "<br>".join(
        [_create_link(article) for article in recent_articles]
    )
    send_email(FROM_EMAIL, TO_EMAIL, subject, body)


if __name__ == "__main__":
    main()
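To sanity-check the date filtering without fetching the live feed, here is a minimal sketch that runs the same comparison on two hand-made entries. It makes a few assumptions that are not part of the script above: the helper name filter_recent, the example.com links and the fixed "now" are made up for illustration, and it swaps dateutil for the standard library's email.utils.parsedate_to_datetime, which handles the RFC 2822 style dates that RSS 2.0 feeds normally use.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import NamedTuple


class Article(NamedTuple):
    title: str
    link: str
    publish_date: str


def filter_recent(articles, now, days=1):
    """Keep articles published at most `days` days before `now`."""
    cutoff = timedelta(days=days)
    return [
        article
        for article in articles
        if now - parsedate_to_datetime(article.publish_date) <= cutoff
    ]


# Fixed reference time (an assumption) so the result is reproducible.
now = datetime(2024, 3, 1, 7, 0, tzinfo=timezone.utc)
articles = [
    Article(
        "Fresh post",
        "https://example.com/fresh",
        format_datetime(now - timedelta(hours=3)),
    ),
    Article(
        "Old post",
        "https://example.com/old",
        format_datetime(now - timedelta(days=2)),
    ),
]

recent = filter_recent(articles, now)
print([article.title for article in recent])  # → ['Fresh post']
```

Passing "now" in as an argument, instead of calling datetime.now() inside the function, is what makes this kind of check repeatable.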
Explanation:
Environment Configuration: I use decouple (python-decouple package, related article here) to securely manage environment variables, including the SendGrid API key and email addresses for both sender and receiver.
RSS Feed Parsing: I use feedparser to fetch and parse the RSS feed from Planet Python, extracting articles’ titles, links, and published dates.
Article Data Structure: I use a typed NamedTuple, Article, to store essential information about each article, including its title, link, and publish date.
Fetching Recent Articles: I created a function fetch_articles that retrieves all articles from the RSS feed and returns a list of Article instances.
Filtering by Publish Date: The filter_recent_articles function filters articles to include only those published within the last day (or a specified number of days), using dateutil.parser to handle date parsing and datetime for date comparison. Note that I later learned that you can use datetime.UTC instead of ZoneInfo, saving the import.
Email Preparation and Sending: The script uses the SendGrid API through the sendgrid library to compose and send an HTML email. The email contains links to recent articles, formatted using a helper function _create_link.
Dynamic Email Content: I create the email body dynamically by converting the list of recent articles into HTML links.
By the way, my first go was text only, but that looked ugly.
Error Handling in Email Sending: The script wraps the email sending in a try-except block that catches and prints any errors, which makes failures easier to debug.
While our script uses a broad Exception catch for simplicity, more granular exception handling can make your application more robust and easier to debug. In Python, different exceptions are raised for different reasons, such as ValueError for an invalid argument, or OSError (IOError is an alias for it in Python 3) for problems accessing a file. By catching specific exceptions you can provide more detailed error messages, recover gracefully from certain errors, or retry failed operations under specific conditions. For those interested in diving deeper into this topic, the Python documentation on Built-in Exceptions offers a comprehensive overview of the different exception types available. You can also check out our 7 Tips to Improve Your Error Handling in Python article.
Main Workflow Execution: The main function orchestrates the script’s workflow: fetching articles, filtering for recency, composing, and sending the email notification if new articles are found.
Conditional Execution Check: Before sending an email, the script checks whether there are any new articles. If none are found, it prints a message and exits, avoiding unnecessary emails (the printed message still ends up in the GitHub Actions logs).
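To make the point about granular exception handling a bit more concrete, here is a minimal sketch of retrying only on errors that might be temporary. It is an illustration under assumptions, not how the script or the sendgrid library behaves: send_with_retry and flaky_send are hypothetical names, and the choice of ConnectionError, TimeoutError and ValueError is only an example, since which exceptions the SendGrid client raises is not covered here.

```python
import time
from typing import Callable, Optional


def send_with_retry(
    send: Callable[[], int], attempts: int = 3, delay: float = 0.0
) -> Optional[int]:
    """Call `send`, retrying only on errors that may be temporary."""
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except (ConnectionError, TimeoutError) as exc:
            # Possibly transient: report it and try again after a pause.
            print(f"Attempt {attempt}/{attempts} failed: {exc!r}")
            if attempt < attempts:
                time.sleep(delay)
        except ValueError as exc:
            # Bad input will not fix itself, so give up immediately.
            print(f"Not retrying, invalid input: {exc!r}")
            return None
    return None


# Simulated sender (hypothetical): fails twice, then returns status 202.
calls = {"count": 0}


def flaky_send() -> int:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary network problem")
    return 202


status = send_with_retry(flaky_send)
print(status, calls["count"])  # → 202 3
```

In a real run you would probably pass a small non-zero delay, and let unexpected exception types propagate so the GitHub Actions job is marked as failed.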
Hooking it up with a GitHub Action
Here is the GitHub workflow I created in .github/workflows (the required target directory structure for GitHub Actions):
name: Send Email Notification

on:
  schedule:
    - cron: '0 7 * * *'  # 7:00 UTC = 9 AM CEST (8 AM CET)
  workflow_dispatch:  # also enable on button click

jobs:
  send_email:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout code
      uses: actions/checkout@v2
    - name: Set up Python
      uses: actions/setup-python@v2
      with:
        python-version: '3.11'
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -r requirements.txt
    - name: Send email
      env:
        SENDGRID_API_KEY: ${{ secrets.SENDGRID_API_KEY }}
        FROM_EMAIL: ${{ secrets.FROM_EMAIL }}
        TO_EMAIL: ${{ secrets.TO_EMAIL }}
      run: python script.py
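One thing to keep in mind about the schedule: GitHub evaluates the cron expression in UTC, so the local time at which the email arrives depends on the UTC offset in effect. The small sketch below converts the 07:00 UTC run time using fixed offsets (CET is UTC+1, CEST is UTC+2); the date is arbitrary and only there to build a datetime.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset zones: Central European (winter) Time and Summer Time.
CET = timezone(timedelta(hours=1), "CET")
CEST = timezone(timedelta(hours=2), "CEST")

# The cron expression '0 7 * * *' fires at 07:00 UTC; the date is arbitrary.
run_utc = datetime(2024, 1, 15, 7, 0, tzinfo=timezone.utc)

print(run_utc.astimezone(CET).strftime("%H:%M %Z"))   # → 08:00 CET
print(run_utc.astimezone(CEST).strftime("%H:%M %Z"))  # → 09:00 CEST
```

So the same cron line delivers at 8 AM local time in winter and 9 AM in summer for someone in Central Europe.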
Explanation:
Scheduled Trigger: The workflow is configured to run automatically every day at 7:00 AM UTC (which corresponds to 9:00 AM CEST, or 8:00 AM CET in winter), using the cron syntax in the schedule event. This ensures the email notifications are sent out consistently without manual intervention.
Manual Trigger Option: Besides the scheduled trigger, the workflow can also be manually triggered using the workflow_dispatch event. This flexibility allows users to run the workflow at any time, independent of the scheduled time, by clicking a button in the GitHub Actions interface.
Environment Setup: The workflow runs on the latest Ubuntu runner provided by GitHub Actions (ubuntu-latest). It begins by checking out the repository code using actions/checkout@v2 and then sets up the specified version of Python (3.11) with actions/setup-python@v2, preparing the environment for script execution.
Dependency Installation: To ensure all required Python libraries are available, the workflow includes a step to upgrade pip and install dependencies from the requirements.txt file. This step is crucial for the Python script to run successfully, as it relies on external libraries such as feedparser, python-decouple, and sendgrid.
Email Sending Execution: The final step of the workflow executes the Python script (script.py) that sends out the email notifications. This step securely accesses the SendGrid API key and email addresses (for both sender and receiver) from GitHub Secrets (secrets.SENDGRID_API_KEY, secrets.FROM_EMAIL, and secrets.TO_EMAIL), ensuring sensitive information is kept secure and not exposed in the repository.
See GitHub’s documentation on how to work with secrets. So locally I work with a hidden .env file, and remotely I use secrets.
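Because the workflow passes the secrets to the script as plain environment variables, a quick fail-fast check can save a confusing run when one of them was never configured (as far as I know, a secret that is not set reaches the step as an empty string). The sketch below uses only the standard library; missing_settings and the demo values are hypothetical and not part of the script or of python-decouple.

```python
import os
from collections.abc import Mapping

# The three settings the script expects.
REQUIRED = ("SENDGRID_API_KEY", "FROM_EMAIL", "TO_EMAIL")


def missing_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required settings that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]


# Demo with a made-up environment instead of the real one.
demo_env = {
    "SENDGRID_API_KEY": "dummy-key",
    "FROM_EMAIL": "me@example.com",
    "TO_EMAIL": "",  # empty, as an unset secret would appear
}
print(missing_settings(demo_env))  # → ['TO_EMAIL']
```

Calling missing_settings() without arguments at the top of main and exiting with a clear message when the list is non-empty would be one way to use it.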
Result
And voilà: this was the email I received this morning with the Planet Python entries of the last 24 hours. 🎉
Conclusion
In this article I’ve shown you a way to parse Planet Python entries, filter on the latest ones (last 24 hours), and email the results, hooking it all up with GitHub Actions.
Not only will this workflow run once a day via its cron schedule, thanks to workflow_dispatch you can also run it manually from the Actions tab on the GitHub repo’s page. 😍
Through this process we’ve learned about the feedparser, python-decouple, and sendgrid Python libraries and how to manage environment config variables, both locally and on GitHub using its Secrets feature.
I hope this helps you automate more tasks and leverage GitHub Actions. 📈
Next steps
Check out the repo here. Feel free to contribute by opening issues and/or submitting code.
I am also happy to hear your thoughts in our community.