Prep work
We built this blog with Pelican; adding this line to pelicanconf.py enables an RSS feed:
FEED_RSS = 'feeds/all.rss.xml'
And voilà, after pushing this change we have our RSS feed.
Script (use PyPI!)
The script is on GitHub in our new blog repo.
No need to reinvent the wheel: PyPI (the Python Package Index) has so much good stuff, and feedparser is just what we need. It can take a remote URL as well as a local XML file, so you don't even need requests.
This single line parses the feed into a comprehensive data structure:
feed = feedparser.parse(xml)
Which you can then easily consume:
for article in feed['entries']:
    # ... filtering ...
    yield article
The only thing I had to add was some timestamp conversions/calculations to go x days back (the returned feed data has a convenient time.struct_time field).
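A minimal sketch of that "x days back" filter, using only the standard library; the function name is my own, and it assumes you pass in the `time.struct_time` that feedparser stores per entry (e.g. in `published_parsed`):

```python
import time
from datetime import datetime, timedelta

def within_days_back(published_parsed, days_back):
    """Return True if a time.struct_time falls within the last days_back days."""
    published = datetime.fromtimestamp(time.mktime(published_parsed))
    return published >= datetime.now() - timedelta(days=days_back)

# an entry published 3 days ago passes a 7-day window, 10 days ago does not
print(within_days_back((datetime.now() - timedelta(days=3)).timetuple(), 7))   # True
print(within_days_back((datetime.now() - timedelta(days=10)).timetuple(), 7))  # False
```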
Mail digest as txt/html in a cronjob
I left this to sendmail, which accepts a mail header (see here). So this is my weekly cronjob:
# html email
0 7 * * 6 cat pybites_header <(python3 /path/to/pybites_digest/digest.py 7 1) | sendmail -t
# text version for copy+paste into social media (no need to cat header file)
10 7 * * 6 python3 /path/to/pybites_digest/digest.py 7 | mailx -s "Weekly PyBites digest (txt ed)"
- First arg is "days back" (7 = one week); the 2nd arg turns on HTML output (1 = True).
- You might need to export PYTHONPATH=/path/to/python3.x/site-packages if you installed Python 3 in your $HOME on a shared hosting provider.
- The <( ) syntax (process substitution in bash) is a nice Unix way to feed in the output of a subprocess as if it were a file.
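A quick bash demonstration of that process substitution (the echo strings here are placeholders for the header file and the digest script output in the cronjob above):

```shell
# <(cmd) expands to a file name (e.g. /dev/fd/63) whose contents are
# cmd's stdout, so cat can concatenate both "files" into one stream
cat <(echo "mail header goes first") <(echo "then the digest body")
```

This is why the cronjob can prepend the static mail header to the script's output without ever writing a temporary file.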
Don’t miss any post
If you want to receive these weekly digests please subscribe to our blog or join our FB group.