How To Create A Nice-looking HTML Page Of Your Kindle Book Highlights (notes)

Kindle notes

I was looking for an effective way to organize my Kindle highlights. I started by parsing the Kindle’s My Clippings.txt file. However, I didn’t have much luck with existing PyPI modules, and it is a bit cumbersome to always have to copy the file manually via USB cable.

Starting point: Cloud + Bookcision

Then I found a much better starting point: the cloud, at https://kindle.amazon.com. Granted, this only works for books purchased for Kindle, but Amazon’s Whispersync makes it really convenient. Also, the Kindle site lets you filter and adjust your highlights and notes before exporting.

For export I use the nice Bookcision JS bookmarklet which, when used in Chrome, lets you download the highlights in JSON format.

JSON => HTML

I wrote a script to convert the Bookcision JSON download into a static HTML page (for blog use, inspired by Sivers).

Code is here.

Some things to note:

Use json.loads() on the file’s contents to convert the JSON into a dict:

import json

def load_json(json_file):
    with open(json_file) as f:
        # json.loads() expects a string, so read the file first
        return json.loads(f.read())
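To make the expected input concrete, here is a minimal sketch that writes a sample file and reads it back, this time with json.load(), which parses straight from the file object. The sample values are made up. Only the keys highlights, text, note, location, url and value come from the script shown in this post; the title key is an assumption, and the real Bookcision export may contain more fields.

```python
import json
import os
import tempfile

# Hypothetical sample shaped like the keys the script reads.
# The "title" key and all values are made up for illustration.
SAMPLE = {
    "title": "Some Book",
    "highlights": [
        {
            "text": "A highlighted sentence.",
            "note": "",
            "location": {"url": "https://example.com/loc/42", "value": 42},
        }
    ],
}


def load_json(json_file):
    """Parse a JSON file into a dict; json.load() reads from the file object."""
    with open(json_file, encoding="utf-8") as f:
        return json.load(f)


# Write the sample to a temporary file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(SAMPLE, f)
    content = load_json(path)

print(content["highlights"][0]["location"]["value"])
```

json.load(f) and json.loads(f.read()) give the same result; the first just saves a step.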

Template strings: in templates.py, PAGE defines the whole page; I use embedded CSS to make this a standalone solution. QUOTE defines a list item (one highlight). Placeholders are marked with $, for example $title and $author. In the main script I can substitute these placeholders with values from a dict:

def get_highlights(highlights):
    for hl in highlights:
        yield QUOTE.safe_substitute({
            'text' : hl['text'],
            'note' : ' / note: ' + hl['note'] if hl['note'] else '',
            'url' : hl['location']['url'],
            'location': hl['location']['value'],
        })
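As a side note on the template mechanics, here is a rough sketch of what such templates could look like with string.Template from the standard library. The markup below is a hypothetical stand-in, not the actual content of templates.py, and the sample values are made up.

```python
from string import Template

# Hypothetical stand-ins for the PAGE and QUOTE templates in templates.py;
# the real markup and CSS are not shown in this post.
PAGE = Template("""<html>
<head><title>$title</title></head>
<body>
<h1>$title by $author</h1>
<ul>
$content
</ul>
</body>
</html>""")

QUOTE = Template('<li>$text$note (<a href="$url">$location</a>)</li>')

item = QUOTE.safe_substitute({
    "text": "A highlighted sentence.",
    "note": "",
    "url": "https://example.com/loc/42",
    "location": 42,
})
print(item)

# safe_substitute() leaves unknown placeholders in place instead of
# raising KeyError, which substitute() would do.
partial = PAGE.safe_substitute({"title": "Some Book", "content": item})
print("$author" in partial)
```

Using safe_substitute() means a missing key never crashes the script, at the cost of a stray $placeholder possibly ending up in the output.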

Note that the ‘yield’ makes get_highlights() a generator. If this is new to you, check out this SO thread about Iterables -> Generators -> Yield. [1]

Use list() to consume all of the generator’s values in one go:

highlights = get_highlights(content['highlights'])
...
...
    'content': '\n'.join(list(highlights)),
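Two details about consuming a generator are easy to miss, so here is a small sketch. It uses a toy generator, get_numbers(), as a hypothetical stand-in for get_highlights(). It shows that str.join() accepts any iterable (so the list() call is optional) and that a generator is empty after one pass.

```python
def get_numbers():
    """Toy generator standing in for get_highlights()."""
    for n in range(3):
        yield f"<li>{n}</li>"


items = get_numbers()

# str.join() accepts any iterable, so it consumes the generator directly.
joined = "\n".join(items)
print(joined)

# A generator can only be walked once: a second pass yields nothing.
leftover = list(items)
print(leftover)
```

If you need the values more than once, store list(items) in a variable first and reuse that.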

You can give the script one or more JSON files simply by using a slice on sys.argv:
```
for json_file in sys.argv[1:]:
    ...
```

So you can batch process JSON downloads:

$ ls *json
anything-you-want.json  arnold.json     choose-yourself.json    the-circle.json

$ python kindle_json2html.py *json
anything-you-want.html created
arnold.html created
choose-yourself.html created
the-circle.html created
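The loop over sys.argv[1:] combined with the output naming seen above could be sketched roughly like this. The helper output_name() is hypothetical, and I am assuming the HTML file simply takes the input's name with an .html suffix; the actual script may do this differently.

```python
import sys
from pathlib import Path


def output_name(json_file):
    """Return the HTML file name for a JSON input (same stem, .html suffix)."""
    return Path(json_file).with_suffix(".html").name


def main(args):
    for json_file in args:
        # The real script would load the JSON and write the page here.
        print(f"{output_name(json_file)} created")


if __name__ == "__main__":
    # sys.argv[0] is the script name, so the slice keeps only the file arguments.
    main(sys.argv[1:])
```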

Example

Here is what the output looks like:

[Image: resulting HTML page]

As the HTML contains everything, you can just copy it to your blog (example).

Keep Calm and Code in Python!

— Bob

[1] Generators save memory by not materializing all the values of an iterable at once, which can improve performance. Here we don’t really need that, yet I still find the yield syntax more elegant (it’s shorter) than building and returning a local collection (list).

By Bob Belderbos on 26 December 2016
