Code Challenge Pilot - calculate total time JS course

Julian, Wed 04 January 2017, Challenges

beginners, code, codechallenges, learning, python

Bob and I thought it'd be interesting to do some code challenges. That is, Bob specifies the challenge and I complete it. Bob then goes through my code and makes any necessary edits/improvements to make it more Pythonic.

This will not only improve my Python and his code review skills but should also (hopefully!) provide you with something interesting, or at least entertaining, to read.

Feel free to give any feedback or improvements of your own in the comments below!

The Challenge

Bob discovered a free, online JavaScript course that he felt would be useful to us. On creating an account you're faced with the course content list below.

JS Course Content Listing

The problem is that while each module/video displays its own duration, there's no course total time listed anywhere.

Enter the Challenge: Create a web scraper that parses the page and then calculates the total course time.

(My final code can be located here). Update: our code review is here.

Limitations and Complications

  1. The main content page is behind a login. How the heck was I supposed to automate a scraper to log into the site with my creds and then pull the page?

  2. I manually right-clicked and selected 'Save As' (on Windows) to save the page as an HTML file, but when I tried to parse the file with BeautifulSoup I consistently hit an error.

The Setup

I initially wanted to use BeautifulSoup for this but as I kept hitting the aforementioned error and was running out of time (sleep!) I decided to keep it simple, albeit a little manual.

Key Moments and Challenges

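import re

time_regex = re.compile(r'\(\d+:\d+\)')  # Creating the regex, matches e.g. (12:34)

# Read in the HTML file and search it using my time regex
def search_file(file):
    # Minimal sketch of the body: read the whole file and return every match
    with open(file) as f:
        return time_regex.findall(f.read())

# Strip out the brackets and the colon to calculate the mins and seconds
def time_calculation(durations):
    total_seconds = 0  # running total (sketch; assumes a total in seconds)
    # For loop to strip brackets/colon and assign the mins/seconds
    for i in range(len(durations)):
        minutes, seconds = durations[i].strip('()').split(':')
        total_seconds += int(minutes) * 60 + int(seconds)
    return total_seconds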

Result

The program eventually worked! I was able to calculate that the course took roughly 6.8 hours to complete.
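To illustrate the arithmetic behind a figure like that, here is a minimal sketch of going from "(minutes:seconds)" strings to a total in hours. It is not the code from my repo: the helper names total_seconds and as_hours, the capture groups added to the regex, and the sample text are all assumptions made up for this example.

```python
import re

# Same idea as the regex in the post, with capture groups added (an assumption)
# so that findall() returns (minutes, seconds) pairs directly.
TIME_REGEX = re.compile(r'\((\d+):(\d+)\)')


def total_seconds(text):
    """Sum every (minutes:seconds) duration found in text, in seconds."""
    return sum(int(m) * 60 + int(s) for m, s in TIME_REGEX.findall(text))


def as_hours(seconds):
    """Convert seconds to hours, rounded to one decimal place."""
    return round(seconds / 3600, 1)


# Hypothetical sample text, not the real course listing
sample = "Intro (10:30) Variables (45:15) Functions (64:15)"

seconds = total_seconds(sample)
hours, remainder = divmod(seconds, 3600)
minutes, secs = divmod(remainder, 60)
print(f"{seconds} seconds = {hours}h {minutes}m {secs}s (~{as_hours(seconds)} hrs)")
```

Summing everything in seconds first and converting once at the end avoids having to carry overflow from seconds into minutes by hand.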

Thoughts and Changes

Conclusion and Next Step

As annoyed as I got at certain points, I actually enjoyed this. Problem-wise it's as simple as they come, but it forced me to revisit the basics of regex and string manipulation.

As I write this I'm getting GitHub commit notifications of Bob refactoring and commenting, so I know he's hard at work making my code as Pythonic as possible. Tomorrow's post will be his feedback... go easy on me brother!

(Again, find my code for this challenge here. Update: our code review is here).

Keep Calm and Code in Python!

-- Julian
