This week, each one of you has a homework assignment ... - Tyler Durden (Fight club)
A new week, more coding! In Part 2 of our Twitter data analysis we challenge you to find out how similar two tweeters are ...
Make a script that receives two command line args: user1 and user2
$ similar_tweeters.py bbelderbos pybites
# ... some index of similarity ...
Get the last n tweets of these users. You can use the code of Part 1.
Tokenize the words in the tweets, filtering out stop words, URLs, digits, punctuation, words that only occur once or are less than 3 characters (and/or other noise ...)
Extract the main subjects the users tweet about. You could use Gensim, an NLP package for Topic Modeling. However, feel free to take your own approach! We are dropping the helper template and external libs (requirements.txt) for this challenge; we'd love to see different approaches to this problem ...
Compare the subjects and come up with a similarity score.
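To make the steps above concrete, here is one possible starting point, not a reference solution. It swaps in plain word counts and cosine similarity instead of Gensim topic modeling, uses only the standard library, and assumes you already have each user's tweets as a list of strings (for example from your Part 1 code). The names `tokenize`, `cosine_similarity`, `STOP_WORDS` and the thresholds are illustrative choices of ours, and the stop-word list is deliberately tiny.

```python
import math
import re
import string
from collections import Counter

# Tiny illustrative stop-word list (an assumption; use a fuller list in practice).
STOP_WORDS = {"the", "and", "for", "that", "this", "with", "you", "are", "our", "your"}
URL_RE = re.compile(r"https?://\S+")


def tokenize(tweets, min_len=3, min_count=2):
    """Count the meaningful words in an iterable of tweet texts."""
    counts = Counter()
    for tweet in tweets:
        # Lowercase and drop URLs before splitting on whitespace.
        text = URL_RE.sub(" ", tweet.lower())
        for word in text.split():
            # Strip surrounding punctuation (this also removes a leading # or @).
            word = word.strip(string.punctuation)
            if len(word) < min_len or word in STOP_WORDS:
                continue
            # Skip anything containing a digit (years, counts, ...).
            if any(ch.isdigit() for ch in word):
                continue
            counts[word] += 1
    # Keep only words that occur at least min_count times.
    return Counter({w: c for w, c in counts.items() if c >= min_count})


def cosine_similarity(counts1, counts2):
    """Return a score from 0.0 (no shared words) to 1.0 (same word distribution)."""
    shared = set(counts1) & set(counts2)
    dot = sum(counts1[w] * counts2[w] for w in shared)
    norm1 = math.sqrt(sum(c * c for c in counts1.values()))
    norm2 = math.sqrt(sum(c * c for c in counts2.values()))
    if not norm1 or not norm2:
        return 0.0
    return dot / (norm1 * norm2)
```

Calling `cosine_similarity(tokenize(tweets_user1), tokenize(tweets_user2))` then gives a rough index of similarity. Raw word overlap is a crude stand-in for "subjects", so treat this as a baseline to compare your own approach against.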
Start coding by forking our challenges repo:
$ git clone https://github.com/pybites/challenges
If you already forked it, sync it:
# assuming using ssh key
$ git remote add upstream git@github.com:pybites/challenges.git
$ git fetch upstream
# if not on master:
$ git checkout master
$ git merge upstream/master
# ... no helper template for this challenge ...
Remember: there is no best solution, only learning more Python.
Enjoy! We're looking forward to reviewing our solutions and yours on Friday.
More background in our first challenge article.