Twitter Follower Finder (Tool)
== '''People to Follow Crawl''' ==
Revision as of 17:12, 6 October 2016
=== Description ===
This crawler takes as input the Twitter handle of a person we think posts similar content to us, or of an account we admire. It uses that account's information to find people we should consider following, in the following steps:
# Crawls the tweets of that user and notes, for each tweet, how many times a buzzword (entrepreneur((s)hip), research(ers), innovat(e)(ion)) appears.
# Composes a list of the best tweets (most buzzwords) among the account's most recent 50 tweets.
# Crawls the people who retweeted the tweets with the most buzzwords.
# Notes how many times a buzzword was used by each of the retweeters.
# Outputs a CSV file giving the username and a score (number of buzzwords) for each of those users.
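The buzzword-counting at the heart of steps 1, 2, and 4 can be sketched as plain Python. The function names below (<code>buzzword_score</code>, <code>best_tweets</code>) are hypothetical, not the crawler's actual functions; only the buzzword patterns come from the description above.

```python
import re

# Buzzword patterns as listed in the description:
# entrepreneur((s)hip), research(ers), innovat(e)(ion)
BUZZWORD_RE = re.compile(
    r"entrepreneur(?:s|ship)?|research(?:ers)?|innovat(?:e|ion)",
    re.IGNORECASE,
)

def buzzword_score(text):
    """Return how many buzzwords appear in one tweet's text."""
    return len(BUZZWORD_RE.findall(text))

def best_tweets(tweets, top_n=5):
    """Rank (tweet_id, text) pairs by buzzword count, highest first.

    `tweets` is an iterable of (id, text) tuples; the real crawler
    would fill these from the account's 50 most recent tweets.
    """
    scored = [(buzzword_score(text), tid) for tid, text in tweets]
    scored.sort(reverse=True)
    return [tid for score, tid in scored[:top_n] if score > 0]
```

The regex deliberately matches the stem inside longer words (e.g. "Entrepreneurship" counts once), mirroring the parenthesized suffixes in the description.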
=== Development ===
The functions authenticationAndAccess_interface, jsonDataAcquisition, retweetersIdAcquisition, retweetersShortnameAcquisition, and generate_pandas_table_filledWithZeroes were taken from http://mcnair.bakerinstitute.org/wiki/Twitter_Webcrawler_(Tool), a.k.a. Gunny's existing Twitter crawler.
I had major issues with rate limits and eventually found a solution here: http://python-twitter.readthedocs.io/en/latest/rate_limits.html, and read about Twitter's rate limiting in general here: https://dev.twitter.com/rest/public/rate-limiting
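The python-twitter docs linked above describe passing <code>sleep_on_rate_limit=True</code> to <code>twitter.Api</code> so the client sleeps automatically when a window is exhausted. A library-free sketch of the same idea, as a manual fallback, might look like this; <code>call_with_backoff</code> and its parameters are hypothetical, not part of the crawler:

```python
import time

def call_with_backoff(fetch, max_retries=3, wait_seconds=60):
    """Retry `fetch` when Twitter signals a rate limit.

    `fetch` is any zero-argument callable wrapping an API call.
    If the raised error's message mentions the rate limit, we sleep
    and try again; any other error is re-raised immediately.
    """
    for attempt in range(max_retries):
        try:
            return fetch()
        except Exception as err:
            if "Rate limit" not in str(err) or attempt == max_retries - 1:
                raise
            time.sleep(wait_seconds)
```

In practice <code>wait_seconds</code> would be taken from the reset time Twitter reports, rather than a fixed constant.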
I used this page http://python-twitter.readthedocs.io/en/latest/twitter.html#module-twitter.api to examine all the methods I can use (and probably will use after I finish this application of the crawler).
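Once the retweeter scores are gathered (via calls from the twitter.api module referenced above), the final CSV output of step 5 could be written with the standard library alone. The helper name and the exact column layout below are assumptions based on the description ("usernames and a score"), not the crawler's actual output format:

```python
import csv
import io

def write_scores_csv(scores, out):
    """Write a {username: buzzword_count} mapping as the step-5 CSV.

    `scores` would be filled by the retweeter crawl; rows are sorted
    so the highest-scoring candidate accounts come first.
    """
    writer = csv.writer(out)
    writer.writerow(["username", "score"])
    for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        writer.writerow([name, score])

# Usage with an in-memory buffer instead of a real file:
buf = io.StringIO()
write_scores_csv({"alice": 3, "bob": 7}, buf)
```

With a real run, <code>buf</code> would be replaced by <code>open("followers.csv", "w", newline="")</code> (filename hypothetical).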