Difference between revisions of "Twitter Follower Finder (Tool)"

From edegan.com

Revision as of 17:12, 6 October 2016

People to Follow Crawl

Description

This crawler takes as input the Twitter handle of an account that we think posts content similar to ours, or that we admire. It completes the following steps to use that account's information to find people we should consider following:

1) Crawls the tweets of that user and notes, for each tweet, how many times a buzzword (entrepreneur((s)hip), research(ers), innovat(e)(ion)) appears.
2) Composes a list of the best tweets (those with the most buzzwords) produced by the account in its most recent 50 tweets.
3) Crawls the people who retweeted the tweets with the most buzzwords.
4) Notes, for each of the retweeters, how many times a buzzword was used.
5) Outputs a CSV file which gives the username and a score (number of buzzwords) for each of the users.
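The scoring in the steps above can be sketched offline, without any Twitter API calls. This is a minimal sketch and not the crawler's actual code: the regular expression (any word starting with one of the three stems), the `top` cutoff, and the function names `count_buzzwords`, `best_tweets`, `score_users` and `write_scores_csv` are all assumptions made for illustration. Fetching tweets and retweeters is left out and represented by plain Python lists and dicts.

```python
import csv
import io
import re
from collections import Counter

# Assumption: a "buzzword" is any word starting with one of these stems.
BUZZWORD_RE = re.compile(r"\b(?:entrepreneur|research|innovat)\w*", re.IGNORECASE)


def count_buzzwords(text):
    """Return how many buzzword occurrences appear in one tweet's text."""
    return len(BUZZWORD_RE.findall(text))


def best_tweets(tweets, recent=50, top=5):
    """Pick the ids of the highest-scoring tweets among the most recent ones.

    tweets: list of (tweet_id, text) pairs, newest first.
    top: hypothetical cutoff for how many "best" tweets to keep.
    """
    window = tweets[:recent]
    scored = [(count_buzzwords(text), tweet_id) for tweet_id, text in window]
    scored = [pair for pair in scored if pair[0] > 0]  # drop tweets with no buzzwords
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tweet_id for _, tweet_id in scored[:top]]


def score_users(tweets_by_user):
    """Sum buzzword counts over each user's tweets.

    tweets_by_user: dict mapping username -> list of tweet texts.
    """
    scores = Counter()
    for user, texts in tweets_by_user.items():
        scores[user] = sum(count_buzzwords(text) for text in texts)
    return scores


def write_scores_csv(scores, handle):
    """Write username/score rows to an open text handle, highest score first."""
    writer = csv.writer(handle)
    writer.writerow(["username", "score"])
    for user, score in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([user, score])
```

To write a real file, open it with `open("scores.csv", "w", newline="")` and pass the handle to `write_scores_csv`; the file name here is only an example.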

Development

The functions authenticationAndAccess_interface, jsonDataAcquisition, retweetersIdAcquisition, retweetersShortnameAcquisition and generate_pandas_table_filledWithZeroes were taken from Gunny's existing Twitter crawler, documented at http://mcnair.bakerinstitute.org/wiki/Twitter_Webcrawler_(Tool).

I had major issues with rate limits and eventually found a solution here: http://python-twitter.readthedocs.io/en/latest/rate_limits.html. I read about rate limiting in general here: https://dev.twitter.com/rest/public/rate-limiting
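For readers unfamiliar with the problem, one common way to cope with rate limits is to wait until the limit resets and then retry the call. The sketch below shows that generic pattern only; it is not necessarily the solution described on the linked page. `RateLimitError` and its `reset_epoch` attribute are hypothetical stand-ins for whatever error and reset time the client library actually reports.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error raised by a client call when the rate limit is hit."""

    def __init__(self, reset_epoch):
        super().__init__("rate limit exceeded")
        # Unix timestamp at which the limit is expected to reset (assumed field).
        self.reset_epoch = reset_epoch


def call_with_rate_limit_retry(func, *args, max_retries=3,
                               sleep=time.sleep, now=time.time, **kwargs):
    """Call func, sleeping until the reported reset time whenever it is rate limited.

    sleep and now are injectable so the wrapper can be tested without waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return func(*args, **kwargs)
        except RateLimitError as err:
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            # Wait until the reset time, plus one second of slack.
            wait = max(0.0, err.reset_epoch - now()) + 1.0
            sleep(wait)
```

Wrapping each API call this way keeps the crawl logic free of timing code, at the cost of long pauses when a limit is hit.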

I used this page, http://python-twitter.readthedocs.io/en/latest/twitter.html#module-twitter.api, to examine all the methods I can use (and probably will use after I finish this application of the crawler).