Difference between revisions of "Twitter Follower Finder (Tool)"

From edegan.com
{{McNair Projects
|Project Title=Twitter Follower Finder (Tool)
|Topic Area=Resources and Tools
|Owner=Christy Warden
|Start Term=Fall 2016
|Status=Active
|Deliverable=Tool
|Audience=McNair Staff
|Keywords=Webcrawler, Database, Twitter, API, Python
}}
 
== '''People to Follow Crawl''' ==
 
  

Revision as of 17:48, 18 October 2016


McNair Project
Twitter Follower Finder (Tool)
Copyright © 2016 edegan.com. All Rights Reserved.


People to Follow Crawl

Description

This crawler takes as input the Twitter handle of an account that we think posts content similar to ours, or that we admire. It uses that account's information to find people we should consider following, in the following steps:
1) Crawls the tweets of that user and notes, for each tweet, how many times a buzzword (entrepreneur((s)hip), research(ers), innovat(e)(ion)) appears.
2) Composes a list of the best tweets (those with the most buzzwords) among the account's 50 most recent tweets.
3) Crawls the people who retweeted the tweets with the most buzzwords.
4) Notes, for each of the retweeters, how many times a buzzword was used.
5) Outputs a CSV file that gives the username and a score (number of buzzwords) for each of the retweeters.
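The scoring steps above can be sketched roughly as follows. This is a minimal illustration and not the tool's actual code: the regular expression is an assumed reading of the buzzword list, the function names (buzzword_count, best_tweets, score_retweeters, write_scores_csv) are hypothetical, and the tweets are passed in as plain Python data rather than fetched from Twitter.

```python
import csv
import io
import re

# Assumed pattern for the buzzword families in the description:
# entrepreneur(s)(hip), research(ers), innovat(e)(ion).
BUZZWORD_RE = re.compile(r"entrepreneur|research|innovat", re.IGNORECASE)


def buzzword_count(text):
    """Count buzzword occurrences in one tweet's text."""
    return len(BUZZWORD_RE.findall(text))


def best_tweets(tweets, top_n=5):
    """Return ids of the top_n highest-scoring tweets among the 50 most recent.

    `tweets` is a list of (tweet_id, text) pairs, newest first.
    Tweets with no buzzwords are never returned.
    """
    recent = tweets[:50]
    scored = [(buzzword_count(text), tweet_id) for tweet_id, text in recent]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tweet_id for _, tweet_id in scored[:top_n]]


def score_retweeters(retweeter_tweets):
    """Map each username to the total buzzword count over that user's tweets.

    `retweeter_tweets` is a dict of username -> list of tweet texts.
    """
    return {
        user: sum(buzzword_count(text) for text in texts)
        for user, texts in retweeter_tweets.items()
    }


def write_scores_csv(scores, fileobj):
    """Write username,score rows to an open file object, highest score first."""
    writer = csv.writer(fileobj)
    writer.writerow(["username", "score"])
    for user, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        writer.writerow([user, score])
```

In a real run, the (tweet_id, text) pairs and the retweeters' tweets would come from the Twitter API calls; here they are ordinary lists and dicts so the scoring logic can be tried on its own.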

Development

The functions authenticationAndAccess_interface, jsonDataAcquisition, retweetersIdAcquisition, retweetersShortnameAcquisition, and generate_pandas_table_filledWithZeroes were taken from Gunny's existing Twitter crawler: http://mcnair.bakerinstitute.org/wiki/Twitter_Webcrawler_(Tool)

I had major issues with rate limits and eventually found a solution (or so I think) in the python-twitter documentation: http://python-twitter.readthedocs.io/en/latest/rate_limits.html
For background on Twitter's rate limiting in general, I read: https://dev.twitter.com/rest/public/rate-limiting
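One generic way to cope with rate limits is to wait until the limit window resets and then retry the call. The sketch below illustrates that idea only; it is not necessarily the fix used in this tool. RateLimitError and call_with_rate_limit_wait are hypothetical names, and the error class stands in for whatever exception the Twitter client actually raises.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's rate-limit exception."""

    def __init__(self, reset_at):
        super().__init__("rate limit exceeded")
        self.reset_at = reset_at  # epoch seconds at which the window resets


def call_with_rate_limit_wait(fn, *args, max_attempts=3,
                              sleep=time.sleep, now=time.time, **kwargs):
    """Call fn; on RateLimitError, sleep until the reset time and retry.

    `sleep` and `now` are injectable so the wrapper can be tested
    without actually waiting.
    """
    for attempt in range(max_attempts):
        try:
            return fn(*args, **kwargs)
        except RateLimitError as err:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Wait until the window resets, plus one second of slack.
            wait = max(0.0, err.reset_at - now()) + 1.0
            sleep(wait)
```

If I recall the python-twitter documentation correctly, the library can also do this waiting itself when twitter.Api is constructed with sleep_on_rate_limit=True, which would make a wrapper like this unnecessary.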

I used this page, http://python-twitter.readthedocs.io/en/latest/twitter.html#module-twitter.api, to review all the methods I can use (and probably will use after I finish this application of the crawler).