Difference between revisions of "Christy Warden (Work Log)"

From edegan.com

Revision as of 14:27, 25 January 2017

09/15/16

2-4:45: Was introduced to the Wiki, built my page and was added to the RDP and Slack. Practiced basic Linux with Harsh and was introduced to the researchers.

09/20/16

2-2:30: Was introduced to the DB server and how to access it/mount bulk drive in the RDP.

2:30-3: Tried (and failed) to help Will upload his file to his database.

3-4:45: Learned from Harsh how to transfer Will's file between machines so that he could access it for his table (FileZilla/PuTTY, but really we should've just put it in the RDP mounted bulk drive we built at the beginning).

09/22/16

2-2:30: Labeled new supplies (USB ports). Looked online for a solution to labeling the black ports, sent link with potentially useful supplies to Dr. Dayton.

2:30-3: Went through all of the new supplies (plus monitors, desktops and mice) and created an Excel sheet to keep track of them (Name, Quantity, SN, Link etc.)

3-3:15: Added my hours to the wiki Work Hours page, updated my Work Log.

09/27/16

2-2:25: Read through the wiki page for the existing twitter crawler/example. Rest of time: Worked on adjusting our feeds for HootSuite and making the content on it relevant to the people writing the tweets/blogs. See Christy Warden (Social Media), which covers all of the things I did to the HootSuite and my brainstorming about how to up our twitter/social media/blog presence.

09/29/16

Everything I did is on my social media research page: http://mcnair.bakerinstitute.org/wiki/Christy_Warden_(Social_Media). I got the twitter crawler running and have created a plan for how to generate a list of potential followers/people worth following to increase our twitter interactions and improve our feed to find stuff to retweet.

10/4/16

11-12:30: Directed people to the ambassador event.

12:30-3: Worked on my crawler (can be read about on my social media page).

3-4:45: Donald Trump twitter data crawl.

10/6/16

12:15-4:45: Worked on the Twitter Crawler. It currently takes as input the name of a twitter user and returns the active twitter followers on their page most likely to engage with our content. I think my metric for what constitutes a potential follower needs adjusting and the code needs to be made cleaner and more helpful. Project is in Documents/Projects/Twitter Crawler in the RDP. More information and a link to the page about the current project are on my social media page, Christy Warden (Social Media).

10/18/16

1-2:30: Updated the information we have for the Donald Trump tweets. The data is in the Trump Tweets project in the bulk folder and should have his tweets up until this afternoon when I started working.

2:30-5: Continued (and completed a version of) the twitter crawler. I have run numerous example users through the crawler and checked the outputs to see if the people I return are users that would be relevant to @BakerMcNair, and generally they are. See Christy Warden (Social Media) for more information.

5 - 5:30: Started reading about the existing eventbrite crawler and am brainstorming ideas for how we could use it. (Maybe incorporate both twitter and eventbrite into one application?)

10/25/16

12:15-4:45: Worked on the Twitter Crawler. I am currently collecting data by following around 70-80 people while I am at work and measuring the success of the follow so that I can adjust my program to make optimal following decisions based on historical follow response. More info at Christy Warden (Social Media)

10/27/16

12:15-3: First I ran a program that unfollowed all of the non-responders from my last follow spree and then I updated my data about who followed us back. I cannot see a pattern yet in the probability of someone following us back based on the parameters I am keeping track of, but hopefully we will be able to see something with more data. Last week we had 151 followers, at the beginning of today we had 175 followers and by the time that I am leaving (4:45) we have 190 followers. I think the program is working, but I hope the rate of growth increases.

3-4 SQL Learning with Ed

4-4:45 Found a starter list of people to crawl for Tuesday, checked our stats and ran one more starting position through the crawler. Updated data sheets and worklog. The log of who I've followed (and whether they've followed back) is on the twitter crawler page.


11/1/16

12:15 - 2: Unfollowed the non responders, followed about 100 people using the crawler. Updated my data sheets about how people have responded and added all the new followers to the log on Christy Warden (Social Media) twitter crawler page.

2-4:45 Prepped the next application of my twitter crawling abilities, which is going to be a constantly running program on a dummy account which follows a bunch of new sources and DMs the McNair account when something related to us shows up.


11/3/16

12:15-12:30: I made a mistake today! I intended to fix a bug that occurred in my DM program, but accidentally started running a program before copying the program's report about what went wrong so I could no longer access the error report. I am running the program again between now and Thursday and hoping to run into the same error so I can actually address it. (I believe it was something to do with a bad link). I did some research about catching and fixing exceptions in a program while still allowing it to continue, but I can't really fix the program until I have a good example of what is going wrong.
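A minimal sketch of the "catch the exception, save the report, keep running" idea described above. This is not the DM program itself: `process_all`, `handle` and the log file name are hypothetical, and the only assumption is that each unit of work (for example, one link) can be passed to a single function.

```python
import traceback
from datetime import datetime


def process_all(items, handle, log_path="crawler_errors.log"):
    """Call handle(item) for every item; log failures and keep going.

    log_path is a hypothetical file name. The full traceback is appended
    to it so the error report survives even if the program is restarted.
    """
    failed = []
    for item in items:
        try:
            handle(item)
        except Exception:
            # Write the report to disk right away instead of relying on
            # the console output still being there later.
            with open(log_path, "a", encoding="utf-8") as log:
                log.write(f"{datetime.now().isoformat()} failed on {item!r}\n")
                log.write(traceback.format_exc() + "\n")
            failed.append(item)
    return failed
```

Appending to a file means the next bad link leaves a usable example behind, and the returned list shows which items need another look.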

12:30 - 2:30: Unfollowed the non responders, followed about 100 people using the crawler. Updated my data sheets about how people have responded and added all the new followers to the log on Christy Warden (Social Media) twitter crawler page. I've noticed that our ratios of successful returns of our follow are improving, I am unsure whether I am getting better at picking node accounts or whether our account is gaining legitimacy because our ratio is improving.

2-4:15 After my DM program, which runs constantly, had (some) success, I had the idea that I could make the follow crawler run constantly too. I started implementing a way to do this, but haven't had a chance to run or test it yet. This will present serious difficulties because I don't want to do anything that could potentially get us kicked off twitter or lose my developer rights on our real account. It is hard to use a dummy acct for this purpose though, because nobody will follow back an empty account, so it'll be hard to see if the program succeeds in that base case. I will contemplate tonight and work on it Thursday.

4:15-4:30 Started adding comments and print statements and some level of organization in my code in case other/future interns use it and I am not at work to explain how it functions. The code could definitely do with some cleanup, but I think that should probably come later after everything is functional and all of our twitter needs are met.

4:30-4:45 Updated work log and put my thoughts on my social media project page.


11/8/16

12:15-1 Talked to Ed about my project and worked out a plan for the future of the twitter crawler. I will explain all of it on the social media page.

1-4:45 Worked on updating the crawler. It is going to take a while but I made a lot of progress today and expect that it should be working (iffily) by next Thursday.


11/10/16

12:15 - 4:45 Tried to fix bug in my retweeting crawler, but still haven't found it. I am going to keep running the program until the error comes up and then log into the RDP as soon as I notice and copy down the error. Worked on changes to the crawler which will allow for automation.


11/15/16

12:15 - 1:30 Changing twitter crawler.

1:30 - 4:45 Worked on pulling all the data for the executive orders and bills with Peter (in anticipation of Harsh gathering the data from GovTrack, we built a script which will build a tsv of that data).


11/17/16

12:15 - 1:30 Changing twitter crawler

1:30 - 5:30 Fixed the script Peter and I wrote because the data Harsh gathered ended up being in a slightly different form than what we anticipated. Peter built and debugged a crawler to pull all of the executive orders and I debugged the tsv output. I stayed late while the program ran on Harsh's data to ensure no bugs and discovered at the very very end of the run that there was a minor bug. Fixed it and then left.


11/22/16

12:15- 2 Worked on updating the crawler so that it runs automatically. Ran into some issues because we changed from Python 2.7 to Anaconda, but got everything running again. Started the retweeter crawler, which seems to be working well.

2-2:30 Redid the Bill.txt data for the adjusted regexes. Met with Harsh, Ed and Peter about being better at communicating our projects and code.

2:30-4:30 Back to the twitter crawler. I am now officially testing it before we use it on our main account and have found and fixed some bugs with data collection. I realized at the very end of the day that I have a logical flaw in my code: only 1 person at a time goes into the people-we-followed list, so we will only be following one person in every 24 hour period. When I get back from Thanksgiving, I need to change the unfollow function. The new idea is that I will follow everyone that comes out of a source node, and then keep calling the unfollow function for as long as the top person on the list has been followed for more than one day. I will likely need only one more day to finish this program before it can start running on our account.
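The new follow/unfollow idea could look roughly like the sketch below. It is only an illustration: `follow_all`, `unfollow_expired` and the `follow`/`unfollow` callbacks are hypothetical stand-ins for whatever the crawler actually calls, and timestamps are plain seconds so the logic can be tested without touching twitter.

```python
from collections import deque

ONE_DAY = 24 * 60 * 60  # seconds


def follow_all(queue, users, now, follow):
    """Follow every user from a source node and record when we did it."""
    for user in users:
        follow(user)  # hypothetical callback wrapping the real follow call
        queue.append((user, now))


def unfollow_expired(queue, now, unfollow, min_age=ONE_DAY):
    """Unfollow from the top of the list while the oldest entry
    has been followed for more than min_age seconds."""
    removed = []
    while queue and now - queue[0][1] > min_age:
        user, _followed_at = queue.popleft()
        unfollow(user)  # hypothetical callback wrapping the real unfollow call
        removed.append(user)
    return removed
```

Because entries are appended in the order they were followed, the oldest one is always at the front, so the loop can stop at the first entry that is still under a day old. Skipping people who followed back would be an extra check inside the loop.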

4:30 - 4:45 In response to the "start communicating with the comp people" talk, I updated my wiki pages and work log on which I have been heavily slacking.


11/29/16

12:15- 1:45 Fixed code and reran it for the GovTrack project, documented on E&I governance

1:45- 2 Had accelerator project explained to me

2 - 2:30 Built histograms of GovTrack data with Ed and Albert, reran data for Albert.

2:30-4:45 Completed first 5 reports (40-45) on accelerators (accidentally did number 20 as well)


12/1/16

12:15- 3 Fixed the Perl code that gets a list of all Bills that have been passed, then composed new data of Bills with relevant buzzword info as well as whether or not they were enacted.

3 - 4:45 Worked on Accelerators data collection.


1/18/17

10-12:45 Started running old twitter programs and reviewing how they work. Automate.py is currently running and AutoFollower is in the process of being fixed.


1/20/17

10-11 Worked on twitter programs. Added error handling for Automate.py and it appears to be working but I will check on Monday.

11-11:15 Talked with Ed about projects that will be done this semester and what I'll be working on.

11:15 - 12 Went through our code repository and made a second Wiki page documenting the changes since the listing was last completed. http://mcnair.bakerinstitute.org/wiki/Software_Repository_Listing_2

12-12:45 Worked on the smallest enclosing circle problem for location of startups.


1/23/17

10-12:45 Worked on the enclosing circle problem. Wrote and completed a program which guarantees a perfect outcome but takes forever to run because it checks all possible outcomes. I would like to rewrite or improve it so that it outputs a good solution, but not necessarily a perfect one, so that we can run the program on larger quantities of data. Also today I discussed the cohort data breakdown with Peter and checked through the twitter code. Automate.py seems to be working perfectly now, and I would like someone to go through the content with me so that I can filter it more effectively. AutoFollower appears to be failing without returning any sort of error code. I've run it a few different times and it always bottlenecks somewhere new, so I suspect some sort of data limiting on twitter is preventing this algorithm from working. Need to think of a new one.
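For reference, here is a sketch of what a "check everything" approach looks like, assuming the classic version of the problem (one smallest circle around a set of 2D points). The actual program is not shown in this log and may solve a different variant; all function names here are made up for illustration. It relies on the fact that the smallest enclosing circle either has two of the points as a diameter or passes through three of them.

```python
import math
from itertools import combinations


def circle_from_two(p, q):
    """Circle with segment pq as its diameter."""
    center = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
    return center, math.dist(p, q) / 2


def circle_from_three(a, b, c):
    """Circumcircle of a, b, c, or None if the points are collinear."""
    d = 2 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = (p[0] ** 2 + p[1] ** 2 for p in (a, b, c))
    ux = (a2 * (b[1] - c[1]) + b2 * (c[1] - a[1]) + c2 * (a[1] - b[1])) / d
    uy = (a2 * (c[0] - b[0]) + b2 * (a[0] - c[0]) + c2 * (b[0] - a[0])) / d
    return (ux, uy), math.dist((ux, uy), a)


def contains_all(circle, points, eps=1e-9):
    center, radius = circle
    return all(math.dist(center, p) <= radius + eps for p in points)


def smallest_enclosing_circle(points):
    """Brute force: try every pair and triple, keep the smallest valid circle."""
    points = list(points)
    if not points:
        raise ValueError("need at least one point")
    if len(points) == 1:
        return points[0], 0.0
    candidates = [circle_from_two(p, q) for p, q in combinations(points, 2)]
    for a, b, c in combinations(points, 3):
        circle = circle_from_three(a, b, c)
        if circle is not None:
            candidates.append(circle)
    valid = [c for c in candidates if contains_all(c, points)]
    return min(valid, key=lambda c: c[1])
```

This tries on the order of n^3 candidate circles and checks each against all n points, which is why it gets slow on large inputs. It needs Python 3.8+ for `math.dist`.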


1/25/17

10-12:45 Simultaneously worked on twitter and enclosing circle because they both have a long run time. I realized there was an error in my enclosing circle code which I have corrected and tested on several practice examples. I have some ideas for how to speed up the algorithm when we run it on a really large input, but I need more info about what the actual data will look like. Also, the program runs much more quickly now that I corrected the error.

For twitter, I discovered that the issue I am having lies somewhere in the follow API, so for now I've commented it out and am running the program minus the follow component to ensure that everything else is working. So far, I have not seen any unusual behavior, but the program has a long wait period so it is taking a while to test.