
===Fall 2017===
<onlyinclude>
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
9/11/2017 2:00-5:00 pm
*Spoke to Ed about the project going forward. Organized the current updated data for our project.
9/12/2017 3:00-5:00 pm
*Began going through the Cleaned Cohort Data Excel file and found a few problems with it. Will continue the cleaning process for the rest of the week.
9/13/2017 2:00-5:00 pm
*Sorted through Cleaned Cohort Data and finalized our List of Accelerators. We can begin the process of creating our PercentVC table.
9/14/2017 3:00-5:00 pm
*Completely finalized our dataset of accelerators and startups. Met with Michelle Passo to discuss objectives of the research for credit course.
9/18/2017 2:00-4:00 pm
*Talked with Peter about the LinkedIn crawler data. Went through VC page that Meghana sent me.
9/19/2017 3:00-5:00 pm
*Completed SDC pull of updated VC Data.
9/20/2017 2:00-5:00 pm
*Attempted several times to run the Matcher. Cleaned our pulled data.
9/21/2017 3:00-5:00 pm
*Came extremely close to running the Matcher correctly. Reviewed the final LinkedIn data from Peter.
9/25/2017 2:00-5:00 pm
*Finalized the matched file of accelerator companies with VC portfolio companies. Gave Ben the data on Georgia accelerators.
9/26/2017 4:00-5:00 pm
*Worked on finding the duplicates in our Matched file in order to have the most accurate data.
9/27/2017 2:00-5:00 pm
*Attempted to find a way to organize the duplicate matches.
9/28/2017 4:00-5:00 pm
*Continued running through matched data in order to organize it effectively.
10/2/2017 2:00-5:00 pm
*Talked to Ed about next steps for the project. Practiced accessing the Crunchbase database on SQL. Brushed up on SQL code.
10/3/2017 3:00-5:00 pm
*Searched the database for Crunchbase investment information.
10/4/2017 2:00-5:00 pm
*Pulled the funding rounds table from SQL and matched it with the companies that have received VC funding in order to gather round dates.
10/6/2017 3:00-5:00 pm
*Went through the matched data. Brainstormed ways to get the dates for cohort companies going through accelerators.
10/11/2017 2:00-3:30 pm
*Looked into using the WhoIs Parser in order to find when the companies went through their accelerators.
10/12/2017 3:00-5:00 pm
*Discovered that the Wayback Machine will not be a good option for finding when companies went through their accelerators. Created a list of VCCompanies and their earliest round date. Included a column for the date they went through their accelerators and will fill it in when we find a good method of finding this date.
10/16/2017 2:00-3:30 pm
*Continued working on sorting VCCompanies by their earliest round date.
10/17/2017 3:00-5:00 pm
*Worked with Ben to find a solution to our problem of data acquisition. Finalized earliest round date for VCCompanies.
10/18/2017 2:00-5:00 pm
*Updated our VC data with Ed's help in order to increase the accuracy and completion of our data.
10/19/2017 3:00-5:00 pm
*Organized all of our matched data and updated it in order to reflect the most recent SDC pull with Ed. Matched Crunchbase data with our cohort companies.
10/20/2017 2:00-3:30 pm
*Generated the new list of VCCompanies as well as their earliest round dates.
10/23/2017 2:00-3:30 pm
*Worked on sorting out the discrepancies in our matched data.
10/24/2017 3:00-5:00 pm
*Went through list of VCCompanies and began adding respective accelerators in order to proceed with VCPercentage table.
10/25/2017 2:00-5:00 pm
*Continued going through list of VCCompanies and adding accelerators.
10/26/2017 3:30-5:30 pm
*Continued going through list of VCCompanies and adding accelerators. Will have this completed on Monday.
10/30/2017 2:00-3:30 pm
*Finished adding all of the accelerators to the list of VCCompanies. Added a column indicating whether or not the company went through two or more accelerators.
10/31/2017 3:00-5:00 pm
*Began compiling data in the column for Date Company went through Accelerator.
11/1/2017 2:00-4:00 pm
*Finalized entering dates for Y Combinator cohort companies.
11/2/2017 4:00-5:30 pm
*Continued entering cohort company dates into Excel file.
11/6/2017 2:00-4:00 pm
*Continued entering cohort company dates into Excel file. Began compiling a list of keywords for demo day press releases.
11/7/2017 3:00-5:00 pm
*Finished coming up with keywords for demo day crawler. Sent the final list to Peter.
11/8/2017 2:00-3:30 pm
*Spoke to Ed and organized all of our current data.
11/9/2017 3:00-5:00 pm
*Created a new project page called Accelerator Data and listed all relevant files as well as descriptions.
11/14/2017 3:00-5:00 pm
*Looked up URLs and decided whether or not the website was relevant.
11/15/2017 2:00-5:00 pm
*Created SQL database entitled "acceleratordata" and began creating tables from folder of All Relevant Files.
11/16/2017 3:00-5:00 pm
*Continued to input tables into SQL database.
11/20/2017 2:00-5:00 pm
*Cleaned text files in order to import tables into SQL database.
11/27/2017 2:00-5:00 pm
*Worked with Peter to find and exclude irrelevant keywords on HTML pages. Began categorizing relevant demo day pages.
11/28/2017 3:00-5:00 pm
*Finished inputting tables of relevant files into SQL database.
11/29/2017 2:00-5:00 pm
*Went through accelerator HTML URLs. Spoke with Ed about going through HTMLs and classifying based on overall and specific relevance.
12/1/2017 3:00-5:00 pm
*Worked through accelerator links and classified pages based on whether or not they provided relevant information about startup timing.
12/4/2017 10:00-12:00 pm
*Continued running through demo day crawl URLs and scoring them based on relevance.
12/7/2017 1:00-4:30 pm
*Finalized scoring of demo day URLs for the original crawl. Last day of work for this semester.
</onlyinclude>
===Spring 2017===
1/18/2017 1:00-5:00 pm
*Continued collecting data for accelerator project. Helped Catherine draft tweets for the McNair Center twitter account.
1/20/2017 1:00-3:00 pm
*Continued collecting data on accelerators. Attended McNair Center team meeting.
1/23/2017 1:00-5:00 pm
*Began combing through accelerator list, determining which accelerators are still missing data and documenting these in a TextPad file. Finished through #115.
1/25/2017 1:00-5:00 pm
*Continued looking through accelerator list.
1/27/2017 1:00-3:00 pm
*Continued going through accelerator list. Left off on #226 with Shrey.
1/20/2017 1:00-5:00 pm
*Continued going through accelerator list. Finished through #440.
2/1/2017 1:00-5:00 pm
*Finished going through the list of accelerators looking for incomplete files. Began completing the files that were not done.
2/3/2017 1:00-3:00 pm
*Continued working on completing accelerator files.
2/6/2017 1:00-4:30 pm
*Finished data set of accelerators. Began going through and making sure that all text files and cohort files are of the same format so Peter can easily pull the information. Left for 30 minutes for an interview from 2:30-3:00 pm.
2/8/2017 1:00-5:00 pm
*Finished formatting through #137. Spoke with Ed about project.
2/13/2017 1:00-5:00 pm
*Completed formatting for all accelerator text files.
2/15/2017 3:00-5:00 pm
*Made copy of the completed data set. Spoke to Ed about future steps to take for project, including gathering founder data and obtaining the Crunchbase API.
2/17/2017 1:00-3:00 pm
*Went through final Excel spreadsheet for cohort information. Still need to run the crawler one more time after the completion of the editing process. Found the application for the Crunchbase API, which will hopefully allow us to gain access.
2/20/2017 1:00-5:00 pm
*Filled out another application for Crunchbase research access. Found the first source for the incubator project on angel.co; will hopefully work with Peter to make a crawler similar to f6s.
2/22/2017 1:00-5:00 pm
*Pulled data from SDC for Ed and normalized it. Learned how to use SDC and the normalizer.
2/24/2017 1:00-3:00 pm
*Finished cleaning up the cohort data for Y Combinator on the Final Cohort Excel Spreadsheet.
2/27/2017 1:00-5:00 pm
*Continued cleaning up the cohort data in the Excel file. Finished Cohort Number and Year.
3/1/2017 2:00-5:00 pm
*Worked with Ben and Shrey to pull data from SDC for all VC funded companies and normalized it to put it in an Excel document.
3/3/2017 1:00-2:30 pm
*Worked with Ben to try and repeat down the VC data without it going too far.
3/6/2017 1:00-4:00 pm
*Worked with Shrey to finish cleaning the cohort data. It is ready to be run through the matcher with Ben.
3/8/2017 1:00-5:00 pm
*Matched the VC Data with the list of Cohort Companies and got one list of all cohort companies that have received VC funding.
3/10/2017 12:00-2:00 pm
*Put a write-up on the top of the Accelerator wiki page detailing where we are in the project currently as well as what data we have accumulated on the RDP.
3/20/2017 1:00-5:00 pm
*Began gathering the URLs of all accelerators in a TextPad file called Accelerator URLs. Participated in the SQL training session.
3/22/2017 1:00-5:00 pm
*Made tables in Terminal for Accelerator companies matched with VC companies and for Cohort Data.
3/27/2017 1:00-4:00 pm
*Compiled all URLs of accelerators into a TextPad file.
3/29/2017 1:00-5:00 pm
*Worked on the matched data with Ben. Next time I will run the RegEx code that will filter the URLs, and I will look through the duplicates where two different VC-backed company names matched to one cohort company name.
3/31/2017 1:00-2:00 pm
*Ran the code for accelerator URLs, which are ready to be run through the Wayback Machine in order to get the start dates. Also began looking through VC-backed company names.
4/3/2017 1:00-5:00 pm
*Continued looking through double matched VC companies. Learned more SQL from Ed.
4/5/2017 1:00-5:00 pm
*Made the final VC percentage table in Terminal. Next time I will collect missing accelerator data.
4/7/2017 1:00-3:00 pm
*Began collecting cohort data for big accelerators that were missing from our list in order to add it to our final list of cohort companies.
4/10/2017 1:00-5:00 pm
*Finished gathering cohort company names for big accelerators that we were missing and put them into the Cleaned Cohort Companies Excel file. Ben is looking through Crunchbase data in order to possibly find more missing accelerators.
4/14/2017 1:00-4:00 pm
*Began working through the "Crunchbase Potential Accelerators" TextPad file that may contain missing accelerators and wrote notes on the ones that I was able to go through. Need to finish this file before moving forward.
4/17/2017 1:00-4:00 pm
*Continued going through potential Crunchbase accelerators that we may have missed. Talked to Ed about getting a more comprehensive list from the Excel file and having the tables and data collected and done by the end of the semester.
4/19/2017 1:00-4:00 pm
*Worked with Jeemin to generate an entire list of potential US accelerators from Crunchbase. Worked to find a way to classify accelerators just based on their descriptions.
4/21/2017 1:00-4:00 pm
*Continued working through the list identifying accelerators that we do not have. Ramee and Juliette are now helping us gather cohort data for those missing accelerators.
4/24/2017 9:00-1:00 pm
*Updated Veeral on current state of project. Typed up a to-do list on the discussion wiki for Veeral. Got new cohort data on an accelerator and added it to Excel file.
5/3/2017 11:00-1:00 pm
*Talked to Ed and Anne about future report. Continued working through list of Crunchbase potential accelerators. Last day of work for this semester.
===Fall 2016===
10/17/2016 2:00-5:00 pm
*Created personal wiki page as well as work log. Read about the research project to which I have been assigned. Wrote a short summary of what I believe it is and included some helpful links.
10/18/2016 4:00-6:00 pm
*Met with research partner Shrey, who filled me in on where we are with the project. Began looking on websites of certain accelerators for how to determine their cohorts and listed these steps on the wiki.
10/19/2016 2:00-5:00 pm
*Finished looking on the remaining accelerator websites and wrote the steps on determining how to manually locate the cohorts.
10/20/2016 4:00-6:00 pm
*Met with Peter and Christy to discuss the possibility of creating a web crawler that will pull data from individual accelerator sites.
10/24/2016 2:00-5:00 pm
*Brainstormed with Albert and Julia about changes to the category name for SBDE. Spoke to Ed about full scope of accelerator project.
10/25/2016 4:00-6:00 pm
*Brainstormed with Shrey about different potential industry focuses within accelerators, as well as different variables to search for in terms of accelerators, startups, cohorts, etc.
10/26/2016 2:00-5:00 pm
*Began searching for more databases including lists of accelerators as well as some characteristics of those accelerators. Began searching for characteristics that identify accelerators on their websites.
10/27/2016 4:00-6:00 pm
*Continued searching for relevant lists of accelerators to include on our page. Added some links that have high potential under the tab (Obtained from List of Accelerators or various Google searches).
10/31/2016 2:00-5:00 pm
*Began constructing a list of variables that clearly distinguish an accelerator on its website. This is in an effort to allow a crawler to crawl through many Google searches and identify accelerators.
11/1/2016 4:00-6:00 pm
*Continued looking for variables that could identify accelerators from their websites. Searched through numerous different websites of accelerators obtained from our current databases.
11/2/2016 2:00-4:00 pm
*Continued combing through websites of numerous accelerators, well-known and other, in the hopes of finding identifying variables.
11/3/2016 4:00-6:00 pm
*Finalized my list of variables that could be used to distinguish the websites of accelerators. Slightly re-arranged our list of accelerator databases in order of relevance.
11/7/2016 2:00-5:00 pm
*Began compiling the list of all accelerators. Created a new TextPad document with information from a new database.
11/8/2016 4:00-6:00 pm
*Worked with Shrey and Ben in order to compile all of our accelerator databases into one long list in TextPad.
11/9/2016 2:00-5:00 pm
*Continued formulating a database for all accelerators and all of the available info given.
11/10/2016 4:00-6:00 pm
*Worked with Shrey and Peter in order to develop a crawler for f6s.
11/14/2016 2:00-5:00 pm
*Began sorting the Seed-DB database in an Excel document.
11/15/2016 4:00-6:00 pm
*Conducted some Google searches in an attempt to find more accelerator databases. Began looking through Executive Orders searching for keywords.
11/16/2016 2:00-5:00 pm
*Completed searching through Executive Orders.
11/17/2016 4:00-6:00 pm
*Continued working on Google searches for state accelerator list. Looked through f6s for common words that can be used to distinguish accelerators once we have finalized the crawler.
11/21/2016 2:00-5:00 pm
*Randomly chose 10 accelerators from Excel list of accelerators on the RDP. Went through each website and listed the steps that I took in order to determine whether or not the website belonged to an accelerator. Will continue extracting cohort information tomorrow.
11/22/2016 4:00-6:00 pm
*Listed out all steps for extracting cohort information from the ten randomly chosen accelerators. Worked with Peter in order to build a tool that will search all of the HTMLs and attempt to identify each one as an accelerator as well as extract some basic information.
11/28/2016 2:00-5:00 pm
*Merged the F6S accelerator list with our other list, then posted it on the project page. Learned process for accelerator data extraction from Ed.
11/29/2016 4:00-6:00 pm
*Began process of collecting data from the 20 accelerators that I am responsible for.
11/30/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished 15/20.
12/1/2016 4:00-6:00 pm
*Continued collecting data from accelerators. Finished original 20, picked up a new set of 20.
12/2/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished next 20.
12/8/2016 1:00-3:00 pm
*Completed collecting data from accelerators for the semester.
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
[[Category:Work Log]]
