===Fall 2017===
<onlyinclude>
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
9/11/2017 2:00-5:00 pm
*Spoke to Ed about the project going forward. Organized the current updated data for our project.
9/12/2017 3:00-5:00 pm
*Began going through the Cleaned Cohort Data Excel file and found a few problems with it. Will continue the cleaning process for the rest of the week.
9/13/2017 2:00-5:00 pm
*Sorted through Cleaned Cohort Data and finalized our List of Accelerators. We can begin the process of creating our PercentVC table.
9/14/2017 3:00-5:00 pm
*Completely finalized our dataset of accelerators and startups. Met with Michelle Passo to discuss objectives of the for-credit course.
9/18/2017 2:00-4:00 pm
*Talked with Peter about the LinkedIn crawler data. Went through VC page that Meghana sent me.
9/19/2017 3:00-5:00 pm
*Completed SDC pull of updated VC Data.
9/20/2017 2:00-5:00 pm
*Attempted several times to run the Matcher. Cleaned our pulled data.
9/21/2017 3:00-5:00 pm
*Came extremely close to running the Matcher correctly. Reviewed the final LinkedIn data from Peter.
9/25/2017 2:00-5:00 pm
*Finalized the matched file of accelerator companies with VC portfolio companies. Gave Ben the data on Georgia accelerators.
9/26/2017 3:00-5:00 pm
*Worked on finding the duplicates in our Matched file in order to have the most accurate data.
9/27/2017 2:00-5:00 pm
*Attempted to find a way to organize the duplicate matches.
9/28/2017 4:00-5:00 pm
*Continued running through matched data in order to organize it effectively.
10/2/2017 2:00-5:00 pm
*Talked to Ed about next steps for the project. Practiced accessing the crunchbase database on SQL. Brushed up on SQL code.
10/3/2017 3:00-5:00 pm
*Searched the database for crunchbase investment information.
10/4/2017 2:00-5:00 pm
*Pulled the funding rounds table from SQL and matched it with the companies that have received VC funding in order to gather round dates.
10/6/2017 3:00-5:00 pm
*Went through the matched data. Brainstormed ways to get the dates for cohort companies going through accelerators.
10/11/2017 2:00-3:30 pm
*Looked into using the WhoIs Parser in order to find when the companies went through their accelerators.
10/12/2017 3:00-5:00 pm
*Discovered that the Wayback Machine will not be a good option for finding when companies went through their accelerators. Created a list of VCCompanies and their earliest round date. Included a column for the date they went through their accelerators and will fill it in when we find a good method of finding this date.
10/16/2017 2:00-3:30 pm
*Continued working on sorting VCCompanies by their earliest round date.
10/17/2017 3:00-5:00 pm
*Worked with Ben to find a solution to our problem of data acquisition. Finalized earliest round date for VCCompanies.
10/18/2017 2:00-5:00 pm
*Updated our VC data with Ed's help in order to increase the accuracy and completeness of our data.
10/19/2017 3:00-5:00 pm
*Organized all of our matched data and updated it in order to reflect the most recent SDC pull with Ed. Matched Crunchbase data with our cohort companies.
10/20/2017 2:00-3:30 pm
*Generated new list of VCCompanies as well as their earliest round dates.
10/23/2017 2:00-3:30 pm
*Worked on sorting out the discrepancies in our matched data.
10/24/2017 3:00-5:00 pm
*Went through list of VCCompanies and began adding respective accelerators in order to proceed with VCPercentage table.
10/25/2017 2:00-5:00 pm
*Continued going through list of VCCompanies and adding accelerators.
10/26/2017 3:30-5:30 pm
*Continued going through list of VCCompanies and adding accelerators. Will have this completed on Monday.
10/30/2017 2:00-3:30 pm
*Finished adding all of the accelerators to the list of VCCompanies. Added a column indicating whether or not the company went through two or more accelerators.
10/31/2017 3:00-5:00 pm
*Began compiling data in the column for Date Company went through Accelerator.
11/1/2017 2:00-4:00 pm
*Finalized entering dates for Y Combinator cohort companies.
11/2/2017 4:00-5:30 pm
*Continued entering cohort company dates into Excel file.
11/6/2017 2:00-4:00 pm
*Continued entering cohort company dates into Excel file. Began compiling a list of keywords for demo day press releases.
11/7/2017 3:00-5:00 pm
*Finished coming up with keywords for demo day crawler. Sent the final list to Peter.
11/8/2017 2:00-3:30 pm
*Spoke to Ed and organized all of our current data.
11/9/2017 3:00-5:00 pm
*Created a new project page called Accelerator Data and listed all relevant files as well as descriptions.
11/14/2017 3:00-5:00 pm
*Looked up URLs and decided whether or not the website was relevant.
11/15/2017 2:00-5:00 pm
*Created SQL database entitled "acceleratordata" and began creating tables from folder of All Relevant Files.
11/16/2017 3:00-5:00 pm
*Continued to input tables into SQL database.
11/20/2017 2:00-5:00 pm
*Cleaned text files in order to import tables into SQL database.
11/27/2017 2:00-5:00 pm
*Worked with Peter to find and exclude irrelevant keywords on HTML pages. Began categorizing relevant demo day pages.
11/28/2017 3:00-5:00 pm
*Finished inputting tables of relevant files into SQL database.
11/29/2017 2:00-5:00 pm
*Went through accelerator HTML URLs. Spoke with Ed about going through HTMLs and classifying based on overall and specific relevance.
12/1/2017 3:00-5:00 pm
*Worked through accelerator links and classified pages based on whether or not they provided relevant information about startup timing.
12/4/2017 10:00-12:00 pm
*Continued running through demo day crawl URLs and scoring them based on relevance.
12/7/2017 1:00-4:30 pm
*Finalized scoring of demo day URLs for the original crawl. Last day of work for this semester.
</onlyinclude>
===Spring 2017===
1/18/2017 1:00-5:00 pm
*Continued collecting data for accelerator project. Helped Catherine draft tweets for the McNair Center twitter account.
1/20/2017 1:00-3:00 pm
*Continued collecting data on accelerators. Attended McNair Center team meeting.
1/23/2017 1:00-5:00 pm
*Began combing through accelerator list, determining which accelerators are still missing data and documenting these in a TextPad file. Finished through #115.
1/25/2017 1:00-5:00 pm
*Continued looking through accelerator list.
1/27/2017 1:00-3:00 pm
*Continued going through accelerator list. Left off on #226 with Shrey.
1/20/2017 1:00-5:00 pm
*Continued going through accelerator list. Finished through #440.
2/1/2017 1:00-5:00 pm
*Finished going through the list of accelerators looking for incomplete files. Began completing the files that were not done.
2/3/2017 1:00-3:00 pm
*Continued working on completing accelerator files.
2/6/2017 1:00-4:30 pm
*Finished data set of accelerators. Began going through and making sure that all text files and cohort files are of the same format so Peter can easily pull the information. Left for 30 minutes for an interview from 2:30-3:00 pm.
2/8/2017 1:00-5:00 pm
*Finished formatting through #137. Spoke with Ed about project.
2/13/2017 1:00-5:00 pm
*Completed formatting for all accelerator text files.
2/15/2017 3:00-5:00 pm
*Made copy of the completed data set. Spoke to Ed about future steps to take for project including gathering founder data and obtaining the crunchbase api.
2/17/2017 1:00-3:00 pm
*Went through final Excel spreadsheet for cohort information. Still need to run the crawler one more time after the completion of the editing process. Found the application for the crunchbase api which will hopefully allow us to gain access.
2/20/2017 1:00-5:00 pm
*Filled out another application for Crunchbase research access; Found the first source for the incubator project on angel.co, will hopefully work with Peter to make a crawler similar to f6s
2/22/2017 1:00-5:00 pm
*Pulled data from SDC for Ed and normalized it. Learned how to use SDC and the normalizer.
2/24/2017 1:00-3:00 pm
*Finished cleaning up the cohort data for Y-combinator on the Final Cohort Excel Spreadsheet.
2/27/2017 1:00-5:00 pm
*Continued cleaning up the cohort data in the Excel file. Finished Cohort Number and Year.
3/1/2017 2:00-5:00 pm
*Worked with Ben and Shrey to pull data from SDC for all VC funded companies and normalized it to put it in an Excel document.
3/3/2017 1:00-2:30 pm
*Worked with Ben to try and repeat down the VC data without it going too far.
3/6/2017 1:00-4:00 pm
*Worked with Shrey to finish cleaning the cohort data. It is ready to be run through the matcher with Ben.
3/8/2017 1:00-5:00 pm
*Matched the VC Data with the list of Cohort Companies and got one list of all cohort companies that have received VC funding.
3/10/2017 12:00-2:00 pm
*Put a write-up on the top of the Accelerator wiki page detailing where we are in the project currently as well as what data we have accumulated on the RDP.
3/20/2017 1:00-5:00 pm
*Began gathering the URLs of all accelerators in a TextPad file called Accelerator URLs. Participated in the SQL training session.
3/22/2017 1:00-5:00 pm
*Made tables in Terminal for Accelerator companies matched with VC companies and for Cohort Data.
3/27/2017 1:00-4:00 pm
*Compiled all URLs of accelerators into a TextPad file.
3/29/2017 1:00-5:00 pm
*Worked on the matched data with Ben. Next time I will run the RegEx code that will filter the URLs, and I will look through the duplicates where two different VC backed company names matched to one cohort company name.
3/31/2017 1:00-2:00 pm
*Ran the code for accelerator urls which are ready to be run through the wayback machine in order to get the start dates. Also began looking through vc backed company names.
4/3/2017 1:00-5:00 pm
*Continued looking through double matched VC companies. Learned more SQL from Ed.
4/5/2017 1:00-5:00 pm
*Made the final vc percentage table on terminal. Next time I will collect missing accelerator data.
4/7/2017 1:00-3:00 pm
*Began collecting cohort data for big accelerators that were missing from our list in order to add it to our final list of cohort companies.
4/10/2017 1:00-5:00 pm
*Finished gathering cohort company names for big accelerators that we were missing and put them into the Cleaned Cohort Companies Excel file. Ben is looking through Crunchbase data in order to possibly find more missing accelerators.
4/14/2017 1:00-4:00 pm
*Began working through "Crunchbase Potential Accelerators" textpad that may contain missing accelerators and wrote notes on the ones that I was able to go through. Need to finish this textpad before moving forward.
4/17/2017 1:00-4:00 pm
*Continued going through potential Crunchbase accelerators that we may have missed. Talked to Ed about getting a more comprehensive list from Excel file and by the end of the semester have the tables and data collected and done.
4/19/2017 1:00-4:00 pm
*Worked with Jeemin to generate an entire list of potential US accelerators from crunchbase. Worked to find a way to classify accelerators just based on their descriptions.
4/21/2017 1:00-4:00 pm
*Continued working through the list identifying accelerators that we do not have. Ramee and Juliette are now helping us gather cohort data for those missing accelerators.
4/24/2017 9:00-1:00 pm
*Updated Veeral on current state of project. Typed up a to-do list on the discussion wiki for Veeral. Got new cohort data on an accelerator and added it to Excel file.
5/3/2017 11:00-1:00 pm
*Talked to Ed and Anne about future report. Continued working through list of crunchbase potential accelerators. Last day of work for this semester.
===Fall 2016===
10/17/2016 2:00-5:00 pm
*Created personal wiki page as well as work log; Read about the research project to which I have been assigned; Wrote a short summary of what I believe it is and included some helpful links
10/18/2016 4:00-6:00 pm
*Met with research partner Shrey who filled me in on where we are with the project; Began looking on websites of certain accelerators for how to determine their cohorts and listed these steps on the wiki
10/19/2016 2:00-5:00 pm
*Finished looking on the remaining accelerator websites and wrote the steps on determining how to manually locate the cohorts.
10/20/2016 4:00-6:00 pm
*Met with Peter and Christy to discuss the possibility of creating a web crawler that will pull data from individual accelerator sites.
10/24/2016 2:00-5:00 pm
*Brainstormed with Albert and Julia about changes to the category name for SBDE. Spoke to Ed about full scope of accelerator project.
10/25/2016 4:00-6:00 pm
*Brainstormed with Shrey about different potential industry focuses within accelerators, as well as different variables to search for in terms of accelerators, startups, cohorts, etc.
10/26/2016 2:00-5:00 pm
*Began searching for more databases including lists of accelerators as well as some characteristics of those accelerators; Began searching for characteristics that identify accelerators on their websites
10/27/2016 4:00-6:00 pm
*Continued searching for relevant lists of accelerators to include on our page. Added some links that have high potential under the tab (Obtained from List of Accelerators or various Google searches).
10/31/2016 2:00-5:00 pm
*Began constructing a list of variables that clearly distinguish an accelerator on its website. This is in an effort to allow a crawler to crawl through many Google searches and identify accelerators.
11/1/2016 4:00-6:00 pm
*Continued looking for variables that could identify accelerators from their websites. Searched through numerous different websites of accelerators obtained from our current databases.
11/2/2016 2:00-4:00 pm
*Continued combing through websites of numerous accelerators, well-known and other, in the hopes of finding identifying variables.
11/3/2016 4:00-6:00 pm
*Finalized my list of variables that could be used to distinguish the websites of accelerators. Slightly re-arranged our list of accelerator databases in order of relevance.
11/7/2016 2:00-5:00 pm
*Began compiling the list of all accelerators. Created a new TextPad document with information from a new database.
11/8/2016 4:00-6:00 pm
*Worked with Shrey and Ben in order to compile all of our accelerator databases into one long list on Textpad.
11/9/2016 2:00-5:00 pm
*Continued formulating a database for all accelerators and all of the available info given.
11/10/2016 4:00-6:00 pm
*Worked with Shrey and Peter in order to develop a crawler for f6s.
11/14/2016 2:00-5:00 pm
*Began sorting the Seed-DB database in an Excel document.
11/15/2016 4:00-6:00 pm
*Conducted some Google searches in an attempt to find more accelerator databases. Began looking through Executive Orders searching for keywords.
11/16/2016 2:00-5:00 pm
*Completed searching through Executive Orders.
11/17/2016 4:00-6:00 pm
*Continued working on Google searches for state accelerator list. Looked through f6s for common words that can be used to distinguish accelerators once we have finalized the crawler.
11/21/2016 2:00-5:00 pm
*Randomly chose 10 accelerators from Excel list of accelerators on the RDP. Went through each website and listed the steps that I took in order to determine whether or not the website belonged to an accelerator. Will continue extracting cohort information tomorrow.
11/22/2016 4:00-6:00 pm
*Listed out all steps for extracting cohort information from the ten randomly chosen accelerators. Worked with Peter in order to build a tool that will search all of the HTMLs and attempt to identify each one as an accelerator as well as extract some basic information.
11/28/2016 2:00-5:00 pm
*Merged the F6S accelerator list with our other list, then posted it on the project page. Learned process for accelerator data extraction from Ed.
11/29/2016 4:00-6:00 pm
*Began process of collecting data from the 20 accelerators that I am responsible for.
11/30/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished 15/20.
12/1/2016 4:00-6:00 pm
*Continued collecting data from accelerators. Finished original 20, picked up a new set of 20.
12/2/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished next 20.
12/8/2016 1:00-3:00 pm
*Completed collecting data from accelerators for the semester.
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
[[Category:Work Log]]
