===Fall 2017===
<onlyinclude>
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]

9/11/2017 2:00-5:00 pm
*Spoke to Ed about the project going forward. Organized the current updated data for our project.
9/12/2017 3:00-5:00 pm
*Began going through the Cleaned Cohort Data Excel file and found a few problems with it. Will continue the cleaning process for the rest of the week.
9/13/2017 2:00-5:00 pm
*Sorted through Cleaned Cohort Data and finalized our List of Accelerators. We can begin the process of creating our PercentVC table.
9/14/2017 3:00-5:00 pm
*Completely finalized our dataset of accelerators and startups. Met with Michelle Passo to discuss objectives of the research for credit course.
9/18/2017 2:00-4:00 pm
*Talked with Peter about the LinkedIn crawler data. Went through VC page that Meghana sent me.
9/19/2017 3:00-5:00 pm
*Completed SDC pull of updated VC Data.
9/20/2017 2:00-5:00 pm
*Attempted several times to run the Matcher. Cleaned our pulled data.
9/21/2017 3:00-5:00 pm
*Came extremely close to running the Matcher correctly. Reviewed the final LinkedIn data from Peter.
9/25/2017 2:00-5:00 pm
*Finalized the matched file of accelerator companies with VC portfolio companies. Gave Ben the data on Georgia accelerators.
9/26/2017 3:00-5:00 pm
*Worked on finding the duplicates in our Matched file in order to have the most accurate data.
9/27/2017 2:00-5:00 pm
*Attempted to find a way to organize the duplicate matches.
9/28/2017 4:00-5:00 pm
*Continued running through matched data in order to organize it effectively.
10/2/2017 2:00-5:00 pm
*Talked to Ed about next steps for the project. Practiced accessing the crunchbase database on SQL. Brushed up on SQL code.
10/3/2017 3:00-5:00 pm
*Searched the database for crunchbase investment information.
10/4/2017 2:00-5:00 pm
*Pulled the funding rounds table from SQL and matched it with the companies that have received VC funding in order to gather round dates.
10/6/2017 3:00-5:00 pm
*Went through the matched data. Brainstormed ways to get the dates for cohort companies going through accelerators.
10/11/2017 2:00-3:30 pm
*Looked into using the WhoIs Parser in order to find when the companies went through their accelerators.
10/12/2017 3:00-5:00 pm
*Discovered that the Wayback Machine will not be a good option for finding when companies went through their accelerators. Created a list of VCCompanies and their earliest round date. Included a column for the date they went through their accelerators and will fill it in when we find a good method of finding this date.
10/16/2017 2:00-3:30 pm
*Continued working on sorting VCCompanies by their earliest round date.
10/17/2017 3:00-5:00 pm
*Worked with Ben to find a solution to our problem of data acquisition. Finalized earliest round date for VCCompanies.
10/18/2017 2:00-5:00 pm
*Updated our VC data with Ed's help in order to increase the accuracy and completion of our data.
10/19/2017 3:00-5:00 pm
*Organized all of our matched data and updated it in order to reflect the most recent SDC pull with Ed. Matched Crunchbase data with our cohort companies.
10/20/2017 2:00-3:30 pm
*Generated the new list of VCCompanies as well as their earliest round dates.
10/23/2017 2:00-3:30 pm
*Worked on sorting out the discrepancies in our matched data.
10/24/2017 3:00-5:00 pm
*Went through list of VCCompanies and began adding respective accelerators in order to proceed with VCPercentage table.
10/25/2017 2:00-5:00 pm
*Continued going through list of VCCompanies and adding accelerators.
10/26/2017 3:30-5:30 pm
*Continued going through list of VCCompanies and adding accelerators. Will have this completed on Monday.
10/30/2017 2:00-3:30 pm
*Finished adding all of the accelerators to the list of VCCompanies. Added a column indicating whether or not the company went through two or more accelerators.
10/31/2017 3:00-5:00 pm
*Began compiling data in the column for Date Company went through Accelerator.
11/1/2017 2:00-4:00 pm
*Finalized entering dates for Y Combinator cohort companies.
11/2/2017 4:00-5:30 pm
*Continued entering cohort company dates into Excel file.
11/6/2017 2:00-4:00 pm
*Continued entering cohort company dates into Excel file. Began compiling a list of keywords for demo day press releases.
11/7/2017 3:00-5:00 pm
*Finished coming up with keywords for demo day crawler. Sent the final list to Peter.
11/8/2017 2:00-3:30 pm
*Spoke to Ed and organized all of our current data.
11/9/2017 3:00-5:00 pm
*Created a new project page called Accelerator Data and listed all relevant files as well as descriptions.
11/14/2017 3:00-5:00 pm
*Looked up URLs and decided whether or not the website was relevant.
11/15/2017 2:00-5:00 pm
*Created SQL database entitled "acceleratordata" and began creating tables from folder of All Relevant Files.
11/16/2017 3:00-5:00 pm
*Continued to input tables into SQL database.
11/20/2017 2:00-5:00 pm
*Cleaned text files in order to import tables into SQL database.
11/27/2017 2:00-5:00 pm
*Worked with Peter to find and exclude irrelevant keywords on HTML pages. Began categorizing relevant demo day pages.
11/28/2017 3:00-5:00 pm
*Finished inputting tables of relevant files into SQL database.
11/29/2017 2:00-5:00 pm
*Went through accelerator HTML URLs. Spoke with Ed about going through HTMLs and classifying based on overall and specific relevance.
12/1/2017 3:00-5:00 pm
*Worked through links and classified pages based on whether or not they provided relevant information about startup timing.
12/4/2017 10:00-12:00 pm
*Continued running through demo day crawl URLs and scoring them based on relevance.
12/7/2017 1:00-4:30 pm
*Finalized scoring of demo day URLs for the original crawl. Last day of work for this semester.
</onlyinclude>
===Spring 2017===
1/18/2017 1:00-5:00 pm
*Continued collecting data for accelerator project. Helped Catherine draft tweets for the McNair Center Twitter account.
1/20/2017 1:00-3:00 pm
*Continued collecting data on accelerators. Attended McNair Center team meeting.
1/23/2017 1:00-5:00 pm
*Began combing through accelerator list, determining which accelerators are still missing data and documenting these in a TextPad file. Finished through #115.
1/25/2017 1:00-5:00 pm
*Continued looking through accelerator list.
1/27/2017 1:00-3:00 pm
*Continued going through accelerator list. Left off on #226 with Shrey.
1/20/2017 1:00-5:00 pm
*Continued going through accelerator list. Finished through #440.
2/1/2017 1:00-5:00 pm
*Finished going through the list of accelerators looking for incomplete files. Began completing the files that were not done.
2/3/2017 1:00-3:00 pm
*Continued working on completing accelerator files.
2/6/2017 1:00-4:30 pm
*Finished data set of accelerators. Began going through and making sure that all text files and cohort files are of the same format so Peter can easily pull the information. Left for 30 minutes for an interview from 2:30-3:00 pm.
2/8/2017 1:00-5:00 pm
*Finished formatting through #137. Spoke with Ed about project.
2/13/2017 1:00-5:00 pm
*Completed formatting for all accelerator text files.
2/15/2017 3:00-5:00 pm
*Made copy of the completed data set. Spoke to Ed about future steps to take for project including gathering founder data and obtaining the crunchbase api.
2/17/2017 1:00-3:00 pm
*Went through final Excel spreadsheet for cohort information. Still need to run the crawler one more time after the completion of the editing process. Found the application for the crunchbase api which will hopefully allow us to gain access.
2/20/2017 1:00-5:00 pm
*Filled out another application for Crunchbase research access; Found the first source for the incubator project on angel.co, will hopefully work with Peter to make a crawler similar to f6s.
2/22/2017 1:00-5:00 pm
*Pulled data from SDC for Ed and normalized it. Learned how to use SDC and the normalizer.
2/24/2017 1:00-3:00 pm
*Finished cleaning up the cohort data for Y-combinator on the Final Cohort Excel Spreadsheet.
2/27/2017 1:00-5:00 pm
*Continued cleaning up the cohort data in the Excel file. Finished Cohort Number and Year.
3/1/2017 2:00-5:00 pm
*Worked with Ben and Shrey to pull data from SDC for all VC funded companies and normalized it to put it in an Excel document.
3/3/2017 1:00-2:30 pm
*Worked with Ben to try and repeat down the VC data without it going too far.
3/6/2017 1:00-4:00 pm
*Worked with Shrey to finish cleaning the cohort data. It is ready to be run through the matcher with Ben.
3/8/2017 1:00-5:00 pm
*Matched the VC Data with the list of Cohort Companies and got one list of all cohort companies that have received VC funding.
3/10/2017 12:00-2:00 pm
*Put a write-up on the top of the Accelerator wiki page detailing where we are in the project currently as well as what data we have accumulated on the RDP.
3/20/2017 1:00-5:00 pm
*Began gathering the URLs of all accelerators in a TextPad file called Accelerator URLs. Participated in the SQL training session.
3/22/2017 1:00-5:00 pm
*Made tables in Terminal for Accelerator companies matched with VC companies and for Cohort Data.
3/27/2017 1:00-4:00 pm
*Compiled all URLs of accelerator into a TextPad file.
3/29/2017 1:00-5:00 pm
*Worked on the matched data with Ben. Next time I will run the RegEx code that will filter the URLs, and I will look through the duplicates where two different VC backed company names matched to one cohort company name.
3/31/2017 1:00-2:00 pm
*Ran the code for accelerator urls which are ready to be run through the wayback machine in order to get the start dates. Also began looking through vc backed company names.
4/3/2017 1:00-5:00 pm
*Continued looking through double matched VC companies. Learned more SQL from Ed.
4/5/2017 1:00-5:00 pm
*Made the final vc percentage table on terminal and for next time I will collect missing accelerator data.
4/7/2017 1:00-3:00 pm
*Began collecting cohort data for big accelerators that were missing from our list in order to add it to our final list of cohort companies.
4/10/2017 1:00-5:00 pm
*Finished gathering cohort company names for big accelerators that we were missing and put them into the Cleaned Cohort Companies Excel file. Ben is looking through Crunchbase data in order to possibly find more missing accelerators.
4/14/2017 1:00-4:00 pm
*Began working through "Crunchbase Potential Accelerators" textpad that may contain missing accelerators and wrote notes on the ones that I was able to go through. Need to finish this textpad before moving forward.
4/17/2017 1:00-4:00 pm
*Continued going through potential Crunchbase accelerators that we may have missed. Talked to Ed about getting a more comprehensive list from Excel file and by the end of the semester have the tables and data collected and done.
4/19/2017 1:00-4:00 pm
*Worked with Jeemin to generate an entire list of potential US accelerators from crunchbase. Worked to find a way to classify accelerators just based on their descriptions.
4/21/2017 1:00-4:00 pm
*Continued working through the list identifying accelerators that we do not have. Ramee and Juliette are now helping us gather cohort data for those missing accelerators.
4/24/2017 9:00-1:00 pm
*Updated Veeral on current state of project. Typed up a to-do list on the discussion wiki for Veeral. Got new cohort data on an accelerator and added it to Excel file.
5/3/2017 11:00-1:00 pm
*Talked to Ed and Anne about future report. Continued working through list of crunchbase potential accelerators. Last day of work for this semester.
===Fall 2016===
10/17/2016 2:00-5:00 pm
*Created personal wiki page as well as work log; Read about the research project to which I have been assigned; Wrote a short summary of what I believe it is and included some helpful links.
10/18/2016 4:00-6:00 pm
*Met with research partner Shrey who filled me in on where we are with the project; Began looking on websites of certain accelerators for how to determine their cohorts and listed these steps on the wiki.
10/19/2016 2:00-5:00 pm
*Finished looking on the remaining accelerator websites and wrote the steps on determining how to manually locate the cohorts.
10/20/2016 4:00-6:00 pm
*Met with Peter and Christy to discuss the possibility of creating a web crawler that will pull data from individual accelerator sites.
10/24/2016 2:00-5:00 pm
*Brainstormed with Albert and Julia about changes to the category name for SBDE. Spoke to Ed about full scope of accelerator project.
10/25/2016 4:00-6:00 pm
*Brainstormed with Shrey about different potential industry focuses within accelerators, as well as different variables to search for in terms of accelerators, startups, cohorts, etc.
10/26/2016 2:00-5:00 pm
*Began searching for more databases including lists of accelerators as well as some characteristics of those accelerators; Began searching for characteristics that identify accelerators on their websites.
10/27/2016 4:00-6:00 pm
*Continued searching for relevant lists of accelerators to include on our page. Added some links that have high potential under the tab (Obtained from List of Accelerators or various Google searches).
10/31/2016 2:00-5:00 pm
*Began constructing a list of variables that clearly distinguish an accelerator on its website. This is in an effort to allow a crawler to crawl through many Google searches and identify accelerators.
11/1/2016 4:00-6:00 pm
*Continued looking for variables that could identify accelerators from their websites. Searched through numerous different websites of accelerators obtained from our current databases.
11/2/2016 2:00-4:00 pm
*Continued combing through websites of numerous accelerators, well-known and other, in the hopes of finding identifying variables.
11/3/2016 4:00-6:00 pm
*Finalized my list of variables that could be used to distinguish the websites of accelerators. Slightly re-arranged our list of accelerator databases in order of relevance.
11/7/2016 2:00-5:00 pm
*Began compiling the list of all accelerators. Created a new TextPad document with information from a new database.
11/8/2016 4:00-6:00 pm
*Worked with Shrey and Ben in order to compile all of our accelerator databases into one long list on Textpad.
11/9/2016 2:00-5:00 pm
*Continued formulating a database for all accelerators and all of the available info given.
11/10/2016 4:00-6:00 pm
*Worked with Shrey and Peter in order to develop a crawler for f6s.
11/14/2016 2:00-5:00 pm
*Began sorting the Seed-DB database in an Excel document.
11/15/2016 4:00-6:00 pm
*Conducted some Google searches in an attempt to find more accelerator databases. Began looking through Executive Orders searching for keywords.
11/16/2016 2:00-5:00 pm
*Completed searching through Executive Orders.
11/17/2016 4:00-6:00 pm
*Continued working on Google searches for state accelerator list. Looked through f6s for common words that can be used to distinguish accelerators once we have finalized the crawler.
11/21/2016 2:00-5:00 pm
*Randomly chose 10 accelerators from Excel list of accelerators on the RDP. Went through each website and listed the steps that I took in order to determine whether or not the website belonged to an accelerator. Will continue extracting cohort information tomorrow.
11/22/2016 4:00-6:00 pm
*Listed out all steps for extracting cohort information from the ten randomly chosen accelerators. Worked with Peter in order to build a tool that will search all of the HTMLs and attempt to identify each one as an accelerator as well as extract some basic information.
11/28/2016 2:00-5:00 pm
*Merged the F6S accelerator list with our other list, then posted it on the project page. Learned process for accelerator data extraction from Ed.
11/29/2016 4:00-6:00 pm
*Began process of collecting data from the 20 accelerators that I am responsible for.
11/30/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished 15/20.
12/1/2016 4:00-6:00 pm
*Continued collecting data from accelerators. Finished original 20, picked up a new set of 20.
12/2/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished next 20.
12/8/2016 1:00-3:00 pm
*Completed collecting data from accelerators for the semester.
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
[[Category:Work Log]]
