===Fall 2017===
<onlyinclude>
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]

9/11/2017 2:00-5:00 pm
*Spoke to Ed about the project going forward. Organized the current updated data for our project.
9/12/2017 3:00-5:00 pm
*Began going through the Cleaned Cohort Data Excel file and found a few problems with it. Will continue the cleaning process for the rest of the week.
9/13/2017 2:00-5:00 pm
*Sorted through Cleaned Cohort Data and finalized our List of Accelerators. We can begin the process of creating our PercentVC table.
9/14/2017 3:00-5:00 pm
*Completely finalized our dataset of accelerators and startups. Met with Michelle Passo to discuss objectives of the research for credit course.
9/18/2017 2:00-4:00 pm
*Talked with Peter about the LinkedIn crawler data. Went through VC page that Meghana sent me.
9/19/2017 3:00-5:00 pm
*Completed SDC pull of updated VC Data.
9/20/2017 2:00-5:00 pm
*Attempted several times to run the Matcher. Cleaned our pulled data.
9/21/2017 3:00-5:00 pm
*Came extremely close to running the Matcher correctly. Reviewed the final LinkedIn data from Peter.
9/25/2017 2:00-5:00 pm
*Finalized the matched file of accelerator companies with VC portfolio companies. Gave Ben the data on Georgia accelerators.
9/26/2017 3:00-5:00 pm
*Worked on finding the duplicates in our Matched file in order to have the most accurate data.
9/27/2017 2:00-5:00 pm
*Attempted to find a way to organize the duplicate matches.
9/28/2017 4:00-5:00 pm
*Continued running through matched data in order to organize it effectively.
10/2/2017 2:00-5:00 pm
*Talked to Ed about next steps for the project. Practiced accessing the Crunchbase database on SQL. Brushed up on SQL code.
10/3/2017 3:00-5:00 pm
*Searched the database for Crunchbase investment information.
10/4/2017 2:00-5:00 pm
*Pulled the funding rounds table from SQL and matched it with the companies that have received VC funding in order to gather round dates.
10/6/2017 3:00-5:00 pm
*Went through the matched data. Brainstormed ways to get the dates for cohort companies going through accelerators.
10/11/2017 2:00-3:30 pm
*Looked into using the WhoIs Parser in order to find when the companies went through their accelerators.
10/12/2017 3:00-5:00 pm
*Discovered that the Wayback Machine will not be a good option for finding when companies went through their accelerators. Created a list of VCCompanies and their earliest round date. Included a column for the date they went through their accelerators and will fill it in when we find a good method of finding this date.
10/16/2017 2:00-3:30 pm
*Continued working on sorting VCCompanies by their earliest round date.
10/17/2017 3:00-5:00 pm
*Worked with Ben to find a solution to our problem of data acquisition. Finalized earliest round date for VCCompanies.
10/18/2017 2:00-5:00 pm
*Updated our VC data with Ed's help in order to increase the accuracy and completion of our data.
10/19/2017 3:00-5:00 pm
*Organized all of our matched data and updated it in order to reflect the most recent SDC pull with Ed. Matched Crunchbase data with our cohort companies.
10/20/2017 2:00-3:30 pm
*Generated new list of VCCompanies as well as their earliest round dates.
10/23/2017 2:00-3:30 pm
*Worked on sorting out the discrepancies in our matched data.
10/24/2017 3:00-5:00 pm
*Went through list of VCCompanies and began adding respective accelerators in order to proceed with VCPercentage table.
10/25/2017 2:00-5:00 pm
*Continued going through list of VCCompanies and adding accelerators.
10/26/2017 3:30-5:30 pm
*Continued going through list of VCCompanies and adding accelerators. Will have this completed on Monday.
10/30/2017 2:00-3:30 pm
*Finished adding all of the remaining accelerators to the list of VCCompanies. Added a column indicating whether or not the company went through two or more accelerators.
10/31/2017 3:00-5:00 pm
*Began compiling data in the column for Date Company went through Accelerator.
11/1/2017 2:00-4:00 pm
*Finalized entering dates for Y Combinator cohort companies.
11/2/2017 4:00-5:30 pm
*Continued entering cohort company dates into Excel file.
11/6/2017 2:00-4:00 pm
*Continued entering cohort company dates into Excel file. Began compiling a list of keywords for demo day press releases.
11/7/2017 3:00-5:00 pm
*Finished coming up with keywords for demo day crawler. Sent the final list to Peter.
11/8/2017 2:00-3:30 pm
*Spoke to Ed and organized all of our current data.
11/9/2017 3:00-5:00 pm
*Created a new project page called Accelerator Data and listed all relevant files as well as descriptions.
11/14/2017 3:00-5:00 pm
*Looked up URLs and decided whether or not the website was relevant.
11/15/2017 2:00-5:00 pm
*Created SQL database entitled "acceleratordata" and began creating tables from folder of All Relevant Files.
11/16/2017 3:00-5:00 pm
*Continued to input tables into SQL database.
11/20/2017 2:00-5:00 pm
*Cleaned text files in order to import tables into SQL database.
11/27/2017 2:00-5:00 pm
*Worked with Peter to find and exclude irrelevant keywords on HTML pages. Began categorizing relevant demo day pages.
11/28/2017 3:00-5:00 pm
*Finished inputting tables of relevant files into SQL database.
11/29/2017 2:00-5:00 pm
*Went through accelerator HTML URLs. Spoke with Ed about going through HTMLs and classifying based on overall and specific relevance.
12/1/2017 3:00-5:00 pm
*Worked through accelerator links and classified pages based on whether or not they provided relevant information about startup timing.
12/4/2017 10:00-12:00 pm
*Continued running through demo day crawl URLs and scoring them based on relevance.
12/7/2017 1:00-4:30 pm
*Finalized scoring of demo day URLs for the original crawl. Last day of work for this semester.
</onlyinclude>
===Spring 2017===
1/18/2017 1:00-5:00 pm
*Continued collecting data for accelerator project. Helped Catherine draft tweets for the McNair Center twitter account.
1/20/2017 1:00-3:00 pm
*Continued collecting data on accelerators. Attended McNair Center team meeting.
1/23/2017 1:00-5:00 pm
*Began combing through accelerator list, determining which accelerators are still missing data and documenting these in a TextPad file. Finished through #115.
1/25/2017 1:00-5:00 pm
*Continued looking through accelerator list.
1/27/2017 1:00-3:00 pm
*Continued going through accelerator list. Left off on #226 with Shrey.
1/20/2017 1:00-5:00 pm
*Continued going through accelerator list. Finished through #440.
2/1/2017 1:00-5:00 pm
*Finished going through the list of accelerators looking for incomplete files. Began completing the files that were not done.
2/3/2017 1:00-3:00 pm
*Continued working on completing accelerator files.
2/6/2017 1:00-4:30 pm
*Finished data set of accelerators. Began going through and making sure that all text files and cohort files are of the same format so Peter can easily pull the information. Left for 30 minutes for an interview from 2:30-3:00 pm.
2/8/2017 1:00-5:00 pm
*Finished formatting through #137. Spoke with Ed about project.
2/13/2017 1:00-5:00 pm
*Completed formatting for all accelerator text files.
2/15/2017 3:00-5:00 pm
*Made copy of the completed data set. Spoke to Ed about future steps to take for project including gathering founder data and obtaining the Crunchbase API.
2/17/2017 1:00-3:00 pm
*Went through final Excel spreadsheet for cohort information. Still need to run the crawler one more time after the completion of the editing process. Found the application for the Crunchbase API which will hopefully allow us to gain access.
2/20/2017 1:00-5:00 pm
*Filled out another application for Crunchbase research access; Found the first source for the incubator project on angel.co, will hopefully work with Peter to make a crawler similar to f6s.
2/22/2017 1:00-5:00 pm
*Pulled data from SDC for Ed and normalized it. Learned how to use SDC and the normalizer.
2/24/2017 1:00-3:00 pm
*Finished cleaning up the cohort data for Y-combinator on the Final Cohort Excel Spreadsheet.
2/27/2017 1:00-5:00 pm
*Continued cleaning up the cohort data in the Excel file. Finished Cohort Number and Year.
3/1/2017 2:00-5:00 pm
*Worked with Ben and Shrey to pull data from SDC for all VC funded companies and normalized it to put it in an Excel document.
3/3/2017 1:00-2:30 pm
*Worked with Ben to try and repeat down the VC data without it going too far.
3/6/2017 1:00-4:00 pm
*Worked with Shrey to finish cleaning the cohort data. It is ready to be run through the matcher with Ben.
3/8/2017 1:00-5:00 pm
*Matched the VC Data with the list of Cohort Companies and got one list of all cohort companies that have received VC funding.
3/10/2017 12:00-2:00 pm
*Put a write-up on the top of the Accelerator wiki page detailing where we are in the project currently as well as what data we have accumulated on the RDP.
3/20/2017 1:00-5:00 pm
*Began gathering the URLs of all accelerators in a TextPad file called Accelerator URLs. Participated in the SQL training session.
3/22/2017 1:00-5:00 pm
*Made tables in Terminal for Accelerator companies matched with VC companies and for Cohort Data.
3/27/2017 1:00-4:00 pm
*Compiled all URLs of accelerators into a TextPad file.
3/29/2017 1:00-5:00 pm
*Worked on the matched data with Ben. Next time I will run the RegEx code that will filter the URLs, and I will look through the duplicates where two different VC-backed company names matched to one cohort company name.
3/31/2017 1:00-2:00 pm
*Ran the code for accelerator URLs, which are ready to be run through the Wayback Machine in order to get the start dates. Also began looking through VC-backed company names.
4/3/2017 1:00-5:00 pm
*Continued looking through double matched VC companies. Learned more SQL from Ed.
4/5/2017 1:00-5:00 pm
*Made the final VC percentage table on terminal and for next time I will collect missing accelerator data.
4/7/2017 1:00-3:00 pm
*Began collecting cohort data for big accelerators that were missing from our list in order to add it to our final list of cohort companies.
4/10/2017 1:00-5:00 pm
*Finished gathering cohort company names for big accelerators that we were missing and put them into the Cleaned Cohort Companies Excel file. Ben is looking through Crunchbase data in order to possibly find more missing accelerators.
4/14/2017 1:00-4:00 pm
*Began working through "Crunchbase Potential Accelerators" textpad that may contain missing accelerators and wrote notes on the ones that I was able to go through. Need to finish this textpad before moving forward.
4/17/2017 1:00-4:00 pm
*Continued going through potential Crunchbase accelerators that we may have missed. Talked to Ed about getting a more comprehensive list from Excel file and by the end of the semester have the tables and data collected and done.
4/19/2017 1:00-4:00 pm
*Worked with Jeemin to generate an entire list of potential US accelerators from Crunchbase. Worked to find a way to classify accelerators just based on their descriptions.
4/21/2017 1:00-4:00 pm
*Continued working through the list identifying accelerators that we do not have. Ramee and Juliette are now helping us gather cohort data for those missing accelerators.
4/24/2017 9:00-1:00 pm
*Updated Veeral on current state of project. Typed up a to-do list on the discussion wiki for Veeral. Got new cohort data on an accelerator and added it to Excel file.
5/3/2017 11:00-1:00 pm
*Talked to Ed and Anne about future report. Continued working through list of Crunchbase potential accelerators. Last day of work for this semester.
===Fall 2016===
10/17/2016 2:00-5:00 pm
*Created personal wiki page as well as work log; Read about the research project to which I have been assigned; Wrote a short summary of what I believe it is and included some helpful links.
10/18/2016 4:00-6:00 pm
*Met with research partner Shrey who filled me in on where we are with the project; Began looking on websites of certain accelerators for how to determine their cohorts and listed these steps on the wiki.
10/19/2016 2:00-5:00 pm
*Finished looking on the remaining accelerator websites and wrote the steps on determining how to manually locate the cohorts.
10/20/2016 4:00-6:00 pm
*Met with Peter and Christy to discuss the possibility of creating a web crawler that will pull data from individual accelerator sites.
10/24/2016 2:00-5:00 pm
*Brainstormed with Albert and Julia about changes to the category name for SBDE. Spoke to Ed about full scope of accelerator project.
10/25/2016 4:00-6:00 pm
*Brainstormed with Shrey about different potential industry focuses within accelerators, as well as different variables to search for in terms of accelerators, startups, cohorts, etc.
10/26/2016 2:00-5:00 pm
*Began searching for more databases including lists of accelerators as well as some characteristics of those accelerators; Began searching for characteristics that identify accelerators on their websites.
10/27/2016 4:00-6:00 pm
*Continued searching for relevant lists of accelerators to include on our page. Added some links that have high potential under the tab (Obtained from List of Accelerators or various Google searches).
10/31/2016 2:00-5:00 pm
*Began constructing a list of variables that clearly distinguish an accelerator on its website. This is in an effort to allow a crawler to crawl through many Google searches and identify accelerators.
11/1/2016 4:00-6:00 pm
*Continued looking for variables that could identify accelerators from their websites. Searched through numerous different websites of accelerators obtained from our current databases.
11/2/2016 2:00-4:00 pm
*Continued combing through websites of numerous accelerators, well-known and other, in the hopes of finding identifying variables.
11/3/2016 4:00-6:00 pm
*Finalized my list of variables that could be used to distinguish the websites of accelerators. Slightly re-arranged our list of accelerator databases in order of relevance.
11/7/2016 2:00-5:00 pm
*Began compiling the list of all accelerators. Created a new TextPad document with information from a new database.
11/8/2016 4:00-6:00 pm
*Worked with Shrey and Ben in order to compile all of our accelerator databases into one long list on Textpad.
11/9/2016 2:00-5:00 pm
*Continued formulating a database for all accelerators and all of the available info given.
11/10/2016 4:00-6:00 pm
*Worked with Shrey and Peter in order to develop a crawler for f6s.
11/14/2016 2:00-5:00 pm
*Began sorting the Seed-DB database in an Excel document.
11/15/2016 4:00-6:00 pm
*Conducted some Google searches in an attempt to find more accelerator databases. Began looking through Executive Orders searching for keywords.
11/16/2016 2:00-5:00 pm
*Completed searching through Executive Orders.
11/17/2016 4:00-6:00 pm
*Continued working on Google searches for state accelerator list. Looked through f6s for common words that can be used to distinguish accelerators once we have finalized the crawler.
11/21/2016 2:00-5:00 pm
*Randomly chose 10 accelerators from Excel list of accelerators on the RDP. Went through each website and listed the steps that I took in order to determine whether or not the website belonged to an accelerator. Will continue extracting cohort information tomorrow.
11/22/2016 4:00-6:00 pm
*Listed out all steps for extracting cohort information from the ten randomly chosen accelerators. Worked with Peter in order to build a tool that will search all of the HTMLs and attempt to identify each one as an accelerator as well as extract some basic information.
11/28/2016 2:00-5:00 pm
*Merged the F6S accelerator list with our other list, then posted it on the project page. Learned process for accelerator data extraction from Ed.
11/29/2016 4:00-6:00 pm
*Began process of collecting data from the 20 accelerators that I am responsible for.
11/30/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished 15/20.
12/1/2016 4:00-6:00 pm
*Continued collecting data from accelerators. Finished original 20, picked up a new set of 20.
12/2/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished next 20.
12/8/2016 1:00-3:00 pm
*Completed collecting data from accelerators for the semester.
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
[[Category:Work Log]]
