===Fall 2017===
<onlyinclude>
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]

9/11/2017 2:00-5:00 pm
*Spoke to Ed about the project going forward. Organized the current updated data for our project.
9/12/2017 3:00-5:00 pm
*Began going through the Cleaned Cohort Data Excel file and found a few problems with it. Will continue the cleaning process for the rest of the week.
9/13/2017 2:00-5:00 pm
*Sorted through Cleaned Cohort Data and finalized our List of Accelerators. We can begin the process of creating our PercentVC table.
9/14/2017 3:00-5:00 pm
*Completely finalized our dataset of accelerators and startups. Met with Michelle Passo to discuss objectives of the research for credit course.
9/18/2017 2:00-4:00 pm
*Talked with Peter about the LinkedIn crawler data. Went through VC page that Meghana sent me.
9/19/2017 3:00-5:00 pm
*Completed SDC pull of updated VC Data.
9/20/2017 2:00-5:00 pm
*Attempted several times to run the Matcher. Cleaned our pulled data.
9/21/2017 3:00-5:00 pm
*Came extremely close to running the Matcher correctly. Reviewed the final LinkedIn data from Peter.
9/25/2017 2:00-5:00 pm
*Finalized the matched file of accelerator companies with VC portfolio companies. Gave Ben the data on Georgia accelerators.
9/26/2017 3:00-5:00 pm
*Worked on finding the duplicates in our Matched file in order to have the most accurate data.
9/27/2017 2:00-5:00 pm
*Attempted to find a way to organize the duplicate matches.
9/28/2017 4:00-5:00 pm
*Continued running through matched data in order to organize it effectively.
10/2/2017 2:00-5:00 pm
*Talked to Ed about next steps for the project. Practiced accessing the Crunchbase database on SQL. Brushed up on SQL code.
10/3/2017 3:00-5:00 pm
*Searched the database for Crunchbase investment information.
10/4/2017 2:00-5:00 pm
*Pulled the funding rounds table from SQL and matched it with the companies that have received VC funding in order to gather round dates.
10/6/2017 3:00-5:00 pm
*Went through the matched data. Brainstormed ways to get the dates for cohort companies going through accelerators.
10/11/2017 2:00-3:30 pm
*Looked into using the WhoIs Parser in order to find when the companies went through their accelerators.
10/12/2017 3:00-5:00 pm
*Discovered that the Wayback Machine will not be a good option for finding when companies went through their accelerators. Created a list of VCCompanies and their earliest round date. Included a column for the date they went through their accelerators and will fill it in when we find a good method of finding this date.
10/16/2017 2:00-3:30 pm
*Continued working on sorting VCCompanies by their earliest round date.
10/17/2017 3:00-5:00 pm
*Worked with Ben to find a solution to our problem of data acquisition. Finalized earliest round date for VCCompanies.
10/18/2017 2:00-5:00 pm
*Updated our VC data with Ed's help in order to increase the accuracy and completeness of our data.
10/19/2017 3:00-5:00 pm
*Organized all of our matched data and updated it in order to reflect the most recent SDC pull with Ed. Matched Crunchbase data with our cohort companies.
10/20/2017 2:00-3:30 pm
*Generated the new list of VCCompanies as well as their earliest round dates.
10/23/2017 2:00-3:30 pm
*Worked on sorting out discrepancies in our matched data.
10/24/2017 3:00-5:00 pm
*Went through list of VCCompanies and began adding respective accelerators in order to proceed with VCPercentage table.
10/25/2017 2:00-5:00 pm
*Continued going through list of VCCompanies and adding accelerators.
10/26/2017 3:30-5:30 pm
*Continued going through list of VCCompanies and adding accelerators. Will have this completed on Monday.
10/30/2017 2:00-3:30 pm
*Finished adding all of the accelerators to the list of VCCompanies. Added a column indicating whether or not the company went through two or more accelerators.
10/31/2017 3:00-5:00 pm
*Began compiling data in the column for Date Company went through Accelerator.
11/1/2017 2:00-4:00 pm
*Finalized entering dates for Y Combinator cohort companies.
11/2/2017 4:00-5:30 pm
*Continued entering cohort company dates into Excel file.
11/6/2017 2:00-4:00 pm
*Continued entering cohort company dates into Excel file. Began compiling a list of keywords for demo day press releases.
11/7/2017 3:00-5:00 pm
*Finished coming up with keywords for demo day crawler. Sent the final list to Peter.
11/8/2017 2:00-3:30 pm
*Spoke to Ed and organized all of our current data.
11/9/2017 3:00-5:00 pm
*Created a new project page called Accelerator Data and listed all relevant files as well as descriptions.
11/14/2017 3:00-5:00 pm
*Looked up URLs and decided whether or not the website was relevant.
11/15/2017 2:00-5:00 pm
*Created SQL database entitled "acceleratordata" and began creating tables from folder of All Relevant Files.
11/16/2017 3:00-5:00 pm
*Continued to input tables into SQL database.
11/20/2017 2:00-5:00 pm
*Cleaned text files in order to import tables into SQL database.
11/27/2017 2:00-5:00 pm
*Worked with Peter to find and exclude irrelevant keywords on the HTML pages. Began categorizing relevant demo day pages.
11/28/2017 3:00-5:00 pm
*Finished inputting tables of relevant files into SQL database.
11/29/2017 2:00-5:00 pm
*Went through accelerator HTML URLs. Spoke with Ed about going through HTMLs and classifying based on overall and specific relevance.
12/1/2017 3:00-5:00 pm
*Worked through accelerator links and classified pages based on whether or not they provided relevant information about startup timing.
12/4/2017 10:00-12:00 pm
*Continued running through demo day crawl URLs and scoring them based on relevance.
12/7/2017 1:00-4:30 pm
*Finalized scoring of demo day URLs for the original crawl. Last day of work for this semester.
</onlyinclude>
===Spring 2017===
1/18/2017 1:00-5:00 pm
*Continued collecting data for accelerator project. Helped Catherine draft tweets for the McNair Center Twitter account.
1/20/2017 1:00-3:00 pm
*Continued collecting data on accelerators. Attended McNair Center team meeting.
1/23/2017 1:00-5:00 pm
*Began combing through accelerator list, determining which accelerators are still missing data and documenting these in a TextPad file. Finished through #115.
1/25/2017 1:00-5:00 pm
*Continued looking through accelerator list.
1/27/2017 1:00-3:00 pm
*Continued going through accelerator list. Left off on #226 with Shrey.
1/20/2017 1:00-5:00 pm
*Continued going through accelerator list. Finished through #440.
2/1/2017 1:00-5:00 pm
*Finished going through the list of accelerators looking for incomplete files. Began completing the files that were not done.
2/3/2017 1:00-3:00 pm
*Continued working on completing accelerator files.
2/6/2017 1:00-4:30 pm
*Finished data set of accelerators. Began going through and making sure that all text files and cohort files are of the same format so Peter can easily pull the information. Left for 30 minutes for an interview from 2:30-3:00 pm.
2/8/2017 1:00-5:00 pm
*Finished formatting through #137. Spoke with Ed about project.
2/13/2017 1:00-5:00 pm
*Completed formatting for all accelerator text files.
2/15/2017 3:00-5:00 pm
*Made copy of the completed data set. Spoke to Ed about future steps to take for project including gathering founder data and obtaining the Crunchbase API.
2/17/2017 1:00-3:00 pm
*Went through final Excel spreadsheet for cohort information. Still need to run the crawler one more time after the completion of the editing process. Found the application for the Crunchbase API which will hopefully allow us to gain access.
2/20/2017 1:00-5:00 pm
*Filled out another application for Crunchbase research access; Found the first source for the incubator project on angel.co, will hopefully work with Peter to make a crawler similar to F6S.
2/22/2017 1:00-5:00 pm
*Pulled data from SDC for Ed and normalized it. Learned how to use SDC and the normalizer.
2/24/2017 1:00-3:00 pm
*Finished cleaning up the cohort data for Y-combinator on the Final Cohort Excel Spreadsheet.
2/27/2017 1:00-5:00 pm
*Continued cleaning up the cohort data in the Excel file. Finished Cohort Number and Year.
3/1/2017 2:00-5:00 pm
*Worked with Ben and Shrey to pull data from SDC for all VC funded companies and normalized it to put it in an Excel document.
3/3/2017 1:00-2:30 pm
*Worked with Ben to try and repeat down the VC data without it going too far.
3/6/2017 1:00-4:00 pm
*Worked with Shrey to finish cleaning the cohort data. It is ready to be run through the matcher with Ben.
3/8/2017 1:00-5:00 pm
*Matched the VC Data with the list of Cohort Companies and got one list of all cohort companies that have received VC funding.
3/10/2017 12:00-2:00 pm
*Put a write-up on the top of the Accelerator wiki page detailing where we are in the project currently as well as what data we have accumulated on the RDP.
3/20/2017 1:00-5:00 pm
*Began gathering the URLs of all accelerators in a TextPad file called Accelerator URLs. Participated in the SQL training session.
3/22/2017 1:00-5:00 pm
*Made tables in Terminal for Accelerator companies matched with VC companies and for Cohort Data.
3/27/2017 1:00-4:00 pm
*Compiled all URLs of accelerators into a TextPad file.
3/29/2017 1:00-5:00 pm
*Worked on the matched data with Ben. Next time I will run the RegEx code that will filter the URLs, and I will look through the duplicates where two different VC backed company names matched to one cohort company name.
3/31/2017 1:00-2:00 pm
*Ran the code for accelerator URLs, which are ready to be run through the Wayback Machine in order to get the start dates. Also began looking through VC backed company names.
4/3/2017 1:00-5:00 pm
*Continued looking through double matched VC companies. Learned more SQL from Ed.
4/5/2017 1:00-5:00 pm
*Made the final VC percentage table on terminal and for next time I will collect missing accelerator data.
4/7/2017 1:00-3:00 pm
*Began collecting cohort data for big accelerators that were missing from our list in order to add it to our final list of cohort companies.
4/10/2017 1:00-5:00 pm
*Finished gathering cohort company names for big accelerators that we were missing and put them into the Cleaned Cohort Companies Excel file. Ben is looking through Crunchbase data in order to possibly find more missing accelerators.
4/14/2017 1:00-4:00 pm
*Began working through "Crunchbase Potential Accelerators" TextPad file that may contain missing accelerators and wrote notes on the ones that I was able to go through. Need to finish this TextPad file before moving forward.
4/17/2017 1:00-4:00 pm
*Continued going through potential Crunchbase accelerators that we may have missed. Talked to Ed about getting a more comprehensive list from Excel file and by the end of the semester have the tables and data collected and done.
4/19/2017 1:00-4:00 pm
*Worked with Jeemin to generate an entire list of potential US accelerators from Crunchbase. Worked to find a way to classify accelerators just based on their descriptions.
4/21/2017 1:00-4:00 pm
*Continued working through the list identifying accelerators that we do not have. Ramee and Juliette are now helping us gather cohort data for those missing accelerators.
4/24/2017 9:00-1:00 pm
*Updated Veeral on current state of project. Typed up a to-do list on the discussion wiki for Veeral. Got new cohort data on an accelerator and added it to Excel file.
5/3/2017 11:00-1:00 pm
*Talked to Ed and Anne about future report. Continued working through list of Crunchbase potential accelerators. Last day of work for this semester.
===Fall 2016===
10/17/2016 2:00-5:00 pm
*Created personal wiki page as well as work log; Read about the research project to which I have been assigned; Wrote a short summary of what I believe it is and included some helpful links.
10/18/2016 4:00-6:00 pm
*Met with research partner Shrey who filled me in on where we are with the project; Began looking on websites of certain accelerators for how to determine their cohorts and listed these steps on the wiki.
10/19/2016 2:00-5:00 pm
*Finished looking on the remaining accelerator websites and wrote the steps on determining how to manually locate the cohorts.
10/20/2016 4:00-6:00 pm
*Met with Peter and Christy to discuss the possibility of creating a web crawler that will pull data from individual accelerator sites.
10/24/2016 2:00-5:00 pm
*Brainstormed with Albert and Julia about changes to the category name for SBDE. Spoke to Ed about full scope of accelerator project.
10/25/2016 4:00-6:00 pm
*Brainstormed with Shrey about different potential industry focuses within accelerators, as well as different variables to search for in terms of accelerators, startups, cohorts, etc.
10/26/2016 2:00-5:00 pm
*Began searching for more databases including lists of accelerators as well as some characteristics of those accelerators; Began searching for characteristics that identify accelerators on their websites.
10/27/2016 4:00-6:00 pm
*Continued searching for relevant lists of accelerators to include on our page. Added some links that have high potential under the tab (Obtained from List of Accelerators or various Google searches).
10/31/2016 2:00-5:00 pm
*Began constructing a list of variables that clearly distinguish an accelerator on its website. This is in an effort to allow a crawler to crawl through many Google searches and identify accelerators.
11/1/2016 4:00-6:00 pm
*Continued looking for variables that could identify accelerators from their websites. Searched through numerous different websites of accelerators obtained from our current databases.
11/2/2016 2:00-4:00 pm
*Continued combing through websites of numerous accelerators, well-known and other, in the hopes of finding identifying variables.
11/3/2016 4:00-6:00 pm
*Finalized my list of variables that could be used to distinguish the websites of accelerators. Slightly re-arranged our list of accelerator databases in order of relevance.
11/7/2016 2:00-5:00 pm
*Began compiling the list of all accelerators. Created a new TextPad document with information from a new database.
11/8/2016 4:00-6:00 pm
*Worked with Shrey and Ben in order to compile all of our accelerator databases into one long list on TextPad.
11/9/2016 2:00-5:00 pm
*Continued formulating a database for all accelerators and all of the available info given.
11/10/2016 4:00-6:00 pm
*Worked with Shrey and Peter in order to develop a crawler for F6S.
11/14/2016 2:00-5:00 pm
*Began sorting the Seed-DB database in an Excel document.
11/15/2016 4:00-6:00 pm
*Conducted some Google searches in an attempt to find more accelerator databases. Began looking through Executive Orders searching for keywords.
11/16/2016 2:00-5:00 pm
*Completed searching through Executive Orders.
11/17/2016 4:00-6:00 pm
*Continued working on Google searches for state accelerator list. Looked through F6S for common words that can be used to distinguish accelerators once we have finalized the crawler.
11/21/2016 2:00-5:00 pm
*Randomly chose 10 accelerators from Excel list of accelerators on the RDP. Went through each website and listed the steps that I took in order to determine whether or not the website belonged to an accelerator. Will continue extracting cohort information tomorrow.
11/22/2016 4:00-6:00 pm
*Listed out all steps for extracting cohort information from the ten randomly chosen accelerators. Worked with Peter in order to build a tool that will search all of the HTMLs and attempt to identify each one as an accelerator as well as extract some basic information.
11/28/2016 2:00-5:00 pm
*Merged the F6S accelerator list with our other list, then posted it on the project page. Learned process for accelerator data extraction from Ed.
11/29/2016 4:00-6:00 pm
*Began process of collecting data from the 20 accelerators that I am responsible for.
11/30/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished 15/20.
12/1/2016 4:00-6:00 pm
*Continued collecting data from accelerators. Finished original 20, picked up a new set of 20.
12/2/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished next 20.
12/8/2016 1:00-3:00 pm
*Completed collecting data from accelerators for the semester.
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
[[Category:Work Log]]
