===Fall 2017===
<onlyinclude>
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]

9/11/2017 2:00-5:00 pm
*Spoke to Ed about the project going forward. Organized the current updated data for our project.
9/12/2017 3:00-5:00 pm
*Began going through the Cleaned Cohort Data Excel file and found a few problems with it. Will continue the cleaning process for the rest of the week.
9/13/2017 2:00-5:00 pm
*Sorted through Cleaned Cohort Data and finalized our List of Accelerators. We can begin the process of creating our PercentVC table.
9/14/2017 3:00-5:00 pm
*Completely finalized our dataset of accelerators and startups. Met with Michelle Passo to discuss objectives of the research for credit course.
9/18/2017 2:00-4:00 pm
*Talked with Peter about the LinkedIn crawler data. Went through VC page that Meghana sent me.
9/19/2017 3:00-5:00 pm
*Completed SDC pull of updated VC Data.
9/20/2017 2:00-5:00 pm
*Attempted several times to run the Matcher. Cleaned our pulled data.
9/21/2017 3:00-5:00 pm
*Came extremely close to running the Matcher correctly. Reviewed the final LinkedIn data from Peter.
9/25/2017 2:00-5:00 pm
*Finalized the matched file of accelerator companies with VC portfolio companies. Gave Ben the data on Georgia accelerators.
9/26/2017 3:00-5:00 pm
*Worked on finding the duplicates in our Matched file in order to have the most accurate data.
9/27/2017 2:00-5:00 pm
*Attempted to find a way to organize the duplicate matches.
9/28/2017 4:00-5:00 pm
*Continued running through matched data in order to organize it effectively.
10/2/2017 2:00-5:00 pm
*Talked to Ed about next steps for the project. Practiced accessing the crunchbase database on SQL. Brushed up on SQL code.
10/3/2017 3:00-5:00 pm
*Searched the database for crunchbase investment information.
10/4/2017 2:00-5:00 pm
*Pulled the funding rounds table from SQL and matched it with the companies that have received VC funding in order to gather round dates.
10/6/2017 3:00-5:00 pm
*Went through the matched data. Brainstormed ways to get the dates for cohort companies going through accelerators.
10/11/2017 2:00-3:30 pm
*Looked into using the WhoIs Parser in order to find when the companies went through their accelerators.
10/12/2017 3:00-5:00 pm
*Discovered that the Wayback Machine will not be a good option for finding when companies went through their accelerators. Created a list of VCCompanies and their earliest round date. Included a column for the date they went through their accelerators and will fill it in when we find a good method of finding this date.
10/16/2017 2:00-3:30 pm
*Continued working on sorting VCCompanies by their earliest round date.
10/17/2017 3:00-5:00 pm
*Worked with Ben to find a solution to our problem of data acquisition. Finalized earliest round date for VCCompanies.
10/18/2017 2:00-5:00 pm
*Updated our VC data with Ed's help in order to increase the accuracy and completion of our data.
10/19/2017 3:00-5:00 pm
*Organized all of our matched data and updated it in order to reflect the most recent SDC pull with Ed. Matched Crunchbase data with our cohort companies.
10/20/2017 2:00-3:30 pm
*Generated the new list of VCCompanies as well as their earliest round dates.
10/23/2017 2:00-3:30 pm
*Worked on sorting out the discrepancies in our matched data.
10/24/2017 3:00-5:00 pm
*Went through list of VCCompanies and began adding respective accelerators in order to proceed with VCPercentage table.
10/25/2017 2:00-5:00 pm
*Continued going through list of VCCompanies and adding accelerators.
10/26/2017 3:30-5:30 pm
*Continued going through list of VCCompanies and adding accelerators. Will have this completed on Monday.
10/30/2017 2:00-3:30 pm
*Finished adding all of the accelerators to the list of VCCompanies. Added a column indicating whether or not the company went through two or more accelerators.
10/31/2017 3:00-5:00 pm
*Began compiling data in the column for Date Company went through Accelerator.
11/1/2017 2:00-4:00 pm
*Finalized entering dates for Y Combinator cohort companies.
11/2/2017 4:00-5:30 pm
*Continued entering cohort company dates into Excel file.
11/6/2017 2:00-4:00 pm
*Continued entering cohort company dates into Excel file. Began compiling a list of keywords for demo day press releases.
11/7/2017 3:00-5:00 pm
*Finished coming up with keywords for demo day crawler. Sent the final list to Peter.
11/8/2017 2:00-3:30 pm
*Spoke to Ed and organized all of our current data.
11/9/2017 3:00-5:00 pm
*Created a new project page called Accelerator Data and listed all relevant files as well as descriptions.
11/14/2017 3:00-5:00 pm
*Looked up URLs and decided whether or not the website was relevant.
11/15/2017 2:00-5:00 pm
*Created SQL database entitled "acceleratordata" and began creating tables from folder of All Relevant Files.
11/16/2017 3:00-5:00 pm
*Continued to input tables into SQL database.
11/20/2017 2:00-5:00 pm
*Cleaned text files in order to import tables into SQL database.
11/27/2017 2:00-5:00 pm
*Worked with Peter to find and exclude irrelevant keywords on HTML pages. Began categorizing relevant demo day pages.
11/28/2017 3:00-5:00 pm
*Finished inputting tables of relevant files into SQL database.
11/29/2017 2:00-5:00 pm
*Went through accelerator HTML URLs. Spoke with Ed about going through HTMLs and classifying based on overall and specific relevance.
12/1/2017 3:00-5:00 pm
*Worked through accelerator links and classified pages based on whether or not they provided relevant information about startup timing.
12/4/2017 10:00-12:00 pm
*Continued running through demo day crawl URLs and scoring them based on relevance.
12/7/2017 1:00-4:30 pm
*Finalized scoring of demo day URLs for original crawl. Last day of work for this semester.
</onlyinclude>
===Spring 2017===
1/18/2017 1:00-5:00 pm
*Continued collecting data for accelerator project. Helped Catherine draft tweets for the McNair Center twitter account.
1/20/2017 1:00-3:00 pm
*Continued collecting data on accelerators. Attended McNair Center team meeting.
1/23/2017 1:00-5:00 pm
*Began combing through accelerator list, determining which accelerators are still missing data and documenting these in a TextPad file. Finished through #115.
1/25/2017 1:00-5:00 pm
*Continued looking through accelerator list.
1/27/2017 1:00-3:00 pm
*Continued going through accelerator list. Left off on #226 with Shrey.
1/20/2017 1:00-5:00 pm
*Continued going through accelerator list. Finished through #440.
2/1/2017 1:00-5:00 pm
*Finished going through the list of accelerators looking for incomplete files. Began completing the files that were not done.
2/3/2017 1:00-3:00 pm
*Continued working on completing accelerator files.
2/6/2017 1:00-4:30 pm
*Finished data set of accelerators. Began going through and making sure that all text files and cohort files are of the same format so Peter can easily pull the information. Left for 30 minutes for an interview from 2:30-3:00 pm.
2/8/2017 1:00-5:00 pm
*Finished formatting through #137. Spoke with Ed about project.
2/13/2017 1:00-5:00 pm
*Completed formatting for all accelerator text files.
2/15/2017 3:00-5:00 pm
*Made copy of the completed data set. Spoke to Ed about future steps to take for project including gathering founder data and obtaining the crunchbase api.
2/17/2017 1:00-3:00 pm
*Went through final Excel spreadsheet for cohort information. Still need to run the crawler one more time after the completion of the editing process. Found the application for the crunchbase api which will hopefully allow us to gain access.
2/20/2017 1:00-5:00 pm
*Filled out another application for Crunchbase research access; Found the first source for the incubator project on angel.co, will hopefully work with Peter to make a crawler similar to f6s.
2/22/2017 1:00-5:00 pm
*Pulled data from SDC for Ed and normalized it. Learned how to use SDC and the normalizer.
2/24/2017 1:00-3:00 pm
*Finished cleaning up the cohort data for Y-combinator on the Final Cohort Excel Spreadsheet.
2/27/2017 1:00-5:00 pm
*Continued cleaning up the cohort data in the Excel file. Finished Cohort Number and Year.
3/1/2017 2:00-5:00 pm
*Worked with Ben and Shrey to pull data from SDC for all VC funded companies and normalized it to put it in an Excel document.
3/3/2017 1:00-2:30 pm
*Worked with Ben to try and repeat down the VC data without it going too far.
3/6/2017 1:00-4:00 pm
*Worked with Shrey to finish cleaning the cohort data. It is ready to be run through the matcher with Ben.
3/8/2017 1:00-5:00 pm
*Matched the VC Data with the list of Cohort Companies and got one list of all cohort companies that have received VC funding.
3/10/2017 12:00-2:00 pm
*Put a write-up on the top of the Accelerator wiki page detailing where we are in the project currently as well as what data we have accumulated on the RDP.
3/20/2017 1:00-5:00 pm
*Began gathering the URLs of all accelerators in a TextPad file called Accelerator URLs. Participated in the SQL training session.
3/22/2017 1:00-5:00 pm
*Made tables in Terminal for Accelerator companies matched with VC companies and for Cohort Data.
3/27/2017 1:00-4:00 pm
*Compiled all URLs of accelerator into a TextPad file.
3/29/2017 1:00-5:00 pm
*Worked on the matched data with Ben. Next time I will run the RegEx code that will filter the URLs, and I will look through the duplicates where two different VC backed company names matched to one cohort company name.
3/31/2017 1:00-2:00 pm
*Ran the code for accelerator urls which are ready to be run through the wayback machine in order to get the start dates. Also began looking through vc backed company names.
4/3/2017 1:00-5:00 pm
*Continued looking through double matched VC companies. Learned more SQL from Ed.
4/5/2017 1:00-5:00 pm
*Made the final vc percentage table on terminal and for next time I will collect missing accelerator data.
4/7/2017 1:00-3:00 pm
*Began collecting cohort data for big accelerators that were missing from our list in order to add it to our final list of cohort companies.
4/10/2017 1:00-5:00 pm
*Finished gathering cohort company names for big accelerators that we were missing and put them into the Cleaned Cohort Companies Excel file. Ben is looking through Crunchbase data in order to possibly find more missing accelerators.
4/14/2017 1:00-4:00 pm
*Began working through "Crunchbase Potential Accelerators" textpad that may contain missing accelerators and wrote notes on the ones that I was able to go through. Need to finish this textpad before moving forward.
4/17/2017 1:00-4:00 pm
*Continued going through potential Crunchbase accelerators that we may have missed. Talked to Ed about getting a more comprehensive list from Excel file and by the end of the semester have the tables and data collected and done.
4/19/2017 1:00-4:00 pm
*Worked with Jeemin to generate an entire list of potential US accelerators from crunchbase. Worked to find a way to classify accelerators just based on their descriptions.
4/21/2017 1:00-4:00 pm
*Continued working through the list identifying accelerators that we do not have. Ramee and Juliette are now helping us gather cohort data for those missing accelerators.
4/24/2017 9:00-1:00 pm
*Updated Veeral on current state of project. Typed up a to-do list on the discussion wiki for Veeral. Got new cohort data on an accelerator and added it to Excel file.
5/3/2017 11:00-1:00 pm
*Talked to Ed and Anne about future report. Continued working through list of crunchbase potential accelerators. Last day of work for this semester.

===Fall 2016===
10/17/2016 2:00-5:00 pm
*Created personal wiki page as well as work log; Read about the research project to which I have been assigned; Wrote a short summary of what I believe it is and included some helpful links.
10/18/2016 4:00-6:00 pm
*Met with research partner Shrey who filled me in on where we are with the project; Began looking on websites of certain accelerators for how to determine their cohorts and listed these steps on the wiki.
10/19/2016 2:00-5:00 pm
*Finished looking on the remaining accelerator websites and wrote the steps on determining how to manually locate the cohorts.
10/20/2016 4:00-6:00 pm
*Met with Peter and Christy to discuss the possibility of creating a web crawler that will pull data from individual accelerator sites.
10/24/2016 2:00-5:00 pm
*Brainstormed with Albert and Julia about changes to the category name for SBDE. Spoke to Ed about full scope of accelerator project.
10/25/2016 4:00-6:00 pm
*Brainstormed with Shrey about different potential industry focuses within accelerators, as well as different variables to search for in terms of accelerators, startups, cohorts, etc.
10/26/2016 2:00-5:00 pm
*Began searching for more databases including lists of accelerators as well as some characteristics of those accelerators; Began searching for characteristics that identify accelerators on their websites.
10/27/2016 4:00-6:00 pm
*Continued searching for relevant lists of accelerators to include on our page. Added some links that have high potential under the tab (Obtained from List of Accelerators or various Google searches).
10/31/2016 2:00-5:00 pm
*Began constructing a list of variables that clearly distinguish an accelerator on its website. This is in an effort to allow a crawler to crawl through many Google searches and identify accelerators.
11/1/2016 4:00-6:00 pm
*Continued looking for variables that could identify accelerators from their websites. Searched through numerous different websites of accelerators obtained from our current databases.
11/2/2016 2:00-4:00 pm
*Continued combing through websites of numerous accelerators, well-known and other, in the hopes of finding identifying variables.
11/3/2016 4:00-6:00 pm
*Finalized my list of variables that could be used to distinguish the websites of accelerators. Slightly re-arranged our list of accelerator databases in order of relevance.
11/7/2016 2:00-5:00 pm
*Began compiling the list of all accelerators. Created a new TextPad document with information from a new database.
11/8/2016 4:00-6:00 pm
*Worked with Shrey and Ben in order to compile all of our accelerator databases into one long list on Textpad.
11/9/2016 2:00-5:00 pm
*Continued formulating a database for all accelerators and all of the available info given.
11/10/2016 4:00-6:00 pm
*Worked with Shrey and Peter in order to develop a crawler for f6s.
11/14/2016 2:00-5:00 pm
*Began sorting the Seed-DB database in an Excel document.
11/15/2016 4:00-6:00 pm
*Conducted some Google searches in an attempt to find more accelerator databases. Began looking through Executive Orders searching for keywords.
11/16/2016 2:00-5:00 pm
*Completed searching through Executive Orders.
11/17/2016 4:00-6:00 pm
*Continued working on Google searches for state accelerator list. Looked through f6s for common words that can be used to distinguish accelerators once we have finalized the crawler.
11/21/2016 2:00-5:00 pm
*Randomly chose 10 accelerators from Excel list of accelerators on the RDP. Went through each website and listed the steps that I took in order to determine whether or not the website belonged to an accelerator. Will continue extracting cohort information tomorrow.
11/22/2016 4:00-6:00 pm
*Listed out all steps for extracting cohort information from the ten randomly chosen accelerators. Worked with Peter in order to build a tool that will search all of the HTMLs and attempt to identify each one as an accelerator as well as extract some basic information.
11/28/2016 2:00-5:00 pm
*Merged the F6S accelerator list with our other list, then posted it on the project page. Learned process for accelerator data extraction from Ed.
11/29/2016 4:00-6:00 pm
*Began process of collecting data from the 20 accelerators that I am responsible for.
11/30/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished 15/20.
12/1/2016 4:00-6:00 pm
*Continued collecting data from accelerators. Finished original 20, picked up a new set of 20.
12/2/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished next 20.
12/8/2016 1:00-3:00 pm
*Completed collecting data from accelerators for the semester.

[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
[[Category:Work Log]]
