===Fall 2017===
<onlyinclude>
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]

9/11/2017 2:00-5:00 pm
*Spoke to Ed about the project going forward. Organized the current updated data for our project.

9/12/2017 3:00-5:00 pm
*Began going through the Cleaned Cohort Data Excel file and found a few problems with it. Will continue the cleaning process for the rest of the week.

9/13/2017 2:00-5:00 pm
*Sorted through Cleaned Cohort Data and finalized our List of Accelerators. We can begin the process of creating our PercentVC table.

9/14/2017 3:00-5:00 pm
*Completely finalized our dataset of accelerators and startups. Met with Michelle Passo to discuss objectives of the research for credit course.

9/18/2017 2:00-4:00 pm
*Talked with Peter about the LinkedIn crawler data. Went through VC page that Meghana sent me.

9/19/2017 3:00-5:00 pm
*Completed SDC pull of updated VC Data.

9/20/2017 2:00-5:00 pm
*Attempted several times to run the Matcher. Cleaned our pulled data.

9/21/2017 3:00-5:00 pm
*Came extremely close to running the Matcher correctly. Reviewed the final LinkedIn data from Peter.

9/25/2017 2:00-5:00 pm
*Finalized the matched file of accelerator companies with VC portfolio companies. Gave Ben the data on Georgia accelerators.

9/26/2017 3:00-5:00 pm
*Worked on finding the duplicates in our Matched file in order to have the most accurate data.

9/27/2017 2:00-5:00 pm
*Attempted to find a way to organize the duplicate matches.

9/28/2017 4:00-5:00 pm
*Continued running through matched data in order to organize it effectively.

10/2/2017 2:00-5:00 pm
*Talked to Ed about next steps for the project. Practiced accessing the Crunchbase database on SQL. Brushed up on SQL code.

10/3/2017 3:00-5:00 pm
*Searched the database for Crunchbase investment information.

10/4/2017 2:00-5:00 pm
*Pulled the funding rounds table from SQL and matched it with the companies that have received VC funding in order to gather round dates.

10/6/2017 3:00-5:00 pm
*Went through the matched data. Brainstormed ways to get the dates for cohort companies going through accelerators.

10/11/2017 2:00-3:30 pm
*Looked into using the WhoIs Parser in order to find when the companies went through their accelerators.

10/12/2017 3:00-5:00 pm
*Discovered that the Wayback Machine will not be a good option for finding when companies went through their accelerators. Created a list of VCCompanies and their earliest round date. Included a column for the date they went through their accelerators and will fill it in when we find a good method of finding this date.

10/16/2017 2:00-3:30 pm
*Continued working on sorting VCCompanies by their earliest round date.

10/17/2017 3:00-5:00 pm
*Worked with Ben to find a solution to our problem of data acquisition. Finalized earliest round date for VCCompanies.

10/18/2017 2:00-5:00 pm
*Updated our VC data with Ed's help in order to increase the accuracy and completion of our data.

10/19/2017 3:00-5:00 pm
*Organized all of our matched data and updated it in order to reflect the most recent SDC pull with Ed. Matched Crunchbase data with our cohort companies.

10/20/2017 2:00-3:30 pm
*Generated the new list of VCCompanies as well as their earliest round dates.

10/23/2017 2:00-3:30 pm
*Worked on sorting out the discrepancies in our matched data.

10/24/2017 3:00-5:00 pm
*Went through list of VCCompanies and began adding respective accelerators in order to proceed with VCPercentage table.

10/25/2017 2:00-5:00 pm
*Continued going through list of VCCompanies and adding accelerators.

10/26/2017 3:30-5:30 pm
*Continued going through list of VCCompanies and adding accelerators. Will have this completed on Monday.

10/30/2017 2:00-3:30 pm
*Finished adding all of the accelerators to the list of VCCompanies. Added a column indicating whether or not the company went through two or more accelerators.

10/31/2017 3:00-5:00 pm
*Began compiling data in the column for Date Company went through Accelerator.

11/1/2017 2:00-4:00 pm
*Finalized entering dates for Y Combinator cohort companies.

11/2/2017 4:00-5:30 pm
*Continued entering cohort company dates into Excel file.

11/6/2017 2:00-4:00 pm
*Continued entering cohort company dates into Excel file. Began compiling a list of keywords for demo day press releases.

11/7/2017 4:00-5:00 pm
*Finished coming up with keywords for demo day crawler. Sent the final list to Peter.

11/8/2017 2:00-3:30 pm
*Spoke to Ed and organized all of our current data.

11/9/2017 3:00-5:00 pm
*Created a new project page called Accelerator Data and listed all of our relevant files as well as descriptions.

11/14/2017 3:00-5:00 pm
*Looked up URLs and decided whether or not the website was relevant.

11/15/2017 2:00-5:00 pm
*Created SQL database entitled "acceleratordata" and began creating tables from folder of All Relevant Files.

11/16/2017 3:00-5:00 pm
*Continued to input tables into SQL database.

11/20/2017 2:00-5:00 pm
*Cleaned text files in order to import tables into SQL database.

11/27/2017 2:00-5:00 pm
*Worked with Peter to find and exclude irrelevant keywords on HTML pages. Began categorizing relevant demo day pages.

11/28/2017 3:00-5:00 pm
*Finished inputting tables of relevant files into SQL database.

11/29/2017 2:00-5:00 pm
*Went through HTML URLs. Spoke with Ed about going through HTMLs and classifying based on overall and specific relevance.

12/1/2017 3:00-5:00 pm
*Worked with Peter to search through accelerator links and classified pages based on whether or not they provided relevant information about startup timing.

12/4/2017 10:00-12:00 pm
*Continued running through demo day crawl URLs and scoring them based on relevance.

12/7/2017 1:00-4:30 pm
*Finalized scoring of demo day URLs for the original crawl. Last day of work for this semester.
</onlyinclude>

===Spring 2017===

1/18/2017 1:00-5:00 pm
*Continued collecting data for accelerator project. Helped Catherine draft tweets for the McNair Center Twitter account.

1/20/2017 1:00-3:00 pm
*Continued collecting data on accelerators. Attended McNair Center team meeting.

1/23/2017 1:00-5:00 pm
*Began combing through accelerator list, determining which accelerators are still missing data and documenting these in a TextPad file. Finished through #115.

1/25/2017 1:00-5:00 pm
*Continued looking through accelerator list.

1/27/2017 1:00-3:00 pm
*Continued going through accelerator list. Left off on #226 with Shrey.

1/20/2017 1:00-5:00 pm
*Continued going through accelerator list. Finished through #440.

2/1/2017 1:00-5:00 pm
*Finished going through the list of accelerators looking for incomplete files. Began completing the files that were not done.

2/3/2017 1:00-3:00 pm
*Continued working on completing accelerator files.

2/6/2017 1:00-4:30 pm
*Finished data set of accelerators. Began going through and making sure that all text files and cohort files are of the same format so Peter can easily pull the information. Left for 30 minutes for an interview from 2:30-3:00 pm.

2/8/2017 1:00-5:00 pm
*Finished formatting through #137. Spoke with Ed about project.

2/13/2017 1:00-5:00 pm
*Completed formatting for all accelerator text files.

2/15/2017 3:00-5:00 pm
*Made copy of the completed data set. Spoke to Ed about future steps to take for project including gathering founder data and obtaining the Crunchbase API.

2/17/2017 1:00-3:00 pm
*Went through final Excel spreadsheet for cohort information. Still need to run the crawler one more time after the completion of the editing process. Found the application for the Crunchbase API which will hopefully allow us to gain access.

2/20/2017 1:00-5:00 pm
*Filled out another application for Crunchbase research access. Found the first source for the incubator project on angel.co; will hopefully work with Peter to make a crawler similar to the one for f6s.

2/22/2017 1:00-5:00 pm
*Pulled data from SDC for Ed and normalized it. Learned how to use SDC and the normalizer.

2/24/2017 1:00-3:00 pm
*Finished cleaning up the cohort data for Y Combinator on the Final Cohort Excel Spreadsheet.

2/27/2017 1:00-5:00 pm
*Continued cleaning up the cohort data in the Excel file. Finished Cohort Number and Year.

3/1/2017 2:00-5:00 pm
*Worked with Ben and Shrey to pull data from SDC for all VC funded companies and normalized it to put it in an Excel document.

3/3/2017 1:00-2:30 pm
*Worked with Ben to try and repeat down the VC data without it going too far.

3/6/2017 1:00-4:00 pm
*Worked with Shrey to finish cleaning the cohort data. It is ready to be run through the matcher with Ben.

3/8/2017 1:00-5:00 pm
*Matched the VC Data with the list of Cohort Companies and got one list of all cohort companies that have received VC funding.

3/10/2017 12:00-2:00 pm
*Put a write-up on the top of the Accelerator wiki page detailing where we are in the project currently as well as what data we have accumulated on the RDP.

3/20/2017 1:00-5:00 pm
*Began gathering the URLs of all accelerators in a TextPad file called Accelerator URLs. Participated in the SQL training session.

3/22/2017 1:00-5:00 pm
*Made tables in Terminal for Accelerator companies matched with VC companies and for Cohort Data.

3/27/2017 1:00-4:00 pm
*Compiled all URLs of accelerators into a TextPad file.

3/29/2017 1:00-5:00 pm
*Worked on the matched data with Ben. Next time I will run the RegEx code that will filter the URLs, and I will look through the duplicates where two different VC backed company names matched to one cohort company name.

3/31/2017 1:00-2:00 pm
*Ran the code for accelerator URLs, which are ready to be run through the Wayback Machine in order to get the start dates. Also began looking through VC backed company names.

4/3/2017 1:00-5:00 pm
*Continued looking through double matched VC companies. Learned more SQL from Ed.

4/5/2017 1:00-5:00 pm
*Made the final VC percentage table on Terminal. Next time I will collect missing accelerator data.

4/7/2017 1:00-3:00 pm
*Began collecting cohort data for big accelerators that were missing from our list in order to add it to our final list of cohort companies.

4/10/2017 1:00-5:00 pm
*Finished gathering cohort company names for big accelerators that we were missing and put them into the Cleaned Cohort Companies Excel file. Ben is looking through Crunchbase data in order to possibly find more missing accelerators.

4/14/2017 1:00-4:00 pm
*Began working through "Crunchbase Potential Accelerators" textpad that may contain missing accelerators and wrote notes on the ones that I was able to go through. Need to finish this textpad before moving forward.

4/17/2017 1:00-4:00 pm
*Continued going through potential Crunchbase accelerators that we may have missed. Talked to Ed about getting a more comprehensive list from Excel file and by the end of the semester have the tables and data collected and done.

4/19/2017 1:00-4:00 pm
*Worked with Jeemin to generate an entire list of potential US accelerators from Crunchbase. Worked to find a way to classify accelerators just based on their descriptions.

4/21/2017 1:00-4:00 pm
*Continued working through the list identifying accelerators that we do not have. Ramee and Juliette are now helping us gather cohort data for those missing accelerators.

4/24/2017 9:00-1:00 pm
*Updated Veeral on current state of project. Typed up a to-do list on the discussion wiki for Veeral. Got new cohort data on an accelerator and added it to Excel file.

5/3/2017 11:00-1:00 pm
*Talked to Ed and Anne about future report. Continued working through list of Crunchbase potential accelerators. Last day of work for this semester.

===Fall 2016===

10/17/2016 2:00-5:00 pm
*Created personal wiki page as well as work log; Read about the research project to which I have been assigned; Wrote a short summary of what I believe it is and included some helpful links

10/18/2016 4:00-6:00 pm
*Met with research partner Shrey who filled me in on where we are with the project; Began looking on websites of certain accelerators for how to determine their cohorts and listed these steps on the wiki

10/19/2016 2:00-5:00 pm
*Finished looking on the remaining accelerator websites and wrote the steps on determining how to manually locate the cohorts.

10/20/2016 4:00-6:00 pm
*Met with Peter and Christy to discuss the possibility of creating a web crawler that will pull data from individual accelerator sites.

10/24/2016 2:00-5:00 pm
*Brainstormed with Albert and Julia about changes to the category name for SBDE. Spoke to Ed about full scope of accelerator project.

10/25/2016 4:00-6:00 pm
*Brainstormed with Shrey about different potential industry focuses within accelerators, as well as different variables to search for in terms of accelerators, startups, cohorts, etc.

10/26/2016 2:00-5:00 pm
*Began searching for more databases including lists of accelerators as well as some characteristics of those accelerators; Began searching for characteristics that identify accelerators on their websites

10/27/2016 4:00-6:00 pm
*Continued searching for relevant lists of accelerators to include on our page. Added some links that have high potential under the tab (Obtained from List of Accelerators or various Google searches).

10/31/2016 2:00-5:00 pm
*Began constructing a list of variables that clearly distinguish an accelerator on its website. This is in an effort to allow a crawler to crawl through many Google searches and identify accelerators.

11/1/2016 4:00-6:00 pm
*Continued looking for variables that could identify accelerators from their websites. Searched through numerous different websites of accelerators obtained from our current databases.

11/2/2016 2:00-4:00 pm
*Continued combing through websites of numerous accelerators, well-known and other, in the hopes of finding identifying variables.

11/3/2016 4:00-6:00 pm
*Finalized my list of variables that could be used to distinguish the websites of accelerators. Slightly re-arranged our list of accelerator databases in order of relevance.

11/7/2016 2:00-5:00 pm
*Began compiling the list of all accelerators. Created a new TextPad document with information from a new database.

11/8/2016 4:00-6:00 pm
*Worked with Shrey and Ben in order to compile all of our accelerator databases into one long list on Textpad.

11/9/2016 2:00-5:00 pm
*Continued formulating a database for all accelerators and all of the available info given.

11/10/2016 4:00-6:00 pm
*Worked with Shrey and Peter in order to develop a crawler for f6s.

11/14/2016 2:00-5:00 pm
*Began sorting the Seed-DB database in an Excel document.

11/15/2016 4:00-6:00 pm
*Conducted some Google searches in an attempt to find more accelerator databases. Began looking through Executive Orders searching for keywords.

11/16/2016 2:00-5:00 pm
*Completed searching through Executive Orders.

11/17/2016 4:00-6:00 pm
*Continued working on Google searches for state accelerator list. Looked through f6s for common words that can be used to distinguish accelerators once we have finalized the crawler.

11/21/2016 2:00-5:00 pm
*Randomly chose 10 accelerators from Excel list of accelerators on the RDP. Went through each website and listed the steps that I took in order to determine whether or not the website belonged to an accelerator. Will continue extracting cohort information tomorrow.

11/22/2016 4:00-6:00 pm
*Listed out all steps for extracting cohort information from the ten randomly chosen accelerators. Worked with Peter in order to build a tool that will search all of the HTMLs and attempt to identify each one as an accelerator as well as extract some basic information.

11/28/2016 2:00-5:00 pm
*Merged the F6S accelerator list with our other list, then posted it on the project page. Learned process for accelerator data extraction from Ed.

11/29/2016 4:00-6:00 pm
*Began process of collecting data from the 20 accelerators that I am responsible for.

11/30/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished 15/20.

12/1/2016 4:00-6:00 pm
*Continued collecting data from accelerators. Finished original 20, picked up a new set of 20.

12/2/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished next 20.

12/8/2016 1:00-3:00 pm
*Completed collecting data from accelerators for the semester.

[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
[[Category:Work Log]]
