===Fall 2017===
<onlyinclude>
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
9/11/2017 2:00-5:00 pm
*Spoke to Ed about the project going forward. Organized the current updated data for our project.
9/12/2017 3:00-5:00 pm
*Began going through the Cleaned Cohort Data Excel file and found a few problems with it. Will continue the cleaning process for the rest of the week.
9/13/2017 2:00-5:00 pm
*Sorted through Cleaned Cohort Data and finalized our List of Accelerators. We can begin the process of creating our PercentVC table.
9/14/2017 3:00-5:00 pm
*Completely finalized our dataset of accelerators and startups. Met with Michelle Passo to discuss objectives of the research for credit course.
9/18/2017 2:00-4:00 pm
*Talked with Peter about the LinkedIn crawler data. Went through VC page that Meghana sent me.
9/19/2017 3:00-5:00 pm
*Completed SDC pull of updated VC Data.
9/20/2017 2:00-5:00 pm
*Attempted several times to run the Matcher. Cleaned our pulled data.
9/21/2017 3:00-5:00 pm
*Came extremely close to running the Matcher correctly. Reviewed the final LinkedIn data from Peter.
9/25/2017 2:00-5:00 pm
*Finalized the matched file of accelerator companies with VC portfolio companies. Gave Ben the data on Georgia accelerators.
9/26/2017 3:00-5:00 pm
*Worked on finding the duplicates in our Matched file in order to have the most accurate data.
9/27/2017 2:00-5:00 pm
*Attempted to find a way to organize the duplicate matches.
9/28/2017 4:00-5:00 pm
*Continued running through matched data in order to organize it effectively.
10/2/2017 2:00-5:00 pm
*Talked to Ed about next steps for the project. Practiced accessing the crunchbase database on SQL. Brushed up on SQL code.
10/3/2017 3:00-5:00 pm
*Searched the wikidatabase for crunchbase investment information.
10/4/2017 2:00-5:00 pm
*Pulled the funding rounds table from SQL and matched it with the companies that have received VC funding in order to gather round dates.
10/6/2017 3:00-5:00 pm
*Went through the matched data. Brainstormed ways to get the dates for cohort companies going through accelerators.
10/11/2017 2:00-3:30 pm
*Looked into using the WhoIs Parser in order to find when the companies went through their accelerators.
10/12/2017 3:00-5:00 pm
*Discovered that the Wayback Machine will not be a good option for finding when companies went through their accelerators. Created a list of VCCompanies and their earliest round date. Included a column for the date they went through their accelerators and will fill it in when we find a good method of finding this date.
10/16/2017 2:00-3:30 pm
*Continued working on sorting VCCompanies by their earliest round date.
10/17/2017 3:00-5:00 pm
*Worked with Ben to find a solution to our problem of data acquisition. Finalized earliest round date for VCCompanies.
10/18/2017 2:00-5:00 pm
*Updated our VC data with Ed's help in order to increase the accuracy and completion of our data.
10/19/2017 3:00-5:00 pm
*Organized all of our matched data and updated it in order to reflect the most recent SDC pull with Ed. Matched Crunchbase data with our cohort companies.
10/20/2017 2:00-3:30 pm
*Generated the new list of VCCompanies as well as their earliest round dates.
10/23/2017 2:00-3:30 pm
*Worked on sorting out the discrepancies in our matched data.
10/24/2017 3:00-5:00 pm
*Went through list of VCCompanies and began adding respective accelerators in order to proceed with VCPercentage table.
10/25/2017 2:00-5:00 pm
*Continued going through list of VCCompanies and adding accelerators.
10/26/2017 3:30-5:30 pm
*Continued going through list of VCCompanies and adding accelerators. Will have this completed on Monday.
10/30/2017 2:00-3:30 pm
*Finished adding all of the accelerators to the list of VCCompanies. Added a column indicating whether or not the company went through two or more accelerators.
10/31/2017 3:00-5:00 pm
*Began compiling data in the column for Date Company went through Accelerator.
11/1/2017 2:00-4:00 pm
*Finalized entering dates for Y Combinator cohort companies.
11/2/2017 4:00-5:30 pm
*Continued entering cohort company dates into Excel file.
11/6/2017 2:00-4:00 pm
*Continued entering cohort company dates into Excel file. Began compiling a list of keywords for demo day press releases.
11/7/2017 3:00-5:00 pm
*Finished coming up with keywords for demo day crawler. Sent the final list to Peter.
11/8/2017 2:00-3:30 pm
*Spoke to Ed and organized all of our current data.
11/9/2017 3:00-5:00 pm
*Created a new project page called Accelerator Data and listed all of the relevant files as well as descriptions.
11/14/2017 3:00-5:00 pm
*Looked up URLs and decided whether or not the website was relevant.
11/15/2017 2:00-5:00 pm
*Created SQL database entitled "acceleratordata" and began creating tables from folder of All Relevant Files.
11/16/2017 3:00-5:00 pm
*Continued to input tables into SQL database.
11/20/2017 2:00-5:00 pm
*Cleaned text files in order to import tables into SQL database.
11/27/2017 2:00-5:00 pm
*Worked with Peter to find and exclude irrelevant keywords on HTML pages. Began categorizing relevant demo day pages.
11/28/2017 3:00-5:00 pm
*Finished inputting tables of relevant files into SQL database.
11/29/2017 2:00-5:00 pm
*Went through accelerator HTML URLs. Spoke with Ed about going through HTMLs and classifying based on overall and specific relevance.
12/1/2017 3:00-5:00 pm
*Worked through accelerator links and classified pages based on whether or not they provided relevant information about startup timing.
12/4/2017 10:00-12:00 pm
*Continued running through demo day crawl URLs and scoring them based on relevance.
12/7/2017 1:00-4:30 pm
*Finalized scoring of demo day URLs for the original crawl. Last day of work for this semester.
</onlyinclude>
===Spring 2017===
1/18/2017 1:00-5:00 pm
*Continued collecting data for accelerator project. Helped Catherine draft tweets for the McNair Center twitter account.
1/20/2017 1:00-3:00 pm
*Continued collecting data on accelerators. Attended McNair Center team meeting.
1/23/2017 1:00-5:00 pm
*Began combing through accelerator list, determining which accelerators are still missing data and documenting these in a TextPad file. Finished through #115.
1/25/2017 1:00-5:00 pm
*Continued looking through accelerator list.
1/27/2017 1:00-3:00 pm
*Continued going through accelerator list. Left off on #226 with Shrey.
1/20/2017 1:00-5:00 pm
*Continued going through accelerator list. Finished through #440.
2/1/2017 1:00-5:00 pm
*Finished going through the list of accelerators looking for incomplete files. Began completing the files that were not done.
2/3/2017 1:00-3:00 pm
*Continued working on completing accelerator files.
2/6/2017 1:00-4:30 pm
*Finished data set of accelerators. Began going through and making sure that all text files and cohort files are of the same format so Peter can easily pull the information. Left for 30 minutes for an interview from 2:30-3:00 pm.
2/8/2017 1:00-5:00 pm
*Finished formatting through #137. Spoke with Ed about project.
2/13/2017 1:00-5:00 pm
*Completed formatting for all accelerator text files.
2/15/2017 3:00-5:00 pm
*Made copy of the completed data set. Spoke to Ed about future steps to take for project including gathering founder data and obtaining the crunchbase api.
2/17/2017 1:00-3:00 pm
*Went through final Excel spreadsheet for cohort information. Still need to run the crawler one more time after the completion of the editing process. Found the application for the crunchbase api which will hopefully allow us to gain access.
2/20/2017 1:00-5:00 pm
*Filled out another application for Crunchbase research access; Found the first source for the incubator project on angel.co, will hopefully work with Peter to make a crawler similar to f6s
2/22/2017 1:00-5:00 pm
*Pulled data from SDC for Ed and normalized it. Learned how to use SDC and the normalizer.
2/24/2017 1:00-3:00 pm
*Finished cleaning up the cohort data for Y-combinator on the Final Cohort Excel Spreadsheet.
2/27/2017 1:00-5:00 pm
*Continued cleaning up the cohort data in the Excel file. Finished Cohort Number and Year.
3/1/2017 2:00-5:00 pm
*Worked with Ben and Shrey to pull data from SDC for all VC funded companies and normalized it to put it in an Excel document.
3/3/2017 1:00-2:30 pm
*Worked with Ben to try and repeat down the VC data without it going too far.
3/6/2017 1:00-4:00 pm
*Worked with Shrey to finish cleaning the cohort data. It is ready to be run through the matcher with Ben.
3/8/2017 1:00-5:00 pm
*Matched the VC Data with the list of Cohort Companies and got one list of all cohort companies that have received VC funding.
3/10/2017 12:00-2:00 pm
*Put a write-up on the top of the Accelerator wiki page detailing where we are in the project currently as well as what data we have accumulated on the RDP.
3/20/2017 1:00-5:00 pm
*Began gathering the URLs of all accelerators in a TextPad file called Accelerator URLs. Participated in the SQL training session.
3/22/2017 1:00-5:00 pm
*Made tables in Terminal for Accelerator companies matched with VC companies and for Cohort Data.
3/27/2017 1:00-4:00 pm
*Compiled all URLs of accelerators into a TextPad file.
3/29/2017 1:00-5:00 pm
*Worked on the matched data with Ben. Next time I will run the RegEx code that will filter the URLs, and I will look through the duplicates where two different VC backed company names matched to one cohort company name.
3/31/2017 1:00-2:00 pm
*Ran the code for accelerator urls which are ready to be run through the wayback machine in order to get the start dates. Also began looking through vc backed company names.
4/3/2017 1:00-5:00 pm
*Continued looking through double matched VC companies. Learned more SQL from Ed.
4/5/2017 1:00-5:00 pm
*Made the final vc percentage table on terminal and for next time I will collect missing accelerator data.
4/7/2017 1:00-3:00 pm
*Began collecting cohort data for big accelerators that were missing from our list in order to add it to our final list of cohort companies.
4/10/2017 1:00-5:00 pm
*Finished gathering cohort company names for big accelerators that we were missing and put them into the Cleaned Cohort Companies Excel file. Ben is looking through Crunchbase data in order to possibly find more missing accelerators.
4/14/2017 1:00-4:00 pm
*Began working through "Crunchbase Potential Accelerators" textpad that may contain missing accelerators and wrote notes on the ones that I was able to go through. Need to finish this textpad before moving forward.
4/17/2017 1:00-4:00 pm
*Continued going through potential Crunchbase accelerators that we may have missed. Talked to Ed about getting a more comprehensive list from Excel file and by the end of the semester have the tables and data collected and done.
4/19/2017 1:00-4:00 pm
*Worked with Jeemin to generate an entire list of potential US accelerators from crunchbase. Worked to find a way to classify accelerators just based on their descriptions.
4/21/2017 1:00-4:00 pm
*Continued working through the list identifying accelerators that we do not have. Ramee and Juliette are now helping us gather cohort data for those missing accelerators.
4/24/2017 9:00-1:00 pm
*Updated Veeral on current state of project. Typed up a to-do list on the discussion wiki for Veeral. Got new cohort data on an accelerator and added it to Excel file.
5/3/2017 11:00-1:00 pm
*Talked to Ed and Anne about future report. Continued working through list of crunchbase potential accelerators. Last day of work for this semester.
===Fall 2016===
10/17/2016 2:00-5:00 pm
*Created personal wiki page as well as work log; Read about the research project to which I have been assigned; Wrote a short summary of what I believe it is and included some helpful links
10/18/2016 4:00-6:00 pm
*Met with research partner Shrey who filled me in on where we are with the project; Began looking on websites of certain accelerators for how to determine their cohorts and listed these steps on the wiki
10/19/2016 2:00-5:00 pm
*Finished looking on the remaining accelerator websites and wrote the steps on determining how to manually locate the cohorts.
10/20/2016 4:00-6:00 pm
*Met with Peter and Christy to discuss the possibility of creating a web crawler that will pull data from individual accelerator sites.
10/24/2016 2:00-5:00 pm
*Brainstormed with Albert and Julia about changes to the category name for SBDE. Spoke to Ed about full scope of accelerator project.
10/25/2016 4:00-6:00 pm
*Brainstormed with Shrey about different potential industry focuses within accelerators, as well as different variables to search for in terms of accelerators, startups, cohorts, etc.
10/26/2016 2:00-5:00 pm
*Began searching for more databases including lists of accelerators as well as some characteristics of those accelerators; Began searching for characteristics that identify accelerators on their websites
10/27/2016 4:00-6:00 pm
*Continued searching for relevant lists of accelerators to include on our page. Added some links that have high potential under the tab (Obtained from List of Accelerators or various Google searches).
10/31/2016 2:00-5:00 pm
*Began constructing a list of variables that clearly distinguish an accelerator on its website. This is in an effort to allow a crawler to crawl through many Google searches and identify accelerators.
11/1/2016 4:00-6:00 pm
*Continued looking for variables that could identify accelerators from their websites. Searched through numerous different websites of accelerators obtained from our current databases.
11/2/2016 2:00-4:00 pm
*Continued combing through websites of numerous accelerators, well-known and other, in the hopes of finding identifying variables.
11/3/2016 4:00-6:00 pm
*Finalized my list of variables that could be used to distinguish the websites of accelerators. Slightly re-arranged our list of accelerator databases in order of relevance.
11/7/2016 2:00-5:00 pm
*Began compiling the list of all accelerators. Created a new TextPad document with information from a new database.
11/8/2016 4:00-6:00 pm
*Worked with Shrey and Ben in order to compile all of our accelerator databases into one long list on Textpad.
11/9/2016 2:00-5:00 pm
*Continued formulating a database for all accelerators and all of the available info given.
11/10/2016 4:00-6:00 pm
*Worked with Shrey and Peter in order to develop a crawler for f6s.
11/14/2016 2:00-5:00 pm
*Began sorting the Seed-DB database in an Excel document.
11/15/2016 4:00-6:00 pm
*Conducted some Google searches in an attempt to find more accelerator databases. Began looking through Executive Orders searching for keywords.
11/16/2016 2:00-5:00 pm
*Completed searching through Executive Orders.
11/17/2016 4:00-6:00 pm
*Continued working on Google searches for state accelerator list. Looked through f6s for common words that can be used to distinguish accelerators once we have finalized the crawler.
11/21/2016 2:00-5:00 pm
*Randomly chose 10 accelerators from Excel list of accelerators on the RDP. Went through each website and listed the steps that I took in order to determine whether or not the website belonged to an accelerator. Will continue extracting cohort information tomorrow.
11/22/2016 4:00-6:00 pm
*Listed out all steps for extracting cohort information from the ten randomly chosen accelerators. Worked with Peter in order to build a tool that will search all of the HTMLs and attempt to identify each one as an accelerator as well as extract some basic information.
11/28/2016 2:00-5:00 pm
*Merged the F6S accelerator list with our other list, then posted it on the project page. Learned process for accelerator data extraction from Ed.
11/29/2016 4:00-6:00 pm
*Began process of collecting data from the 20 accelerators that I am responsible for.
11/30/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished 15/20.
12/1/2016 4:00-6:00 pm
*Continued collecting data from accelerators. Finished original 20, picked up a new set of 20.
12/2/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished next 20.
12/8/2016 1:00-3:00 pm
*Completed collecting data from accelerators for the semester.
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
[[Category:Work Log]]
