===Fall 2017===
<onlyinclude>
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
9/11/2017 2:00-5:00 pm
*Spoke to Ed about the project going forward. Organized the current updated data for our project.
9/12/2017 3:00-5:00 pm
*Began going through the Cleaned Cohort Data Excel file and found a few problems with it. Will continue the cleaning process for the rest of the week.
9/13/2017 2:00-5:00 pm
*Sorted through Cleaned Cohort Data and finalized our List of Accelerators. We can begin the process of creating our PercentVC table.
9/14/2017 3:00-5:00 pm
*Completely finalized our dataset of accelerators and startups. Met with Michelle Passo to discuss objectives of the research for credit course.
9/18/2017 2:00-4:00 pm
*Talked with Peter about the LinkedIn crawler data. Went through VC page that Meghana sent me.
9/19/2017 3:00-5:00 pm
*Completed SDC pull of updated VC Data.
9/20/2017 2:00-5:00 pm
*Attempted several times to run the Matcher. Cleaned our pulled data.
9/21/2017 3:00-5:00 pm
*Came extremely close to running the Matcher correctly. Reviewed the final LinkedIn data from Peter.
9/25/2017 2:00-5:00 pm
*Finalized the matched file of accelerator companies with VC portfolio companies. Gave Ben the data on Georgia accelerators.
9/26/2017 3:00-5:00 pm
*Worked on finding the duplicates in our Matched file in order to have the most accurate data.
9/27/2017 2:00-5:00 pm
*Attempted to find a way to organize the duplicate matches.
9/28/2017 4:00-5:00 pm
*Continued running through matched data in order to organize it effectively.
10/2/2017 2:00-5:00 pm
*Talked to Ed about next steps for the project. Practiced accessing the crunchbase database on SQL. Brushed up on SQL code.
10/3/2017 3:00-5:00 pm
*Searched the database for crunchbase investment information.
10/4/2017 2:00-5:00 pm
*Pulled the funding rounds table from SQL and matched it with the companies that have received VC funding in order to gather round dates.
10/6/2017 3:00-5:00 pm
*Went through the matched data. Brainstormed ways to get the dates for cohort companies going through accelerators.
10/11/2017 2:00-3:30 pm
*Looked into using the WhoIs Parser in order to find when the companies went through their accelerators.
10/12/2017 3:00-5:00 pm
*Discovered that the Wayback Machine will not be a good option for finding when companies went through their accelerators. Created a list of VCCompanies and their earliest round date. Included a column for the date they went through their accelerators and will fill it in when we find a good method of finding this date.
10/16/2017 2:00-3:30 pm
*Continued working on sorting VCCompanies by their earliest round date.
10/17/2017 3:00-5:00 pm
*Worked with Ben to find a solution to our problem of data acquisition. Finalized earliest round date for VCCompanies.
10/18/2017 2:00-5:00 pm
*Updated our VC data with Ed's help in order to increase the accuracy and completion of our data.
10/19/2017 3:00-5:00 pm
*Organized all of our matched data and updated it in order to reflect the most recent SDC pull with Ed. Matched Crunchbase data with our cohort companies.
10/20/2017 2:00-3:30 pm
*Generated the new list of VCCompanies as well as their earliest round dates.
10/23/2017 2:00-3:30 pm
*Worked on sorting out the discrepancies in our matched data.
10/24/2017 3:00-5:00 pm
*Went through list of VCCompanies and began adding respective accelerators in order to proceed with VCPercentage table.
10/25/2017 2:00-5:00 pm
*Continued going through list of VCCompanies and adding accelerators.
10/26/2017 3:30-5:30 pm
*Continued going through list of VCCompanies and adding accelerators. Will have this completed on Monday.
10/30/2017 2:00-3:30 pm
*Finished adding all of the accelerators to the list of VCCompanies. Added a column indicating whether or not the company went through two or more accelerators.
10/31/2017 3:00-5:00 pm
*Began compiling data in the column for Date Company went through Accelerator.
11/1/2017 2:00-4:00 pm
*Finalized entering dates for Y Combinator cohort companies.
11/2/2017 4:00-5:30 pm
*Continued entering cohort company dates into Excel file.
11/6/2017 2:00-4:00 pm
*Continued entering cohort company dates into Excel file. Began compiling a list of keywords for demo day press releases.
11/7/2017 3:00-5:00 pm
*Finished coming up with keywords for demo day crawler. Sent the final list to Peter.
11/8/2017 2:00-3:30 pm
*Spoke to Ed and organized all of our current data.
11/9/2017 3:00-5:00 pm
*Created a new project page called Accelerator Data and listed all relevant files as well as descriptions.
11/14/2017 3:00-5:00 pm
*Looked up URLs and decided whether or not the website was relevant.
11/15/2017 2:00-5:00 pm
*Created SQL database entitled "acceleratordata" and began creating tables from folder of All Relevant Files.
11/16/2017 3:00-5:00 pm
*Continued to input tables into SQL database.
11/20/2017 2:00-5:00 pm
*Cleaned text files in order to import tables into SQL database.
11/27/2017 2:00-5:00 pm
*Worked with Peter to find and exclude irrelevant keywords on HTML pages. Began categorizing relevant demo day pages.
11/28/2017 3:00-5:00 pm
*Finished inputting tables of relevant files into SQL database.
11/29/2017 2:00-5:00 pm
*Went through accelerator HTML URLs. Spoke with Ed about going through HTMLs and classifying based on overall and specific relevance.
12/1/2017 3:00-5:00 pm
*Worked through accelerator links and classified pages based on whether or not they provided relevant information about startup timing.
12/4/2017 10:00-12:00 pm
*Continued running through demo day crawl URLs and scoring them based on relevance.
12/7/2017 1:00-4:30 pm
*Finalized scoring of demo day URLs for the original crawl. Last day of work for this semester.
</onlyinclude>
===Spring 2017===
1/18/2017 1:00-5:00 pm
*Continued collecting data for accelerator project. Helped Catherine draft tweets for the McNair Center twitter account.
1/20/2017 1:00-3:00 pm
*Continued collecting data on accelerators. Attended McNair Center team meeting.
1/23/2017 1:00-5:00 pm
*Began combing through accelerator list, determining which accelerators are still missing data and documenting these in a TextPad file. Finished through #115.
1/25/2017 1:00-5:00 pm
*Continued looking through accelerator list.
1/27/2017 1:00-3:00 pm
*Continued going through accelerator list. Left off on #226 with Shrey.
1/20/2017 1:00-5:00 pm
*Continued going through accelerator list. Finished through #440.
2/1/2017 1:00-5:00 pm
*Finished going through the list of accelerators looking for incomplete files. Began completing the files that were not done.
2/3/2017 1:00-3:00 pm
*Continued working on completing accelerator files.
2/6/2017 1:00-4:30 pm
*Finished data set of accelerators. Began going through and making sure that all text files and cohort files are of the same format so Peter can easily pull the information. Left for 30 minutes for an interview from 2:30-3:00 pm.
2/8/2017 1:00-5:00 pm
*Finished formatting through #137. Spoke with Ed about project.
2/13/2017 1:00-5:00 pm
*Completed formatting for all accelerator text files.
2/15/2017 3:00-5:00 pm
*Made copy of the completed data set. Spoke to Ed about future steps to take for project including gathering founder data and obtaining the crunchbase api.
2/17/2017 1:00-3:00 pm
*Went through final Excel spreadsheet for cohort information. Still need to run the crawler one more time after the completion of the editing process. Found the application for the crunchbase api which will hopefully allow us to gain access.
2/20/2017 1:00-5:00 pm
*Filled out another application for Crunchbase research access; Found the first source for the incubator project on angel.co, will hopefully work with Peter to make a crawler similar to f6s
2/22/2017 1:00-5:00 pm
*Pulled data from SDC for Ed and normalized it. Learned how to use SDC and the normalizer.
2/24/2017 1:00-3:00 pm
*Finished cleaning up the cohort data for Y-combinator on the Final Cohort Excel Spreadsheet.
2/27/2017 1:00-5:00 pm
*Continued cleaning up the cohort data in the Excel file. Finished Cohort Number and Year.
3/1/2017 2:00-5:00 pm
*Worked with Ben and Shrey to pull data from SDC for all VC funded companies and normalized it to put it in an Excel document.
3/3/2017 1:00-2:30 pm
*Worked with Ben to try and repeat down the VC data without it going too far.
3/6/2017 1:00-4:00 pm
*Worked with Shrey to finish cleaning the cohort data. It is ready to be run through the matcher with Ben.
3/8/2017 1:00-5:00 pm
*Matched the VC Data with the list of Cohort Companies and got one list of all cohort companies that have received VC funding.
3/10/2017 12:00-2:00 pm
*Put a write-up on the top of the Accelerator wiki page detailing where we are in the project currently as well as what data we have accumulated on the RDP.
3/20/2017 1:00-5:00 pm
*Began gathering the URLs of all accelerators in a TextPad file called Accelerator URLs. Participated in the SQL training session.
3/22/2017 1:00-5:00 pm
*Made tables in Terminal for Accelerator companies matched with VC companies and for Cohort Data.
3/27/2017 1:00-4:00 pm
*Compiled all URLs of accelerators into a TextPad file.
3/29/2017 1:00-5:00 pm
*Worked on the matched data with Ben. Next time I will run the RegEx code that will filter the URLs, and I will look through the duplicates where two different VC backed company names matched to one cohort company name.
3/31/2017 1:00-2:00 pm
*Ran the code for accelerator urls which are ready to be run through the wayback machine in order to get the start dates. Also began looking through vc backed company names.
4/3/2017 1:00-5:00 pm
*Continued looking through double matched VC companies. Learned more SQL from Ed.
4/5/2017 1:00-5:00 pm
*Made the final vc percentage table on terminal and for next time I will collect missing accelerator data.
4/7/2017 1:00-3:00 pm
*Began collecting cohort data for big accelerators that were missing from our list in order to add it to our final list of cohort companies.
4/10/2017 1:00-5:00 pm
*Finished gathering cohort company names for big accelerators that we were missing and put them into the Cleaned Cohort Companies Excel file. Ben is looking through Crunchbase data in order to possibly find more missing accelerators.
4/14/2017 1:00-4:00 pm
*Began working through "Crunchbase Potential Accelerators" textpad that may contain missing accelerators and wrote notes on the ones that I was able to go through. Need to finish this textpad before moving forward.
4/17/2017 1:00-4:00 pm
*Continued going through potential Crunchbase accelerators that we may have missed. Talked to Ed about getting a more comprehensive list from Excel file and by the end of the semester have the tables and data collected and done.
4/19/2017 1:00-4:00 pm
*Worked with Jeemin to generate an entire list of potential US accelerators from crunchbase. Worked to find a way to classify accelerators just based on their descriptions.
4/21/2017 1:00-4:00 pm
*Continued working through the list identifying accelerators that we do not have. Ramee and Juliette are now helping us gather cohort data for those missing accelerators.
4/24/2017 9:00-1:00 pm
*Updated Veeral on current state of project. Typed up a to-do list on the discussion wiki for Veeral. Got new cohort data on an accelerator and added it to Excel file.
5/3/2017 11:00-1:00 pm
*Talked to Ed and Anne about future report. Continued working through list of crunchbase potential accelerators. Last day of work for this semester.
===Fall 2016===
10/17/2016 2:00-5:00 pm
*Created personal wiki page as well as work log; Read about the research project to which I have been assigned; Wrote a short summary of what I believe it is and included some helpful links
10/18/2016 4:00-6:00 pm
*Met with research partner Shrey who filled me in on where we are with the project; Began looking on websites of certain accelerators for how to determine their cohorts and listed these steps on the wiki
10/19/2016 2:00-5:00 pm
*Finished looking on the remaining accelerator websites and wrote the steps on determining how to manually locate the cohorts.
10/20/2016 4:00-6:00 pm
*Met with Peter and Christy to discuss the possibility of creating a web crawler that will pull data from individual accelerator sites.
10/24/2016 2:00-5:00 pm
*Brainstormed with Albert and Julia about changes to the category name for SBDE. Spoke to Ed about full scope of accelerator project.
10/25/2016 4:00-6:00 pm
*Brainstormed with Shrey about different potential industry focuses within accelerators, as well as different variables to search for in terms of accelerators, startups, cohorts, etc.
10/26/2016 2:00-5:00 pm
*Began searching for more databases including lists of accelerators as well as some characteristics of those accelerators; Began searching for characteristics that identify accelerators on their websites
10/27/2016 4:00-6:00 pm
*Continued searching for relevant lists of accelerators to include on our page. Added some links that have high potential under the tab (Obtained from List of Accelerators or various Google searches).
10/31/2016 2:00-5:00 pm
*Began constructing a list of variables that clearly distinguish an accelerator on its website. This is in an effort to allow a crawler to crawl through many Google searches and identify accelerators.
11/1/2016 4:00-6:00 pm
*Continued looking for variables that could identify accelerators from their websites. Searched through numerous different websites of accelerators obtained from our current databases.
11/2/2016 2:00-4:00 pm
*Continued combing through websites of numerous accelerators, well-known and other, in the hopes of finding identifying variables.
11/3/2016 4:00-6:00 pm
*Finalized my list of variables that could be used to distinguish the websites of accelerators. Slightly re-arranged our list of accelerator databases in order of relevance.
11/7/2016 2:00-5:00 pm
*Began compiling the list of all accelerators. Created a new TextPad document with information from a new database.
11/8/2016 4:00-6:00 pm
*Worked with Shrey and Ben in order to compile all of our accelerator databases into one long list on Textpad.
11/9/2016 2:00-5:00 pm
*Continued formulating a database for all accelerators and all of the available info given.
11/10/2016 4:00-6:00 pm
*Worked with Shrey and Peter in order to develop a crawler for f6s.
11/14/2016 2:00-5:00 pm
*Began sorting the Seed-DB database in an Excel document.
11/15/2016 4:00-6:00 pm
*Conducted some Google searches in an attempt to find more accelerator databases. Began looking through Executive Orders searching for keywords.
11/16/2016 2:00-5:00 pm
*Completed searching through Executive Orders.
11/17/2016 4:00-6:00 pm
*Continued working on Google searches for state accelerator list. Looked through f6s for common words that can be used to distinguish accelerators once we have finalized the crawler.
11/21/2016 2:00-5:00 pm
*Randomly chose 10 accelerators from Excel list of accelerators on the RDP. Went through each website and listed the steps that I took in order to determine whether or not the website belonged to an accelerator. Will continue extracting cohort information tomorrow.
11/22/2016 4:00-6:00 pm
*Listed out all steps for extracting cohort information from the ten randomly chosen accelerators. Worked with Peter in order to build a tool that will search all of the HTMLs and attempt to identify each one as an accelerator as well as extract some basic information.
11/28/2016 2:00-5:00 pm
*Merged the F6S accelerator list with our other list, then posted it on the project page. Learned process for accelerator data extraction from Ed.
11/29/2016 4:00-6:00 pm
*Began process of collecting data from the 20 accelerators that I am responsible for.
11/30/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished 15/20.
12/1/2016 4:00-6:00 pm
*Continued collecting data from accelerators. Finished original 20, picked up a new set of 20.
12/2/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished next 20.
12/8/2016 1:00-3:00 pm
*Completed collecting data from accelerators for the semester.
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
[[Category:Work Log]]
