===Fall 2017===
<onlyinclude>
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
9/11/2017 2:00-5:00 pm
*Spoke to Ed about the project going forward. Organized the current updated data for our project.
9/12/2017 3:00-5:00 pm
*Began going through the Cleaned Cohort Data Excel file and found a few problems with it. Will continue the cleaning process for the rest of the week.
9/13/2017 2:00-5:00 pm
*Sorted through Cleaned Cohort Data and finalized our List of Accelerators. We can begin the process of creating our PercentVC table.
9/14/2017 3:00-5:00 pm
*Completely finalized our dataset of accelerators and startups. Met with Michelle Passo to discuss objectives of the research for credit course.
9/18/2017 2:00-4:00 pm
*Talked with Peter about the LinkedIn crawler data. Went through VC page that Meghana sent me.
9/19/2017 3:00-5:00 pm
*Completed SDC pull of updated VC Data.
9/20/2017 2:00-5:00 pm
*Attempted several times to run the Matcher. Cleaned our pulled data.
9/21/2017 3:00-5:00 pm
*Came extremely close to running the Matcher correctly. Reviewed the final LinkedIn data from Peter.
9/25/2017 2:00-5:00 pm
*Finalized the matched file of accelerator companies with VC portfolio companies. Gave Ben the data on Georgia accelerators.
9/26/2017 3:00-5:00 pm
*Worked on finding the duplicates in our Matched file in order to have the most accurate data.
9/27/2017 2:00-5:00 pm
*Attempted to find a way to organize the duplicate matches.
9/28/2017 4:00-5:00 pm
*Continued running through matched data in order to organize it effectively.
10/2/2017 2:00-5:00 pm
*Talked to Ed about next steps for the project. Practiced accessing the crunchbase database on SQL. Brushed up on SQL code.
10/3/2017 3:00-5:00 pm
*Searched the database for crunchbase investment information.
10/4/2017 2:00-5:00 pm
*Pulled the funding rounds table from SQL and matched it with the companies that have received VC funding in order to gather round dates.
10/6/2017 3:00-5:00 pm
*Went through the matched data. Brainstormed ways to get the dates for cohort companies going through accelerators.
10/11/2017 2:00-3:30 pm
*Looked into using the WhoIs Parser in order to find when the companies went through their accelerators.
10/12/2017 3:00-5:00 pm
*Discovered that the Wayback Machine will not be a good option for finding when companies went through their accelerators. Created a list of VCCompanies and their earliest round date. Included a column for the date they went through their accelerators and will fill it in when we find a good method of finding this date.
10/16/2017 2:00-3:30 pm
*Continued working on sorting VCCompanies by their earliest round date.
10/17/2017 3:00-5:00 pm
*Worked with Ben to find a solution to our problem of data acquisition. Finalized earliest round date for VCCompanies.
10/18/2017 2:00-5:00 pm
*Updated our VC data with Ed's help in order to increase the accuracy and completion of our data.
10/19/2017 3:00-5:00 pm
*Organized all of our matched data and updated it in order to reflect the most recent SDC pull with Ed. Matched Crunchbase data with our cohort companies.
10/20/2017 2:00-3:30 pm
*Generated the new list of VCCompanies as well as their earliest round dates.
10/23/2017 2:00-3:30 pm
*Worked on sorting out the discrepancies in our matched data.
10/24/2017 3:00-5:00 pm
*Went through list of VCCompanies and began adding respective accelerators in order to proceed with VCPercentage table.
10/25/2017 2:00-5:00 pm
*Continued going through list of VCCompanies and adding accelerators.
10/26/2017 3:30-5:30 pm
*Continued going through list of VCCompanies and adding accelerators. Will have this completed on Monday.
10/30/2017 2:00-3:30 pm
*Finished adding all of the accelerators to the list of VCCompanies. Added a column indicating whether or not the company went through two or more accelerators.
10/31/2017 3:00-5:00 pm
*Began compiling data in the column for Date Company went through Accelerator.
11/1/2017 2:00-4:00 pm
*Finalized entering dates for Y Combinator cohort companies.
11/2/2017 4:00-5:30 pm
*Continued entering cohort company dates into Excel file.
11/6/2017 2:00-4:00 pm
*Continued entering cohort company dates into Excel file. Began compiling a list of keywords for demo day press releases.
11/7/2017 3:00-5:00 pm
*Finished coming up with keywords for demo day crawler. Sent the final list to Peter.
11/8/2017 2:00-3:30 pm
*Spoke to Ed and organized all of our current data on accelerators. Attended McNair Center team meeting.
11/9/2017 3:00-5:00 pm
*Created a new project page called Accelerator Data and listed all relevant files as well as descriptions.
11/14/2017 3:00-5:00 pm
*Looked up URLs and decided whether or not the website was relevant.
11/15/2017 2:00-5:00 pm
*Created SQL database entitled "acceleratordata" and began creating tables from folder of All Relevant Files.
11/16/2017 3:00-5:00 pm
*Continued to input tables into SQL database.
11/20/2017 2:00-5:00 pm
*Cleaned text files in order to import tables into SQL database.
11/27/2017 2:00-5:00 pm
*Worked with Peter to find and exclude irrelevant keywords on HTML pages. Began categorizing relevant demo day pages.
11/28/2017 3:00-5:00 pm
*Finished inputting tables of all relevant files into SQL database.
11/29/2017 2:00-5:00 pm
*Went through accelerator HTML URLs. Spoke with Ed about going through HTMLs and classifying based on overall and specific relevance.
12/1/2017 3:00-5:00 pm
*Worked through accelerator links and classified pages based on whether or not they provided relevant information about startup timing.
12/4/2017 10:00-12:00 pm
*Continued running through demo day crawl URLs and scoring them based on relevance.
12/7/2017 1:00-4:30 pm
*Finalized scoring of demo day URLs for the original crawl. Last day of work for this semester.
</onlyinclude>
===Spring 2017===
1/18/2017 1:00-5:00 pm
*Continued collecting data for accelerator project. Helped Catherine draft tweets for the McNair Center twitter account.
1/20/2017 1:00-3:00 pm
*Continued collecting data on accelerators. Attended McNair Center team meeting.
1/23/2017 1:00-5:00 pm
*Began combing through accelerator list, determining which accelerators are still missing data and documenting these in a TextPad file. Finished through #115.
1/25/2017 1:00-5:00 pm
*Continued looking through accelerator list.
1/27/2017 1:00-3:00 pm
*Continued going through accelerator list. Left off on #226 with Shrey.
1/20/2017 1:00-5:00 pm
*Continued going through accelerator list. Finished through #440.
2/1/2017 1:00-5:00 pm
*Finished going through the list of accelerators looking for incomplete files. Began completing the files that were not done.
2/3/2017 1:00-3:00 pm
*Continued working on completing accelerator files.
2/6/2017 1:00-4:30 pm
*Finished data set of accelerators. Began going through and making sure that all text files and cohort files are of the same format so Peter can easily pull the information. Left for 30 minutes for an interview from 2:30-3:00 pm.
2/8/2017 1:00-5:00 pm
*Finished formatting through #137. Spoke with Ed about project.
2/13/2017 1:00-5:00 pm
*Completed formatting for all accelerator text files.
2/15/2017 3:00-5:00 pm
*Made copy of the completed data set. Spoke to Ed about future steps to take for project including gathering founder data and obtaining the crunchbase api.
2/17/2017 1:00-3:00 pm
*Went through final Excel spreadsheet for cohort information. Still need to run the crawler one more time after the completion of the editing process. Found the application for the crunchbase api which will hopefully allow us to gain access.
2/20/2017 1:00-5:00 pm
*Filled out another application for Crunchbase research access; Found the first source for the incubator project on angel.co, will hopefully work with Peter to make a crawler similar to f6s
2/22/2017 1:00-5:00 pm
*Pulled data from SDC for Ed and normalized it. Learned how to use SDC and the normalizer.
2/24/2017 1:00-3:00 pm
*Finished cleaning up the cohort data for Y-combinator on the Final Cohort Excel Spreadsheet.
2/27/2017 1:00-5:00 pm
*Continued cleaning up the cohort data in the Excel file. Finished Cohort Number and Year.
3/1/2017 2:00-5:00 pm
*Worked with Ben and Shrey to pull data from SDC for all VC funded companies and normalized it to put it in an Excel document.
3/3/2017 1:00-2:30 pm
*Worked with Ben to try and repeat down the VC data without it going too far.
3/6/2017 1:00-4:00 pm
*Worked with Shrey to finish cleaning the cohort data. It is ready to be run through the matcher with Ben.
3/8/2017 1:00-5:00 pm
*Matched the VC Data with the list of Cohort Companies and got one list of all cohort companies that have received VC funding.
3/10/2017 12:00-2:00 pm
*Put a write-up on the top of the Accelerator wiki page detailing where we are in the project currently as well as what data we have accumulated on the RDP.
3/20/2017 1:00-5:00 pm
*Began gathering the URLs of all accelerators in a TextPad file called Accelerator URLs. Participated in the SQL training session.
3/22/2017 1:00-5:00 pm
*Made tables in Terminal for Accelerator companies matched with VC companies and for Cohort Data.
3/27/2017 1:00-4:00 pm
*Compiled all URLs of accelerators into a TextPad file.
3/29/2017 1:00-5:00 pm
*Worked on the matched data with Ben. Next time I will run the RegEx code that will filter the URLs, and I will look through the duplicates where two different VC backed company names matched to one cohort company name.
3/31/2017 1:00-2:00 pm
*Ran the code for accelerator urls which are ready to be run through the wayback machine in order to get the start dates. Also began looking through vc backed company names.
4/3/2017 1:00-5:00 pm
*Continued looking through double matched VC companies. Learned more SQL from Ed.
4/5/2017 1:00-5:00 pm
*Made the final vc percentage table on terminal and for next time I will collect missing accelerator data.
4/7/2017 1:00-3:00 pm
*Began collecting cohort data for big accelerators that were missing from our list in order to add it to our final list of cohort companies.
4/10/2017 1:00-5:00 pm
*Finished gathering cohort company names for big accelerators that we were missing and put them into the Cleaned Cohort Companies Excel file. Ben is looking through Crunchbase data in order to possibly find more missing accelerators.
4/14/2017 1:00-4:00 pm
*Began working through "Crunchbase Potential Accelerators" textpad that may contain missing accelerators and wrote notes on the ones that I was able to go through. Need to finish this textpad before moving forward.
4/17/2017 1:00-4:00 pm
*Continued going through potential Crunchbase accelerators that we may have missed. Talked to Ed about getting a more comprehensive list from Excel file and by the end of the semester have the tables and data collected and done.
4/19/2017 1:00-4:00 pm
*Worked with Jeemin to generate an entire list of potential US accelerators from crunchbase. Worked to find a way to classify accelerators just based on their descriptions.
4/21/2017 1:00-4:00 pm
*Continued working through the list identifying accelerators that we do not have. Ramee and Juliette are now helping us gather cohort data for those missing accelerators.
4/24/2017 9:00-1:00 pm
*Updated Veeral on current state of project. Typed up a to-do list on the discussion wiki for Veeral. Got new cohort data on an accelerator and added it to Excel file.
5/3/2017 11:00-1:00 pm
*Talked to Ed and Anne about future report. Continued working through list of crunchbase potential accelerators. Last day of work for this semester.
===Fall 2016===
10/17/2016 2:00-5:00 pm
*Created personal wiki page as well as work log; Read about the research project to which I have been assigned; Wrote a short summary of what I believe it is and included some helpful links
10/18/2016 4:00-6:00 pm
*Met with research partner Shrey who filled me in on where we are with the project; Began looking on websites of certain accelerators for how to determine their cohorts and listed these steps on the wiki
10/19/2016 2:00-5:00 pm
*Finished looking on the remaining accelerator websites and wrote the steps on determining how to manually locate the cohorts.
10/20/2016 4:00-6:00 pm
*Met with Peter and Christy to discuss the possibility of creating a web crawler that will pull data from individual accelerator sites.
10/24/2016 2:00-5:00 pm
*Brainstormed with Albert and Julia about changes to the category name for SBDE. Spoke to Ed about full scope of accelerator project.
10/25/2016 4:00-6:00 pm
*Brainstormed with Shrey about different potential industry focuses within accelerators, as well as different variables to search for in terms of accelerators, startups, cohorts, etc.
10/26/2016 2:00-5:00 pm
*Began searching for more databases including lists of accelerators as well as some characteristics of those accelerators; Began searching for characteristics that identify accelerators on their websites
10/27/2016 4:00-6:00 pm
*Continued searching for relevant lists of accelerators to include on our page. Added some links that have high potential under the tab (Obtained from List of Accelerators or various Google searches).
10/31/2016 2:00-5:00 pm
*Began constructing a list of variables that clearly distinguish an accelerator on its website. This is in an effort to allow a crawler to crawl through many Google searches and identify accelerators.
11/1/2016 4:00-6:00 pm
*Continued looking for variables that could identify accelerators from their websites. Searched through numerous different websites of accelerators obtained from our current databases.
11/2/2016 2:00-4:00 pm
*Continued combing through websites of numerous accelerators, well-known and other, in the hopes of finding identifying variables.
11/3/2016 4:00-6:00 pm
*Finalized my list of variables that could be used to distinguish the websites of accelerators. Slightly re-arranged our list of accelerator databases in order of relevance.
11/7/2016 2:00-5:00 pm
*Began compiling the list of all accelerators. Created a new TextPad document with information from a new database.
11/8/2016 4:00-6:00 pm
*Worked with Shrey and Ben in order to compile all of our accelerator databases into one long list on Textpad.
11/9/2016 2:00-5:00 pm
*Continued formulating a database for all accelerators and all of the available info given.
11/10/2016 4:00-6:00 pm
*Worked with Shrey and Peter in order to develop a crawler for f6s.
11/14/2016 2:00-5:00 pm
*Began sorting the Seed-DB database in an Excel document.
11/15/2016 4:00-6:00 pm
*Conducted some Google searches in an attempt to find more accelerator databases. Began looking through Executive Orders searching for keywords.
11/16/2016 2:00-5:00 pm
*Completed searching through Executive Orders.
11/17/2016 4:00-6:00 pm
*Continued working on Google searches for state accelerator list. Looked through f6s for common words that can be used to distinguish accelerators once we have finalized the crawler.
11/21/2016 2:00-5:00 pm
*Randomly chose 10 accelerators from Excel list of accelerators on the RDP. Went through each website and listed the steps that I took in order to determine whether or not the website belonged to an accelerator. Will continue extracting cohort information tomorrow.
11/22/2016 4:00-6:00 pm
*Listed out all steps for extracting cohort information from the ten randomly chosen accelerators. Worked with Peter in order to build a tool that will search all of the HTMLs and attempt to identify each one as an accelerator as well as extract some basic information.
11/28/2016 2:00-5:00 pm
*Merged the F6S accelerator list with our other list, then posted it on the project page. Learned process for accelerator data extraction from Ed.
11/29/2016 4:00-6:00 pm
*Began process of collecting data from the 20 accelerators that I am responsible for.
11/30/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished 15/20.
12/1/2016 4:00-6:00 pm
*Continued collecting data from accelerators. Finished original 20, picked up a new set of 20.
12/2/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished next 20.
12/8/2016 1:00-3:00 pm
*Completed collecting data from accelerators for the semester.
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
[[Category:Work Log]]
