===Fall 2017===
<onlyinclude>
[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]

9/11/2017 2:00-5:00 pm
*Spoke to Ed about the project going forward. Organized the current updated data for our project.

9/12/2017 3:00-5:00 pm
*Began going through the Cleaned Cohort Data Excel file and found a few problems with it. Will continue the cleaning process for the rest of the week.

9/13/2017 2:00-5:00 pm
*Sorted through Cleaned Cohort Data and finalized our List of Accelerators. We can begin the process of creating our PercentVC table.

9/14/2017 3:00-5:00 pm
*Completely finalized our dataset of accelerators and startups. Met with Michelle Passo to discuss objectives of the research for credit course.

9/18/2017 2:00-4:00 pm
*Talked with Peter about the LinkedIn crawler data. Went through VC page that Meghana sent me.

9/19/2017 3:00-5:00 pm
*Completed SDC pull of updated VC Data.

9/20/2017 2:00-5:00 pm
*Attempted several times to run the Matcher. Cleaned our pulled data.

9/21/2017 3:00-5:00 pm
*Came extremely close to running the Matcher correctly. Reviewed the final LinkedIn data from Peter.

9/25/2017 2:00-5:00 pm
*Finalized the matched file of accelerator companies with VC portfolio companies. Gave Ben the data on Georgia accelerators.

9/26/2017 3:00-5:00 pm
*Worked on finding the duplicates in our Matched file in order to have the most accurate data.

9/27/2017 2:00-5:00 pm
*Attempted to find a way to organize the duplicate matches.

9/28/2017 4:00-5:00 pm
*Continued running through matched data in order to organize it effectively.

10/2/2017 2:00-5:00 pm
*Talked to Ed about next steps for the project. Practiced accessing the crunchbase database on SQL. Brushed up on SQL code.

10/3/2017 3:00-5:00 pm
*Searched the database for crunchbase investment information.

10/4/2017 2:00-5:00 pm
*Pulled the funding rounds table from SQL and matched it with the companies that have received VC funding in order to gather round dates.

10/6/2017 3:00-5:00 pm
*Went through the matched data. Brainstormed ways to get the dates for all cohort companies going through accelerators.

10/11/2017 2:00-3:30 pm
*Looked into using the WhoIs Parser in order to find when the companies went through their accelerators.

10/12/2017 3:00-5:00 pm
*Discovered that the Wayback Machine will not be a good option for finding when companies went through their accelerators. Created a list of VCCompanies and their earliest round date. Included a column for the date they went through their accelerators and will fill it in when we find a good method of finding this date.

10/16/2017 2:00-3:30 pm
*Continued working on sorting VCCompanies by their earliest round date.

10/17/2017 3:00-5:00 pm
*Worked with Ben to find a solution to our problem of data acquisition. Finalized earliest round date for VCCompanies.

10/18/2017 2:00-5:00 pm
*Updated our VC data with Ed's help in order to increase the accuracy and completeness of our data.

10/19/2017 3:00-5:00 pm
*Organized all of our matched data and updated it in order to reflect the most recent SDC pull with Ed. Matched Crunchbase data with our cohort companies.

10/20/2017 2:00-3:30 pm
*Generated the new list of VCCompanies as well as their earliest round dates.

10/23/2017 2:00-3:30 pm
*Worked on sorting out the discrepancies in our matched data.

10/24/2017 3:00-5:00 pm
*Went through list of VCCompanies and began adding respective accelerators in order to proceed with VCPercentage table.

10/25/2017 2:00-5:00 pm
*Continued going through list of VCCompanies and adding accelerators.

10/26/2017 3:30-5:30 pm
*Continued going through list of VCCompanies and adding accelerators. Will have this completed on Monday.

10/30/2017 2:00-3:30 pm
*Finished adding all of the accelerators to the list of VCCompanies. Added a column indicating whether or not the company went through two or more accelerators.

10/31/2017 3:00-5:00 pm
*Began compiling data in the column for the Date Company went through Accelerator.

11/1/2017 2:00-4:00 pm
*Finalized entering dates for the Y Combinator cohort companies.

11/2/2017 4:00-5:30 pm
*Continued entering cohort company dates into Excel file.

11/6/2017 2:00-4:00 pm
*Continued entering cohort company dates into Excel file. Began compiling a list of keywords for demo day press releases.

11/7/2017 3:00-5:00 pm
*Finished coming up with keywords for demo day crawler. Sent the final list to Peter.

11/8/2017 2:00-3:30 pm
*Spoke to Ed and organized all of our current data.

11/9/2017 3:00-5:00 pm
*Created a new project page called Accelerator Data and listed all relevant files as well as descriptions.

11/14/2017 3:00-5:00 pm
*Looked up URLs and decided whether or not the website was relevant.

11/15/2017 2:00-5:00 pm
*Created SQL database entitled "acceleratordata" and began creating tables from folder of All Relevant Files.

11/16/2017 3:00-5:00 pm
*Continued to input tables into SQL database.

11/20/2017 2:00-5:00 pm
*Cleaned text files in order to import tables into SQL database.

11/27/2017 2:00-5:00 pm
*Worked with Peter to find and exclude irrelevant keywords on HTML pages. Began categorizing relevant demo day pages.

11/28/2017 3:00-5:00 pm
*Finished inputting tables of relevant files into SQL database.

11/29/2017 2:00-5:00 pm
*Went through accelerator HTML URLs. Spoke with Ed about going through HTMLs and classifying based on overall and specific relevance.

12/1/2017 3:00-5:00 pm
*Worked through accelerator links and classified pages based on whether or not they provided relevant information about startup timing.

12/4/2017 10:00-12:00 pm
*Continued running through demo day crawl URLs and scoring them based on relevance.

12/7/2017 1:00-4:30 pm
*Finalized scoring of demo day URLs for the original crawl. Last day of work for this semester.
</onlyinclude>
===Spring 2017===

1/18/2017 1:00-5:00 pm
*Continued collecting data for accelerator project. Helped Catherine draft tweets for the McNair Center Twitter account.

1/20/2017 1:00-3:00 pm
*Continued collecting data on accelerators. Attended McNair Center team meeting.

1/23/2017 1:00-5:00 pm
*Began combing through accelerator list, determining which accelerators are still missing data and documenting these in a TextPad file. Finished through #115.

1/25/2017 1:00-5:00 pm
*Continued looking through accelerator list.

1/27/2017 1:00-3:00 pm
*Continued going through accelerator list. Left off on #226 with Shrey.

1/20/2017 1:00-5:00 pm
*Continued going through accelerator list. Finished through #440.

2/1/2017 1:00-5:00 pm
*Finished going through the list of accelerators looking for incomplete files. Began completing the files that were not done.

2/3/2017 1:00-3:00 pm
*Continued working on completing accelerator files.

2/6/2017 1:00-4:30 pm
*Finished data set of accelerators. Began going through and making sure that all text files and cohort files are of the same format so Peter can easily pull the information. Left for 30 minutes for an interview from 2:30-3:00 pm.

2/8/2017 1:00-5:00 pm
*Finished formatting through #137. Spoke with Ed about project.

2/13/2017 1:00-5:00 pm
*Completed formatting for all accelerator text files.

2/15/2017 3:00-5:00 pm
*Made copy of the completed data set. Spoke to Ed about future steps to take for project including gathering founder data and obtaining the crunchbase api.

2/17/2017 1:00-3:00 pm
*Went through final Excel spreadsheet for cohort information. Still need to run the crawler one more time after the completion of the editing process. Found the application for the crunchbase api which will hopefully allow us to gain access.

2/20/2017 1:00-5:00 pm
*Filled out another application for Crunchbase research access; Found the first source for the incubator project on angel.co, will hopefully work with Peter to make a crawler similar to f6s

2/22/2017 1:00-5:00 pm
*Pulled data from SDC for Ed and normalized it. Learned how to use SDC and the normalizer.

2/24/2017 1:00-3:00 pm
*Finished cleaning up the cohort data for Y-combinator on the Final Cohort Excel Spreadsheet.

2/27/2017 1:00-5:00 pm
*Continued cleaning up the cohort data in the Excel file. Finished Cohort Number and Year.

3/1/2017 2:00-5:00 pm
*Worked with Ben and Shrey to pull data from SDC for all VC funded companies and normalized it to put it in an Excel document.

3/3/2017 1:00-2:30 pm
*Worked with Ben to try and repeat down the VC data without it going too far.

3/6/2017 1:00-4:00 pm
*Worked with Shrey to finish cleaning the cohort data. It is ready to be run through the matcher with Ben.

3/8/2017 1:00-5:00 pm
*Matched the VC Data with the list of Cohort Companies and got one list of all cohort companies that have received VC funding.

3/10/2017 12:00-2:00 pm
*Put a write-up on the top of the Accelerator wiki page detailing where we are in the project currently as well as what data we have accumulated on the RDP.

3/20/2017 1:00-5:00 pm
*Began gathering the URLs of all accelerators in a TextPad file called Accelerator URLs. Participated in the SQL training session.

3/22/2017 1:00-5:00 pm
*Made tables in Terminal for Accelerator companies matched with VC companies and for Cohort Data.

3/27/2017 1:00-4:00 pm
*Compiled all URLs of accelerators into a TextPad file.

3/29/2017 1:00-5:00 pm
*Worked on the matched data with Ben. Next time I will run the RegEx code that will filter the URLs, and I will look through the duplicates where two different VC backed company names matched to one cohort company name.

3/31/2017 1:00-2:00 pm
*Ran the code for accelerator urls which are ready to be run through the wayback machine in order to get the start dates. Also began looking through vc backed company names.

4/3/2017 1:00-5:00 pm
*Continued looking through double matched VC companies. Learned more SQL from Ed.

4/5/2017 1:00-5:00 pm
*Made the final vc percentage table on terminal and for next time I will collect missing accelerator data.
4/7/2017 1:00-3:00 pm
*Began collecting cohort data for big accelerators that were missing from our list in order to add it to our final list of cohort companies.

4/10/2017 1:00-5:00 pm
*Finished gathering cohort company names for big accelerators that we were missing and put them into the Cleaned Cohort Companies Excel file. Ben is looking through Crunchbase data in order to possibly find more missing accelerators.

4/14/2017 1:00-4:00 pm
*Began working through "Crunchbase Potential Accelerators" textpad that may contain missing accelerators and wrote notes on the ones that I was able to go through. Need to finish this textpad before moving forward.

4/17/2017 1:00-4:00 pm
*Continued going through potential Crunchbase accelerators that we may have missed. Talked to Ed about getting a more comprehensive list from Excel file and by the end of the semester have the tables and data collected and done.

4/19/2017 1:00-4:00 pm
*Worked with Jeemin to generate an entire list of potential US accelerators from crunchbase. Worked to find a way to classify accelerators just based on their descriptions.

4/21/2017 1:00-4:00 pm
*Continued working through the list identifying accelerators that we do not have. Ramee and Juliette are now helping us gather cohort data for those missing accelerators.

4/24/2017 9:00-1:00 pm
*Updated Veeral on current state of project. Typed up a to-do list on the discussion wiki for Veeral. Got new cohort data on an accelerator and added it to Excel file.

5/3/2017 11:00-1:00 pm
*Talked to Ed and Anne about future report. Continued working through list of crunchbase potential accelerators. Last day of work for this semester.
===Fall 2016===

10/17/2016 2:00-5:00 pm
*Created personal wiki page as well as work log; Read about the research project to which I have been assigned; Wrote a short summary of what I believe it is and included some helpful links

10/18/2016 4:00-6:00 pm
*Met with research partner Shrey who filled me in on where we are with the project; Began looking on websites of certain accelerators for how to determine their cohorts and listed these steps on the wiki

10/19/2016 2:00-5:00 pm
*Finished looking on the remaining accelerator websites and wrote the steps on determining how to manually locate the cohorts.

10/20/2016 4:00-6:00 pm
*Met with Peter and Christy to discuss the possibility of creating a web crawler that will pull data from individual accelerator sites.

10/24/2016 2:00-5:00 pm
*Brainstormed with Albert and Julia about changes to the category name for SBDE. Spoke to Ed about full scope of accelerator project.

10/25/2016 4:00-6:00 pm
*Brainstormed with Shrey about different potential industry focuses within accelerators, as well as different variables to search for in terms of accelerators, startups, cohorts, etc.

10/26/2016 2:00-5:00 pm
*Began searching for more databases including lists of accelerators as well as some characteristics of those accelerators; Began searching for characteristics that identify accelerators on their websites

10/27/2016 4:00-6:00 pm
*Continued searching for relevant lists of accelerators to include on our page. Added some links that have high potential under the tab (Obtained from List of Accelerators or various Google searches).

10/31/2016 2:00-5:00 pm
*Began constructing a list of variables that clearly distinguish an accelerator on its website. This is in an effort to allow a crawler to crawl through many Google searches and identify accelerators.

11/1/2016 4:00-6:00 pm
*Continued looking for variables that could identify accelerators from their websites. Searched through numerous different websites of accelerators obtained from our current databases.

11/2/2016 2:00-4:00 pm
*Continued combing through websites of numerous accelerators, well-known and other, in the hopes of finding identifying variables.

11/3/2016 4:00-6:00 pm
*Finalized my list of variables that could be used to distinguish the websites of accelerators. Slightly re-arranged our list of accelerator databases in order of relevance.

11/7/2016 2:00-5:00 pm
*Began compiling the list of all accelerators. Created a new TextPad document with information from a new database.

11/8/2016 4:00-6:00 pm
*Worked with Shrey and Ben in order to compile all of our accelerator databases into one long list on Textpad.

11/9/2016 2:00-5:00 pm
*Continued formulating a database for all accelerators and all of the available info given.

11/10/2016 4:00-6:00 pm
*Worked with Shrey and Peter in order to develop a crawler for f6s.

11/14/2016 2:00-5:00 pm
*Began sorting the Seed-DB database in an Excel document.

11/15/2016 4:00-6:00 pm
*Conducted some Google searches in an attempt to find more accelerator databases. Began looking through Executive Orders searching for keywords.

11/16/2016 2:00-5:00 pm
*Completed searching through Executive Orders.

11/17/2016 4:00-6:00 pm
*Continued working on Google searches for state accelerator list. Looked through f6s for common words that can be used to distinguish accelerators once we have finalized the crawler.

11/21/2016 2:00-5:00 pm
*Randomly chose 10 accelerators from Excel list of accelerators on the RDP. Went through each website and listed the steps that I took in order to determine whether or not the website belonged to an accelerator. Will continue extracting cohort information tomorrow.

11/22/2016 4:00-6:00 pm
*Listed out all steps for extracting cohort information from the ten randomly chosen accelerators. Worked with Peter in order to build a tool that will search all of the HTMLs and attempt to identify each one as an accelerator as well as extract some basic information.

11/28/2016 2:00-5:00 pm
*Merged the F6S accelerator list with our other list, then posted it on the project page. Learned process for accelerator data extraction from Ed.

11/29/2016 4:00-6:00 pm
*Began process of collecting data from the 20 accelerators that I am responsible for.

11/30/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished 15/20.

12/1/2016 4:00-6:00 pm
*Continued collecting data from accelerators. Finished original 20, picked up a new set of 20.

12/2/2016 2:00-5:00 pm
*Continued collecting data from accelerators. Finished next 20.

12/8/2016 1:00-3:00 pm
*Completed collecting data from accelerators for the semester.

[[Matthew Ringheanu]] [[Work Logs]] [[Matthew Ringheanu (Work Log)|(log page)]]
[[Category:Work Log]]
