===Spring 2018===
<onlyinclude>
[[Shrey Agarwal]] [[Work Logs]] [[Shrey Agarwal (Work Log)|(log page)]]
1/23/18 15:00 - 17:00
*Became reacclimatized with the project, spoke with Ed about the direction for the rest of the semester
1/25/18 15:00 - 17:00
*Began examining the data on pulled webpages relating to demo days
1/26/18 13:00 - 17:00
*Began categorizing demo day pages based on: 1) relevance to accelerators, 2) relevance to the particular accelerator (got to 200)
1/30/18 15:00 - 17:00
*Continued working through the demo day pages, spoke with Ed about using the data to work a better set (got to 450)
2/01/18 15:00 - 17:00
*Finished the match and created pivot tables to count the number of repetitions (companies going through more than one accelerator)
2/06/18 15:00 - 17:00
*Discussed with Matthew the best way to collect the VC data from the repetitions. We tried different matches through our SDC data to no avail
2/08/18 15:00 - 18:00
*Continued attempting to match the different columns with the SDC data. It did not work without separating the data into individual files, a very tedious process.
2/13/18 15:00 - 17:00
*Spoke with Ed about the incubators project; we will begin as soon as we can time the accelerator startup investments. Ed is expecting us to begin sometime in the next two months, using a process similar to the one we used for incubators. The process should be handled by a new worker.
2/15/18 15:00 - 17:00
*Talked to Ed about next steps for the project. Practiced accessing the CrunchBase database on SQL and brushed up on SQL code.
2/16/18 13:00 - 17:00
*Sifted through the database for Crunchbase investment information.
2/20/18 15:00 - 17:00
*Pulled the funding rounds table from SQL and matched it with the companies that have received VC funding in order to gather round dates.
2/22/18 15:00 - 18:00
*Went through the matched data. Brainstormed ways to get the dates for cohort companies going through accelerators.
2/27/18 15:00 - 17:00
*Looked into using the WhoIs Parser in order to find when the companies went through their accelerators.
</onlyinclude>
===Fall 2017===
<onlyinclude>
9/19/17 15:00 - 17:00
*Became reacclimatized with the project, spoke with Ed about the direction for the rest of the semester
9/20/17 15:00 - 17:00
*Worked on setting up a new pull for the updated SDC data
9/21/17 15:00 - 17:00
*Finished the pull and sorted the data from the updated accelerator list
9/22/17 15:00 - 17:00
*Tried to set up the matcher with Matthew; ran into some difficulties in PowerShell, where the matcher returned a blank output file
9/26/17 15:00 - 17:00
*Finished the match and created pivot tables to count the number of repetitions (companies going through more than one accelerator)
9/27/17 15:00 - 17:00
*Discussed with Matthew the best way to collect the VC data from the repetitions. We tried different matches through our SDC data to no avail
9/28/17 16:00 - 17:00
*Continued attempting to match the different columns with the SDC data. It did not work without separating the data into individual files, a very tedious process.
9/29/17 15:00 - 17:00
*Spoke with Ed about the incubators project; we will begin as soon as we can time the accelerator startup investments. Ed is expecting us to begin sometime in the next two months, using a process similar to the one we used for incubators. The process should be handled by a new worker.
10/02/17 15:00 - 17:00
*Talked to Ed about next steps for the project. Practiced accessing the CrunchBase database on SQL and brushed up on SQL code.
10/03/17 15:00 - 17:00
*Sifted through the database for Crunchbase investment information.
10/04/17 15:00 - 17:00
*Pulled the funding rounds table from SQL and matched it with the companies that have received VC funding in order to gather round dates.
10/06/17 15:00 - 17:00
*Went through the matched data. Brainstormed ways to get the dates for cohort companies going through accelerators.
10/11/17 15:00 - 17:00
*Looked into using the WhoIs Parser in order to find when the companies went through their accelerators.
10/12/17 15:00 - 17:00
*Discovered that the Wayback Machine will not be a good option for identifying the time when a company went through the accelerator. Created a list of VC Companies and their earliest round date. Included a column for the date they went through their accelerators and will fill it in when we find a good method of finding this date.
10/16/17 15:00 - 17:00
*Continued working on sorting VCCompanies by their earliest round date.
10/17/17 15:00 - 17:00
*Worked with Ben to find a solution to our problem of data acquisition. Finalized earliest round date for VCCompanies.
10/18/17 15:00 - 17:00
*Updated our VC data with Ed's help in order to increase the accuracy and completion of our data.
10/19/17 15:00 - 17:00
*Organized all of our matched data and updated it in order to reflect the most recent SDC pull with Ed. Matched Crunchbase data with our cohort companies.
10/20/17 15:00 - 17:00
*Generated the new list of VCCompanies as well as their earliest round dates.
10/23/17 15:00 - 17:00
*Worked on sorting out the discrepancies in our matched data.
10/24/17 15:00 - 17:00
*Went through list of VCCompanies and began adding respective accelerators in order to proceed with VCPercentage table.
10/25/17 15:00 - 17:00
*Continued going through list of VCCompanies and adding accelerators.
10/26/17 15:00 - 17:00
*Continued going through list of VCCompanies and adding accelerators. Will have this completed on Monday.
10/30/17 15:00 - 17:00
*Finished adding all of the accelerators to the list of VCCompanies. Added a column indicating whether or not the company went through two or more accelerators.
10/31/17 15:00 - 17:00
*Began compiling data in the column for the dates that a specific company went through an Accelerator.
11/01/17 15:00 - 17:00
*Finalized entering dates for Y Combinator cohort companies.
11/02/17 15:00 - 17:00
*Continued entering cohort company dates into Excel file.
11/06/17 15:00 - 17:00
*Began looking at keywords for identifying the cohort class dates for each company
11/07/17 15:00 - 17:00
*Received list from Peter with the accelerator founders matched from the Crunchbase LinkedIn URLs and proceeded to find the links for those founders without a match on Crunchbase. Data found in "Unfound Founders List" in the Fall 2017 folder
</onlyinclude>
===Spring 2017===
01/17/17 14:00 - 16:00
*Finished up "accelerating" from [[Accelerator Seed List (Data)]], numbers 341-351
1/18/17 14:00 - 16:00
*Finished accelerating for sure, went back and began an overview of the work done for quality control.
01/20/17 14:00 - 16:00
*Mandatory meeting, then worked through 2 of Ed's unfinished accelerators
1/23/17 14:00 - 16:00
*Worked with Matthew to go over about 70 items in the accelerator list and ensure that they follow a uniform structure and show correct information
1/24/17 14:00 - 16:00
*Worked with Peter to fix the problem with results not coming through on the new spreadsheet by renaming the file and including more symbols in the searches. Spreadsheet should be up to date now.
*Got to number 144 on the list while going through files.
1/25/17 14:00 - 16:00
*Continued looking through the list and fixing wrong entries or reporting them
1/26/17 14:00 - 16:00
*Talked with Ed about project going forward and tried to access the Crunchbase API with Peter to crawl for start-up companies.
*Continued working through the accelerator list, stopped at number 186.
1/27/17 14:00 - 16:00
*Continued looking through accelerator list and fixing any entries with error. Got to number 261.
1/30/17 14:30 - 16:30
*Got through about 425
1/31/17 14:00 - 16:00
*Got to number 502
2/01/17 14:00 - 16:00
*Finished looking through the initial list of accelerators and writing down which ones needed to be modified or completed (through 551)
2/03/17 14:00 - 17:00
*Finished about 30 of the accelerator entries that still needed to be completed. Worked out of the "NOT DONE" file on the server (which is now blank because everything is finished)
2/06/17 14:00 - 16:00
*Developed a standardized format for the text files with Matthew. Instructions are under "standardized format" in the accelerator seed list portion. I started at number 226 and standardized formats up until 370.
2/07/17 14:00 - 16:00
*Continued work from yesterday, completed up to number 488 from the list. Will likely need one more day to finish.
2/08/17 14:00 - 16:00
*Finished standardizing the txt files for use in the Excel spreadsheet, compiled the data and examined the resultant tables. Realized we needed to fix some categories in the cohort files.
2/09/17 14:00 - 17:00
*Worked with Ed on a side project gathering information on climate change, prompted by Baker's article in the Wall Street Journal
*Gathered information on climate change in relation to high-growth, high-risk innovation and organizations that deal with things such as carbon credits
2/10/17 14:00 - 17:00
*Realized that blog post was ambitious because we could not really find a clear purpose from the information we gathered, nor could we find a unique angle. Held off on the idea
*Went back to organizing the new columns and headers in the text file by identifying areas of error in the Excel spreadsheet
2/15/17 14:00 - 16:00
*Spoke with Ed about free enterprise while he lectured all of us. It took about an hour.
*Looked at plans for the project going forward, including using LinkedIn to search for the founders
2/20/17 14:00 - 16:00
*Found our first source for expanding the project into incubators, from angel.co. Seems similar to f6s in that we can crawl it and obtain a list of incubators and their various counterparts.
2/21/17 14:00 - 16:00
*Found more sources for incubators by reading through Quora discussions and master's theses. Bookmarked these pages so that I could put them into text files afterwards.
2/23/17 14:00 - 18:00
*Converted the incubator files to TextPad text files and saved them (4 total), then cleaned them up with regex
*Took the cohort text file, put it into Excel, and proceeded to clean up all of the mistakes in the Excel document, particularly bad data or mistakes with organizations. Got through Y Combinator.
2/24/17 14:00 - 16:00
*Finished cleaning the cohort data for the names and the descriptions, but the other fields, such as dates and programs, still need work
2/28/17 14:00 - 16:00
*Created page [[Hub-Based Venture Firms]] and proceeded to research VC in Hubs listed under E:\McNair\Projects\Hubs\summer 2016\Hubs Variables - Ariel.xls
*Looked at details such as whether they have in-house funds, whether they co-invest, focuses, and amounts invested.
3/01/17 14:00 - 16:00
*Worked with Ben and Matthew to pull data from SDC for all VC funded companies and normalized it to put it in an Excel document.
3/02/17 14:00 - 16:00
*Tried to repeat the VC data pull without it crashing from pulling too many entries. Unfortunately, we were unable to finish it
3/06/17 14:00 - 16:00
*Worked with Matthew to put final touches on the cohort data to prep it for matching with our VC data
3/07/17 14:00 - 16:00
*Finally finished working on the cohort files, will match on the 8th
3/08/17 14:00 - 16:00
*Matched the VC Data with the list of Cohort Companies and got one list of all cohort companies that have received VC funding.
3/20/17 14:00 - 16:00
*Participated in a SQL training session with Ed, learned how to create a database and to pull tab delimited information from text files onto a table
3/21/17 14:00 - 16:00
*Met with Ed and agreed to finish a draft of the report by the end of the semester. Put the initial report information on the accelerator page using the variables that we currently have
3/22/17 14:00 - 16:00
*Worked with Matthew to compile tables in our database of the matched VC-portfolio company lists and the overall accelerator cohort information. Found multiple errors in the cohort file which needed to be fixed before finishing the tables and analyzing the data
3/23/17 14:00 - 16:00
*Finished cleaning the cohort file once again.
3/24/17 14:00 - 16:00
*Continued practicing my SQL and creating the code for compiling the tables
3/29/17 14:00 - 16:00
*Worked on the matched data with Matthew. Will run the RegEx code that will filter the URLs, and I will look through the duplicates where two different VC-backed company names matched to one cohort company name
3/30/17 14:00 - 16:00
*Examined the Regex code for the URLs and attempted to filter them out
4/03/17 14:00 - 16:00
*Continued learning some SQL from Ed
4/04/17 14:00 - 16:00
*Began examining the Crunchbase data; looked through the 2013 snapshot
*Created a new Crunchbase account for the McNair Center and examined the basic access, which does not give us much information
4/05/17 14:00 - 16:00
*Made the final VC percentage table from our database and previous code with Ed; realized we were missing many accelerators as well as a lot of important cohort data, so we need to reexamine our previous data.
4/06/17 14:00 - 16:00
*Continued looking through Crunchbase to see how we can pull accelerators up until 2013; most likely will use objects to sort the data into accelerators, perhaps keywords from "accelerators"
4/07/17 14:00 - 16:00
*Examined SARP and attempted to match their accelerators with the ones from our data; realized that a few of our cohorts were missing, as well as a few of the actual accelerators, so we need to fix the data in our Excel file
*Began compiling a list of missing accelerators in TextPad to later insert into our Excel file.
4/10/17 13:00 - 16:00
*Worked with Ben to find missing accelerators from the Crunchbase data using the keywords. Also, began recording information from some of the big accelerators we were missing
*Found 228 matches for accelerators, will match from our list to find the similarities
4/11/17 14:00 - 16:00
*Finished compiling the accelerator and cohort information for the few we found from SARP, will consult Ed to figure out how to approach the missing accelerators and what to do for the preliminary report
===Fall 2016===
09/27/2016 14:00 - 17:00
*Set up personal and work log pages, accessed Remote Desktop.
*Used Google searches to identify more sources, and evaluated three databases with the help of TextPad
*Began working on more generic Google searches. Was able to go through "Location+accelerator"-type searches today. Will continue next time.
[[Category:Internal]]
10/18/2016 14:00 - 17:30
*Work continued in [[Accelerator Seed List (Data)]]
12/08/16 14:00 - 17:00
*Continued working on accelerator list on the same page.
01/17/17 14:00 - 16:00
*Finished up "accelerating" from [[Accelerator Seed List (Data)]], numbers 341-351
[[Category:Work Log]]