2,233 bytes added ,  12:31, 6 October 2020
{{Project|Has project output=Data,Tool|Has sponsor=McNair Center
|Has title=U.S. Seed Accelerators
|Has owner=Connor Rothschild,
|Does subsume=Accelerator Data, Accelerator Seed List (Data),
}}
<onlyinclude>The [[U.S. Seed Accelerators]] project subsumes several related projects. These projects were intended to assemble near-population data on high-growth high-tech seed accelerators in the U.S. and understand how to automate the data collection process. As such, the project includes both a dataset and prototypes. Some of the prototypes were used in the [[Kauffman Incubator Project]].</onlyinclude>
==Project Location==
The master file can be found at
/bulk/McNair/Projects/Accelerators/Summer 2018/'''The File to Rule Them All.xlsx'''
 
Note that TFTRTA-AcceleratorFinal.txt in E:\projects\accelerators was updated to include all creation dates and dead dates.
==Relevant Former Projects==
==Update for Hira==
 
===Final MTurk Push===
 
Minh and I pushed a final batch of HITs to MTurk. We found that, even after MTurk, our data were missing timing info for around 1000 companies. Upon further inspection, we realized that around 800 of these companies belonged to only ~10 accelerators. We think the problem was that Google returns the most recent results first, so we missed old cohorts for large accelerators. We therefore re-ran Minh's crawler on these accelerators with different year parameters and got 650 results.
 
Upon pushing these to MTurk, we got good results for 144 of them. We arrived at this number by filtering out results with no companies listed, no date listed, or no accelerator listed (after searching manually), then removing duplicates and accelerators we do not care about. These 144 results collectively list 1,538 companies.
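The filtering described above can be sketched roughly as follows. This is only an illustration, not the procedure we actually ran: the field names (accelerator, cohort_date, companies) and the excluded set are hypothetical, and the real sheet may be laid out differently.

```python
def clean_turk_results(rows, excluded=frozenset()):
    """Drop incomplete rows, unwanted accelerators, and duplicates.

    rows: list of dicts with hypothetical keys
          "accelerator", "cohort_date", and "companies".
    excluded: lower-cased accelerator names to discard (assumed input).
    """
    seen = set()
    cleaned = []
    for row in rows:
        accelerator = (row.get("accelerator") or "").strip()
        cohort_date = (row.get("cohort_date") or "").strip()
        companies = (row.get("companies") or "").strip()

        # Skip results missing an accelerator, a date, or a company list.
        if not (accelerator and cohort_date and companies):
            continue

        # Skip accelerators we do not care about.
        if accelerator.lower() in excluded:
            continue

        # Treat rows as duplicates if all three fields match, ignoring case.
        key = (accelerator.lower(), cohort_date.lower(), companies.lower())
        if key in seen:
            continue
        seen.add(key)

        cleaned.append({
            "accelerator": accelerator,
            "cohort_date": cohort_date,
            "companies": companies,
        })
    return cleaned
```

Matching duplicates on all three fields is a conservative choice: two results for the same accelerator with different dates are kept as separate cohorts.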
 
This file can be found here:
/bulk/McNair/Projects/Accelerators/Summer 2018/Final Turk Push.xlsx
 
The next step is to plug this sheet into Grace's Python script, which gives each company its own row so that the sheet can be merged with our other data.
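For readers without access to Grace's script, the one-row-per-company step might look something like the sketch below. It is not her implementation: the field names and the semicolon delimiter are assumptions made for illustration.

```python
def explode_companies(rows, sep=";"):
    """Turn one row per cohort into one row per company.

    rows: list of dicts with hypothetical keys
          "accelerator", "cohort_date", and "companies",
          where "companies" is a sep-delimited string (assumed format).
    """
    exploded = []
    for row in rows:
        for name in row["companies"].split(sep):
            name = name.strip()
            if not name:
                # Ignore empty entries left by trailing or doubled delimiters.
                continue
            exploded.append({
                "accelerator": row["accelerator"],
                "cohort_date": row["cohort_date"],
                "company": name,
            })
    return exploded
```

Each output row repeats the accelerator and date of its source row, so the result can be joined to other company-level data on the accelerator and company names.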
 
===Manual Searching===
 
For the other 170 companies that lacked timing info (and were not worth crawling for, because few companies were assigned to each accelerator), McNair Center interns searched manually. We found timing information for 128 of the 170.
 
The sheet can be found here:
https://docs.google.com/spreadsheets/d/1hGgxNwLph0tWtqO_8bNUGM-kzVXTeb-N26ojwL3TTuk/edit?usp=sharing
 
It is ready to be merged with our existing data.
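One way the manually collected timing information could be merged into the existing data is sketched below. The matching key (accelerator plus company name, case-insensitive) and the field names are assumptions; the actual merge may need fuzzier name matching.

```python
def merge_timing(existing, found):
    """Fill missing cohort dates in existing from found.

    Both arguments are lists of dicts with hypothetical keys
    "accelerator", "company", and "cohort_date".
    Dates already present in existing are never overwritten.
    """
    def key(rec):
        return (rec["accelerator"].strip().lower(),
                rec["company"].strip().lower())

    # Index the manually found dates, skipping entries with no date.
    lookup = {key(r): r["cohort_date"] for r in found if r.get("cohort_date")}

    merged = []
    for rec in existing:
        rec = dict(rec)  # copy so the input list is left untouched
        if not rec.get("cohort_date"):
            rec["cohort_date"] = lookup.get(key(rec))
        merged.append(rec)
    return merged
```

Companies with no match keep an empty date, which makes it easy to count how many are still missing timing info after the merge.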
===Recoded Founders' Experience===
