{{Project|Has project output=Data,Tool|Has sponsor=McNair Center|Has title=Accelerator Seed List (Data)|Topic Area=Entrepreneurship Ecosystems|Has owner=Shrey Agarwal, Matthew Ringheanu, Veeral Shah, Connor Rothschild|Has start date=Fall 2016|Has keywords=Accelerators,Data|Has project status=Subsume|Is dependent on=Industry Classifier
}}
=Current Work=
 
===As of 05/21/2018, the Google Sheet workbook has been downloaded to the E drive. The resulting Excel workbook is saved at E:\McNair\Projects\Accelerators\Summer 2018\Accelerator Master Variable List.xlsx and is now the master file.===
 
Google Master Sheet: https://docs.google.com/spreadsheets/d/1ikuxYwp9JIRrjz4qQcbdwTpbHOne-q2PterYTjzofjw/edit?ts=5aa2f1f9#gid=0
*Cross-reference sheet with data from Peter's old accelerator consolidation file ("accelerator_data_noflag" and "accelerator_data" in "All Relevant Files") and fill in missing data
*Variables that are 100% NOT in these 2 files:
**Cohort Breakout?
**Subtype
**Designed for Students?
**Campuses
**Stage
**Software Tech
**What stage do they look for?
 
TODO:
McNair/Projects/Accelerators/Fall 2017/unfound_founders.txt
A 0 means we don't have founder data for that accelerator.
Specs: A tab-delimited text file with the following fields:
Accelerator, First Name, Last Name, LinkedInURL (if possible)
Getting the LinkedInURL will ensure accuracy, but the file will work without it.
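
A minimal sketch of writing a founders file in the layout specified above. The header line, the function name <code>write_founders</code>, and the sample rows are assumptions made for illustration; they are not taken from the project files.

```python
import csv
import io

# Field order follows the spec above; writing a header line is an assumption.
FIELDS = ["Accelerator", "First Name", "Last Name", "LinkedInURL"]

def write_founders(rows, handle):
    """Write founder rows as tab-delimited text, one founder per line."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(FIELDS)
    for row in rows:
        accelerator, first, last = row[:3]
        linkedin = row[3] if len(row) > 3 else ""  # the URL is optional
        writer.writerow([accelerator, first, last, linkedin])

# Hypothetical example rows, not real data.
buffer = io.StringIO()
write_founders(
    [
        ("Example Accelerator", "Jane", "Doe", "https://www.linkedin.com/in/example"),
        ("Example Accelerator", "John", "Roe"),
    ],
    buffer,
)
founders_text = buffer.getvalue()
```

Leaving the last field empty rather than dropping it keeps every line at four columns, which makes the file easier to load later.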
 
 
*Shrey: Find "demo day" keywords, so that we can search AcceleratorName Year Keyword and get back potential demo day pages
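
The search pattern described above (AcceleratorName Year Keyword) could be generated along these lines. This is only a sketch: the keyword list is a placeholder, since the actual "demo day" keywords still need to be found, and <code>demo_day_queries</code> is a hypothetical helper name.

```python
from itertools import product

# Placeholder keywords (assumption); replace with the keywords once identified.
DEMO_DAY_KEYWORDS = ["demo day", "pitch day", "showcase"]

def demo_day_queries(accelerators, years, keywords=DEMO_DAY_KEYWORDS):
    """Build one search string per (accelerator, year, keyword) combination."""
    return [
        f"{name} {year} {keyword}"
        for name, year, keyword in product(accelerators, years, keywords)
    ]

queries = demo_day_queries(["Techstars"], [2016, 2017])
```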
 
 
==Accelerator Type project==
 
The file to edit is called "Accelerator type list" and is located in the folder E:\McNair\Projects\Accelerators\Spring 2018\Grouping project of ListOfAccs. More systematic information and instructions are in "Instructions for Accelerator type project" in the same folder.
 
NOTE: Until we get through all 270 accelerators, we will just sort each accelerator into one of the following three categories as quickly as possible, with short notes in the "other info" column. Once that is done, we will go back through the ones that aren't categorized and add notes to the "other info" column.
 
 
Type list:
*Private
*Corporate
*Academic
Note: if DEAD, noted here.
 
 
Other info:
*nonprofit? (y/n)
 
*Subtype abbreviations:
**S: for if a social entrepreneurship initiative
**I: for if an incubator
**A: for an angel group
**F: for foreign
**C: for in coworking space/hub/etc
**V: for if part of venture fund
**G: for if government funded/partnered
**T: for international
 
 
Note: subtypes (from individual text files in E:\McNair\Projects\Accelerators\Spring 2017\Code+Final_Data) were only found for 23 of the 270 accelerators. These accelerators were initially intended to be removed from the master list. Remaining subtypes are currently being added.
 
other info:
 
international offices, founders, industries, org type, program duration, or other interesting, easily accessed variables. Additional information is especially important for accelerators that have no other subtype abbreviation listed.
 
 
===Steps to research an accelerator===
 
1. Copy/paste URL listed in Accelerator type list file into google. If website is insufficient, try googling:
the name of the accelerator
the name of the accelerator + "crunchbase"
the name of the accelerator + "nonprofit"
 
the above steps sometimes lead to other helpful databases/news articles
 
2. Note whether:
1) Academic/Corporate/Private
2) For Profit/Nonprofit. Sometimes this isn't directly stated but can be inferred from their description of, say, their investment process. If they don't address this at all, it's probably For Profit.
3) subtype (S, I, A, F, C, V, G, T).
4) Additional, easily-accessed info. Number 4 is really important if there's no subtype.
 
All 270 need to be done by the end of the semester.
 
 
The type list file is saved as
"Accelerator type list" in E:\McNair\Projects\Accelerators\Spring 2017\Grouping project of ListOfAccs.
ListofAccs, the list from which we drew the Accelerator type list, should have no matches with any of the flagged accelerators in E:\McNair\Projects\Accelerators\Spring 2017\Code+Final_Data. There are 23 matches, though, so all subtypes must be searched and entered manually. Whether some accelerators are nonprofits is recorded in a file called "whether nonprofit..." in E:\McNair\Projects\Accelerators\Spring 2017\Grouping project of ListOfAccs. Accelerators with no nonprofit information there need to have it entered manually.
 
=Funded By Accelerators=
 
Reference the like-named portion in [[Crunchbase Data#Funded by Accelerators|Crunchbase Data]]
 
=End of Semester Report=
The end of semester report will focus on ranking accelerators and environments based on the variables we have gathered. Our primary form of categorization will be ranking individual accelerators based on their venture capital raise rate. We can probably generate information over time for accelerators and the amount of VC they raised to get a sense of what locations have developed in the past five years from the dates of transactions recorded by SDC. To obtain these rankings, we will identify which cohorts companies were trained in, as well as complete details of the accelerator and the details of cohort companies. We will focus only on accelerators because there are many other entities in each ecosystem. We will also utilize information on IPO or acquisition by companies, obtained through Crunchbase, to gain some sense of how successful startups emerging from a particular accelerator are. To obtain the data over time, we will need to fill out the cohort date information column in our cohort data, which will require the help of either Crunchbase or the Wayback machine for older accelerators. In ranking the accelerators across regions, we can also track industry-specific hotspots for accelerators such as medicine in Memphis or technology in San Francisco.
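
As a rough illustration of the ranking idea, the sketch below treats an accelerator's "venture capital raise rate" as the share of its cohort companies that appear in the matched VC data. That definition, the function name <code>raise_rates</code>, and the sample data are assumptions for illustration only.

```python
from collections import defaultdict

def raise_rates(cohort_companies, vc_backed):
    """Rank accelerators by the share of cohort companies found in the VC data.

    cohort_companies: iterable of (accelerator, company) pairs
    vc_backed: set of company names that matched a VC record
    """
    totals = defaultdict(int)
    funded = defaultdict(int)
    for accelerator, company in cohort_companies:
        totals[accelerator] += 1
        if company in vc_backed:
            funded[accelerator] += 1
    rates = {acc: funded[acc] / totals[acc] for acc in totals}
    # Highest raise rate first.
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)

# Hypothetical sample data.
sample = [
    ("Acc A", "Co1"), ("Acc A", "Co2"),
    ("Acc B", "Co3"), ("Acc B", "Co4"), ("Acc B", "Co5"), ("Acc B", "Co6"),
]
ranking = raise_rates(sample, {"Co1", "Co3"})
```

A rate computed this way favors accelerators with few cohort companies, so a minimum cohort size may be worth adding before ranking.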
 
To complete the report, we need to fill information in:
*Industry and focus
*Location
*Name, description
*Matched VC data
*Founder information (maybe)
 
=Overview=
This project is developing broad and near-population data on accelerators and their cohort companies. The objective is to identify which cohorts of which accelerators a cohort company was trained in, obtain details of the accelerators, and obtain details of the cohort companies, including information about any venture capital investment that the cohort company might have received and any IPO or acquisition the company may have experienced.
 
The primary use of this data is for an academic paper detailed on the [[Matching Entrepreneurs to Accelerators and VCs (Academic Paper)]] page.
 
However, this project can also provide useful data to other academic papers ([[Urban Start-up Agglomeration]], [[Hubs (Academic Paper)]], and [[Hubs Scorecard (Academic Paper)]]), projects ([[Houston Entrepreneurship]]) and blog posts (under the [[Emerging Ecosystems]] umbrella project).
 
This project needs the results of the [[Industry Classifier]], [[Whois Parser]], and other tools.
 
=Current Project Write-Up=
 
==Things To Do==
*Obtain all URLs for accelerators in order to run through the Wayback Machine to find out when they started.
*Match Crunchbase Data with our Accelerator List to see if they have any accelerators that we do not.
*Obtain an example of an accelerator that started early and has multiple companies but does not separate them into cohorts, and figure out a way to determine which companies went through each cohort.
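
For the Wayback Machine item above, one possible approach is to ask the Internet Archive's CDX server for the earliest capture of each accelerator URL. The sketch below only builds the query URL and reads the year from a returned timestamp. The endpoint and the <code>url</code>, <code>fl</code>, and <code>limit</code> parameters reflect our understanding of that service and should be checked against its documentation; the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Internet Archive CDX server.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"

def earliest_capture_query(site_url):
    """Build a query asking for the timestamp of the first listed capture."""
    params = {"url": site_url, "fl": "timestamp", "limit": "1"}
    return f"{CDX_ENDPOINT}?{urlencode(params)}"

def capture_year(timestamp):
    """Extract the year from a YYYYMMDDhhmmss capture timestamp."""
    return int(timestamp[:4])

query = earliest_capture_query("example.com")
```

The earliest capture only shows that the site existed by that date, so it is an upper bound on when the accelerator started rather than the start date itself.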
 
==What Each File in the "Accelerator" Folder on the RDP Contains==
*"Accelerator List Sources" (Folder) - This folder contains most of the sources that we pulled accelerator names from at the very beginning of the project.
*"Code+Final_Data" (Folder) - This folder contains Peter's code for pulling the data from the text files in the "Data" folder.
*"Crunchbase Snapshot" (Folder) - This folder contains the data we obtained from Crunchbase. There is a massive amount of data which we will need to sort through to find useful information and hopefully match that data with our current cohort data.
*"Data" (Folder) - This folder contains all of our data on accelerators including cohort information and the html files of each cohort page. I would estimate that it is about 95% clean currently.
*"Data - Copy" (Folder) - This is just a copy of our current "Data" folder.
*"Data_Copy" (Folder) - This is a copy of our original "Data" folder before we did any manual cleaning.
*"Enclosing_Circle" (Folder) - This folder seems to contain some data on VC but I'm not sure how it pertains to the Accelerator project.
*"F6S Accelerator HTMLs" (Folder) - This folder contains the HTML pages of all the pages on the F6S website. We used it to add more potential accelerators to our list.
*"Google_SiteSearch" (Folder) - This folder contains Python code for Google searches.
*"Industry_Classifier" (Folder) - This folder seems to contain Python code but I'm not sure what for.
*"Matcher" (Folder) - This folder contains the Matcher.
*"Python WebCrawler" (Folder) - This folder contains code that is a work in progress for pulling descriptions from accelerator websites. It is Jeemin's project.
*"Cleaned Cohort Data Copy" (Excel File) - This file contains a copy of our cleaned cohort data.
*"Cleaned Cohort Data" (Excel File) - This file contains the most current, completely cleaned data on cohort company information.
*"NormalizeFixedWidth" (PL File) - This is the normalizer.
*"PortCoNames" (TXT File) - This file contains all of the names of the cohort companies as well as the accelerator they went through.
*"VC Data" (Excel File) - This file contains all of the names of the companies that have ever received VC funding.
*"VC_Data" (TXT File) - This file contains the non-normalized data of all of the VC information.
*"VC_Data_Names" (TXT File) - This file contains all of the names of companies that have received VC funding.
*"VC_Data_Names_Matched_PortCoNames" (Excel File) - This file contains all of the cohort companies that have also received VC funding. Still needs to be sorted through.
 
==Process==
After accumulating the data on accelerators, their cohorts, and their HTML files, we began cleaning the text files, which are located in the "Data" folder within "Accelerators". After the first round of cleaning, we ran a script over the cohort data that put all of the information into an Excel document called "Cleaned Cohort Data". There were still some mistakes in the cohort information, which we fixed within the Excel file itself. As a result, some text files within the "Data" folder no longer match the "Cleaned Cohort Data" file: if we reran the cohort script over the "Data" folder, the output would differ from "Cleaned Cohort Data", which is problematic. The solution (other than manually cleaning the text files again) would be to write a script that reads the "Cleaned Cohort Data" file and uses it to correct the text files in the "Data" folder. We have also matched all of the cohort companies with our list of all companies that have received VC funding.
 
=Current To Do=
 
#Work on the [[Crunchbase 2013 Snapshot]]
#Match cohort companies to VC-backed portfolio companies
#Refine our data to work out which cohort each cohort company was a member of, cohort start dates and locations, etc.
#Make a list of top accelerator lists (e.g., http://tech.co/top-startup-accelerators-ranked-2012-08) and check that we have those accelerators
 
=End of Semester Notes=
 
*We have compiled a very long list of accelerators from many different databases. For the past couple of weeks, everyone in the center has been going through this list, 20 at a time, classifying each one as an accelerator or not an accelerator, and then proceeding to gather data on the accelerator using the process outlined below. This process went very smoothly. We have successfully gone through about 80% of the list. We are still missing information on the last hundred or so names. All of the collected data is located on the RDP, within the "Accelerators" folder under "Data" or on the [https://docs.google.com/spreadsheets/d/1ikuxYwp9JIRrjz4qQcbdwTpbHOne-q2PterYTjzofjw/edit?ts=5aa2f1f9#gid=1132417337 "Accelerator Master Variable List" Google sheet].
*We have listed all of the startups from the accelerators that have break out cohorts on their website on the [https://docs.google.com/spreadsheets/d/1ikuxYwp9JIRrjz4qQcbdwTpbHOne-q2PterYTjzofjw/edit?ts=5aa2f1f9#gid=1132417337 "Accelerator Master Variable List" Google sheet]. This contains the following information in the "Cohort List (new)" sheet: accelerator name, year, cohort name, company name, description, founders, category/sector, and location.
*Next steps include going through the demo day pages that have been downloaded and writing notes on the different types if possible (see [[Demo Day Page Google Classifier]]).
 
=Data Collection Notes=
 
==MATCHING==
 
The files we used for matching are located on the E drive. We used the matcher to match the portfolio company names from the cohort file located in E:\McNair\Projects\Accelerators.
*The files used for matching are located in E:\McNair\Projects\Accelerators\Matcher
*PortCo contains the names of the companies pulled from the cohort file
*AccCo includes the cohort company name along with the name of the accelerator itself
*In the matcher, the inputs are the PortCo names and the VC data from our SDC pull
*The outputs include the AccCo_VC data located in E:\McNair\Projects\Accelerators, which gives a lot of information on the matches, including:
:*name of the match itself
:*number of investments
:*dates that the company received its investments
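
To illustrate the general idea of matching PortCo names against VC company names, here is a simplified stand-in. It is not the Matcher in E:\McNair\Projects\Accelerators\Matcher and does not reproduce its rules; the normalization steps, the suffix list, and the function names are assumptions.

```python
import re

# Assumed list of legal suffixes to ignore when comparing names.
SUFFIXES = {"inc", "llc", "corp", "co", "ltd"}

def normalize(name):
    """Lowercase, strip punctuation, and drop trailing legal suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while words and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)

def match_names(portco_names, vc_names):
    """Map each PortCo name to a VC name with the same normalized form."""
    vc_index = {}
    for vc_name in vc_names:
        vc_index.setdefault(normalize(vc_name), vc_name)
    matches = {}
    for portco in portco_names:
        key = normalize(portco)
        if key and key in vc_index:  # skip names that normalize to nothing
            matches[portco] = vc_index[key]
    return matches

# Hypothetical names, not real data.
matches = match_names(["Acme, Inc.", "Widget Labs"], ["ACME INC", "Other Co"])
```

Exact matching on normalized names is deliberately strict; it misses spelling variants but avoids false positives that would inflate the VC match counts.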
 
==SDC Pull==
 
We accessed SDC Platinum and pulled information on the round-based funding that all registered companies received between 1999 and 2017.
 
The receipt is as follows:
 
 Session Details
 ---------------
 Request  Hits   Request Description
 0        -      DATABASE: Portfolio Companies (VIPC)
 1        96155  Venture Related Deals: Select All Venture Related Deals
 2        79572  Round Date: 1/1/1999 to 3/1/2017 (Custom) (Calendar)
 3               Custom Report: VC Data (Columnar) - Save As: E:\McNair\Projects\Accelerators\VC Data.txt
 Billing Ref # : 2054025
 Capture File  : riceuniv.2054025
 Session Name  :
 
The VC data pull includes the following variables:
 
Company Name Date Company Date Company Company Company City Company Street Address, Line 1 Company Street Address, Line 2 Total Known Company Industry Sub-Group 3 Company Industry Major Group Round Company Stage Level 3 Round Amt, Round Amt,
 
==3 files==
 
For each accelerator in the list, put files in E:\Projects\Accelerators\Data
*AcceleratorName.txt - copy and paste the variables below into a (tab-delimited) txt file and complete
*AcceleratorName.cohort - your cohort text file (see below)
*AcceleratorName.html (possibly automatically with a folder too) - save a copy of the html of the cohort page
 
==.txt Variables==
 
Name
Score
Flag
CohortURL
Address
Duration
Vintage
Industry
Description
Equity
NonProfit
Notes
 
 
Try to get '''Name, Score, Flag, Cohort URL and Address''' for all. ONLY GRAB OTHER VARIABLES IF EASY. Just leave things blank if you can't find them quickly.
 
'''If the score is 0, or the flag is S, I, A, or F just stop''' - don't bother downloading a cohort list, saving an HTML file, etc. If possible, do stick a very brief description of the problem in the notes field.
 
Notes:
*Score: is 0-1 where 0 is definitely not an accelerator, 1 is definitely an accelerator
*Flag: (leave blank if not needed), if multiple then separate by comma
**S for social entrep
**I for incubator
**A for an angel group
**F is for foreign
**C for in coworking space/hub/etc
**V for if part of venture fund
**D is for Dead
*Put just the root URL in Cohort URL if there isn't a Cohort page
*Duration: in wks (months x 4.33 and round)
*Vintage is year of first cohort if possible
*Industry is industry focus but only if clear focus
*Equity is a number (don't put %) or Y/N
*Notes is only there if need it. Particularly try to use this field to note discards.
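
The rules above for Duration, Flag, and the stop condition can be sketched as follows. The function names are hypothetical, and the checks cover only what the notes state: the months x 4.33 conversion, the listed flag letters, and stopping on a score of 0 or a flag of S, I, A, or F.

```python
VALID_FLAGS = {"S", "I", "A", "F", "C", "V", "D"}
STOP_FLAGS = {"S", "I", "A", "F"}

def months_to_weeks(months):
    """Convert a program length in months to whole weeks (months x 4.33)."""
    return round(months * 4.33)

def parse_flags(flag_field):
    """Split a comma-separated Flag field and reject unknown letters."""
    flags = [flag for flag in flag_field.split(",") if flag]
    unknown = [flag for flag in flags if flag not in VALID_FLAGS]
    if unknown:
        raise ValueError(f"unknown flag(s): {unknown}")
    return flags

def should_stop(score, flag_field):
    """True when no cohort list or HTML file needs to be collected."""
    return score == 0 or any(flag in STOP_FLAGS for flag in parse_flags(flag_field))
```

For example, a three-month program is recorded as 13 weeks, and an accelerator flagged "C,V" with a score of 1 still needs its cohort files.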
 
==.cohort files==
 
Your .cohort files must:
*Be tab delimited txt
*Have a header
*The first column must be the portfolio company name
*Grab as many columns as you can easily (and name them)
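
A small checker for the requirements above might look like the following sketch. It assumes the first line is the header and reports rows whose column count differs from it or whose first column (the portfolio company name) is empty; <code>check_cohort</code> is a hypothetical name.

```python
import csv
import io

def check_cohort(handle):
    """Return a list of problems found in an open .cohort file."""
    rows = list(csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    if any(not column.strip() for column in header):
        problems.append("header has an unnamed column")
    for line_number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_number}: expected {len(header)} columns, found {len(row)}"
            )
        elif not row[0].strip():
            problems.append(f"line {line_number}: missing company name")
    return problems

# Hypothetical file contents used to exercise the checker.
sample_cohort = io.StringIO("Company\tYear\nAcme\t2015\n\t2016\nWidget\n")
cohort_problems = check_cohort(sample_cohort)
```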
 
==Standardized format for text files==
 
Information Text file
*1 tab only after each category
*No spaces after commas for flags or industry
*For duration put only a number in weeks but do not write "weeks"
*Equity is either only a number (no percent sign) or a Y/N
 
 
Cohort Text file
*1 tab between each column
*Titles of each column on top
*Make a new category for "Cohort Number" and write either "1 2 3 4 etc."
*Matthew: 1-225 (done) Shrey: 226-550 (done)
 
==Link to Crunchbase API application==
 
https://about.crunchbase.com/forms/research-access-apply/ (Does not work anymore)
 
https://data.crunchbase.com/v3/docs/using-the-api (Has new instructions for application)
 
==Sign-Ups==
 
Ed - 1-10 (done)
Carlin - 11-20 (done)
Carlin - 21-40 (done)
Christy - 41-60 (done)
Avesh - 61-80 (done)
Eliza - 81-100 (done)
Meghana - 101-120 (done)
Peter - 121-140 (done)
Ramee - 141-160 (done)
Will - 161-180 (done)
Matthew - 181-200 (done)
Julia - 201-220 (done)
Peter - 221-240 (done)
Shrey - 241-260 (done)
Matthew - 261-280 (done)
Eliza - 281-300 (done)
Julia - 301-320 (done)
Shrey - 321-340 (done)
Carlin - 341-361 (done)
Julia - 362-380 (done)
Dylan - 381-393 (done)
Jake - 394-404 (done)
Dylan - 405-410 (done)
Avesh - 411-415 (done)
Dylan - 416-423 (done)
Peter - 424-460 (done)
Carlin - 461-480 (done)
Peter - 481-490 (done)
Julia - 491-510 (done)
Peter - 511-515 (done)
Julia - 516-529 (done)
Ben - 530-540 (done)
Shrey - 541-551 (done)
=List of Accelerators=
#10Xelerator
#1440
#1776
#33entrepreneurs
#3DS Princeton University Spring 2014
#500 Startups
#9Hive
#9Mile Labs
#AIA Accelerator
#ARK Challenge
#AT&T Aspire Accelerator
#ATDC Community
#AZ TechCelerator
#AccelFoods
#Acceleprise
#Accelerate Baltimore
#Accelerate Genius
#Accelerate Tectoria Accelerator
#Accelerator Centre
#Advanced Technology Development Center (ATDC)
#Airbus BizLab
#Alchemist Accelerator
#AlphaLab
#Amplify.LA
#Angel Capital
#Angelcube
#Angelpad
#Annual Business BootCamp
#Arizona Center for Innovation
#Arizona Furnace
#Arrowhead Tech Incubator 2016
#Aspire 3 Accelerator 2017
#Atlanta Ventures Accelerator
#AutoXLR8R
#Awesome Inc.
#Axel Springer Plug and Play
#B 4 Change Impact Accelerator
#B2B Acceleration Program
#B4C Social Venture Accelerator
#BBC Worldwide Labs
#BMW Startup Garage
#Brandcelerate
#Bunker Labs New York
#Bank of Ireland Accelerator Programme
#Bantunium Labs Accelerator
#Barclays Accelerator
#Barclays New York Summer 2015
#Berkley Ventures
#Bessemer Business Incubation System
#Beta-i
#Beta.MN
#BetaFactory
#BetaSpring
#Betablox
#Betaspring RevUp (DUPLICATE)
#Bethnal Green Ventures
#BioAccel
#BioInspire
#Bir 2015
#BitAngel Engagement Level
#BitAngels Startup Summer Program of 2013
#Bizdom
#Black Forest Accelerator
#Blue Startups
#Blueprint Health
#Bolt Boston
#Bonnier Accelerator
#BoomStartup
#BoomStartup Winter 2017 (DUPLICATE)
#Boomtown Accelerator
#Boomtown Health Tech (DUPLICATE)
#Boost VC
#BootupLabs
#Brandery
#Brooklyn Beta Summer Camp
#Budweiser Dream Brewery
#Buildit
#BuiltinPGH Companies
#Business Innovation Center
#Business Opportunity Academy 2017
#Business Technology Development Center (BizTech)
#CLT Joules Energy Accelerator 2014
#CWI Ventures
#CWI Ventures Application (DUPLICATE)
#CableLabs Technology Tours 2016
#Capital Factory
#Capital Innovators
#Capital Investment Network (Startups)
#Caroline Plouff
#Catalyst Partners
#Cause Collective : Social Innovation Lab
#Center for Entrepreneurial Innovation
#Chain Reaction Innovations 2017
#Chemical Angel Network
#Chinaccelerator
#Cisco Entrepreneurs in Residence
#Citi Accelerator
#Citrix Startup Accelerator
#Claremont/Upland Makerspace Fablab
#Climate Ventures 2.0 Accelerator
#Co.Lab accelerator
#Code for America Accelerator
#Cohab's Traxtion Point
#Collision Conference Investors
#Common Bond
#Communitech Hyperdrive
#Conquer Accelerator
#Coolhouse Labs
#CuriousMinds Incubator / Accelerator
#CyberTECH San Diego
#DBS Accelerator
#DPD Last Mile labs
#DV X Labs
#Dat Ventures
#Decatur-Morgan County Entrepreneurial Center
#Deep Space Ventures
#Demo Accelerator 2016- 2017
#DeveloperTown
#Difference Engine
#Digital Malaysia Corporate Accelerator Program
#Digital Media Zone Incubator/Accelerator
#Disney Accelerator
#DogFish Accelerator
#Domi Station
#Dotforge accelerator
#Dream Funded
#DreamIT Health
#DreamStart - Free Mentoring Program
#Dreamit Ventures (DUPLICATE)
#Ducky Diggy Lloyd
#E-Capital Summit
#EC Mentor Skills Inventory
#EIGERlab
#ETRAC
#EY Startup Challenge
#Eco Holding
#Eleven Startup Accelerator
#Emerge Xcelerate
#EnterpriseWorks Incubation Program
#Entrepreneur Development Center
#Entrepreneurs Roundtable Accelerator
#Environmental Business Cluster
#Equity Legal
#Excelerate Labs
#Execution Labs
#Exhilarator
#Extreme Startups
#Extreme University
#FOOD-X
#Factory45
#Fargo Startup House 2014-2015
#FastTrack Propero Healthcare
#FbFund
#Female Propeller for High Flyers
#FinTech Innovation Lab
#FinTech Studios 2015
#Fintech Founders Club #2
#First Growth Venture Network
#Fishbowl Labs AOL
#Flagship Enterprise Center
#FlashStarts
#Flashpoint
#Flat6 Labs
#Fledge9
#Flextronics Lab IX
#Food Future Scale-up Accelerator 2017
#Food System 6 (FS6) Accelerator
#FoodForwardX
#Fortify Ventures
#Founder Institute
#FounderFuel
#FoundersPad
#Fownders Accelerator
#French Accelerator 2016
#Fund the Food
#Fuse Corps Host
#GAKKEN Accelerator Program
#Gainesville Technology Enterprise Center
#Game CoLab Incubator Program 2014
#GameFounders
#GammaRebels
#Gazelle Lab
#Gener8tor
#German Accelerator Life Sciences
#German Accelerator Tech
#Global Accelerator Network 2015
#Good Works Houston Lab
#GoodCompany Ventures
#Google Launchpad Accelerator
#Grants4Apps Accelerator
#GreenStart
#Greenlite Labs
#GrowLab
#Growth Hacking Accelerator 2015
#Gulf Coast Center for Innovation and Entrepreneurship
#H-Farm Ventures
#HACKT Mission for International Founders
#HAXLR8R
#HCC Entrepreneurship Launchpad
#HIGHLINE Academy
#HUB
#HUBB Accelerator
#HUBB GTLA 2016
#HackFWD
#Hatch
#Health Wildcatters
#Health accelerator
#Healthbox
#Hero City Co-Working Space
#High Street Startups Accelerator
#Highway1
#Honda Xcelerator
#Houston Technology Center
#Hub Ventures
#HugeThing
#I/O ventures
#ICONYC labs
#IDC Elevator
#INcubes Funnel and Accelerator 2014/2015
#INcubes Online Form
#INcubes Startup Visa
#Illumina Accelerator
#Illuminator, New York Accelerator 2015
#Imagine K12
#Immokalee Business Development Center
#Impact Engine
#Impact USA - 2017
#Incubate Miami
#Infuse Accelerator
#Ingenuity Partner Program
#InnoSpring
#Innov&Connect
#Innov8 for Health
#Innova Memphis
#InnovateOC
#Innovation Depot
#Innovation Pavilion
#Innovation Showcase Winter 2017
#Insight Accelerator Labs
#Intel Education Accelerator
#Investment Preparedness Lab
#Invoke Collective
#Iowa Startup Accelerator
#JFDI.Asia
#JFE Accelerator SF
#JLAB
#Jaguar Land Rover Tech Incubator
#Jolt
#JumpSchool
#JumpStart Foundry
#Jumpstart! Boulder
#JusticeXL
#Kairos Boston Spring Program
#Kaplan EdTech
#Kick
#Kick Boise
#Kick LA
#Kick Victoria
#Kicklabs
#Kinetiq Labs
#L-SPARK Accelerator
#LAUNCH incubator
#LAUNCHub
#LI TechCOMETS
#LabFunding Project Accelerator 2014
#Labs Venture Accelerator
#Launch Chapel Hill
#Launch Memphis
#LaunchBox Digital
#LaunchHouse
#LaunchPad PEI
#LaunchSpot
#Launch_Academy
#Launchpad Digital Health, LLC
#Launchpad LA
#Launchpad Long Island
#Le Camping
#Leading Entrepreneurial Accelerator Program
#Lean Launch Ventures
#LearnLaunchX
#Lemnos Labs
#Life Changing Labs
#LiftOff Health Incubator
#Lightbank Start
#LightningLab
#Lowe's Accelerator
#MACH37
#MACH37 Spring
#MIT SA+P venture accelerator
#MITA Institute Accelerator
#MTGx MediaFactory
#Mac6
#Madworks Governance Accelerator
#Maine Center for Entrepreneurial Development - Top Gun Program
#Matter
#Maven Ventures Fund & Incubator
#Media Camp
#Melbourne Accelerator Program
#Memphis BioWorks
#Merck Accelerator
#MergeLane 2017 Accelerator
#Mergelane
#Metavallon
#Microsoft Accelerator
#MindTheBridge
#Momentum
#MuckerLab
#Muru-D
#My5ive Accelerator 2016
#N-Motion (DUPLICATE)
#NDRC (LaunchPad / VentureLab)
#NEXT Dashboard
#NMotion
#NY Digital Health Accelerator
#NY Fashion Tech Lab 2017
#NYC ACRE
#NYC SeedStart
#Nalukai
#Nashville Entrepreneur Center
#Nebula Shift
#Nephoscale IaaS
#Nest New York
#New Ventures Group
#New York Digital Health Accelerator (DUPLICATE)
#NewME Accelerator PopUps
#NewMe
#Next media accelerator
#NextHIT
#NextStart
#Nike+ Accelerator
#Northern Arizona Center for Entrepreneurship and Technology (NACET)
#Northern England
#Nxtp.labs
#OCTANe Launch Pad
#OMNIVERSIS, LLC
#Oasis 500
#Open Education Challenge
#OpenFund
#Orange Fab
#Orange Works
#Orion Startups
#Oxygen Accelerator
#PIE
#Patriot Boot Camp
#Pearson Catalyst for Education
#Pipeline H2O
#Pitney Bowes Inc
#Plarium Labs
#Plug In South LA
#Plug and Play
#Plum Alley Investments 2016
#Points of Light Accelerator
#Portland Incubator Experiment (PIE)
#Portland Seed Fund
#PowerHaus
#Preccelerator® Program 2016
#ProSiebenSat.1 Accelerator
#Project Entrepreneur 2016/17
#Project Healtchare
#Project Lift
#Project Music
#Project Skyway
#Propeller Venture Accelerator
#Prosper Capital Accelerator
#Proton Enterprises
#Pushstart Accelerator
#Qualcomm Robotics Accelerator
#Queen Creek Business Incubator
#R/GA Accelerator
#R/GA Marketing Tech Venture Studio
#RAIN Incubator/Accelerator
#RJI Investment Group
#Rackspace Startup $24k Program
#Reach
#RetailXelerator
#Rock Health
#Rocket Fuel Labs Application
#Rockstart Accelerator
#RunUp Labs
#Runway
#Runway IoT Accelerator 2015
#SAP Startup Focus Program
#SKTA Innopartners Innovation Accelerator
#SPACELAB Tech Accelerator
#SPARK
#SPARK Holyoke
#SPH Plug and Play
#SURF Incubator
#SaltMines Group Start-Up Studio
#ScaleTown
#Seamless IoT 2016
#Searchcamp
#Seed Hatchery
#SeedSpot
#SeedStartup
#SeedSumo
#Seedcamp
#Seedrocket
#Seeqnce
#Sequoia Apps
#Serval Ventures
#Shenzhen Valley Ventures Incubator
#Shoals Entrepreneurial Center
#Shopper Futures Accelerator
#Shotput Ventures
#Sid Martin Biotechnology Institute
#SigmaLabs Accelerator
#Silicon Valley Incubator & Accelerator
#SixThirty
#SixThirty CYBER Spring 2017
#SixThirty Spring 2017
#Sixers Innovation Lab
#Skywalker Accelerator
#SmartHealth Activator
#Smashd Labs
#SoCo Nexus Accelerator Spring 2017
#Social Enterprise Challenge
#Socratic Labs
#SparkLabs
#Sparkgap
#Sports Tank
#Springboard
#Sprint Accelerator
#Sprint Accelerator 2017
#Sprint Mobile Mobile Health Accelerator
#SproutBox
#SproutCamp
#Starburst Aerospace Accelerator
#Start Path Europe
#Start'inPost
#StartEngine
#StartFast Venture Accelerator
#Starta Accelerator Winter 2017
#Startl
#Startmate
#Startup Accelerator (DUPLICATE)
#Startup Front
#Startup Next & GAN
#Startup Orange County Accelerator
#Startup Quest: Virtual Startup Incubator
#Startup Runway Atlanta Spring 2017
#Startup Wise Guys
#Startup Zone PEI
#Startup52X Accelerator
#StartupCity
#StartupHighway
#StartupHouse Foundry program
#StartupMinds Accelerator
#StartupMonthly
#StartupNLA Catalyst Startup
#StartupReykjavik
#StartupYard
#Startupbootcamp
#Straight Shot
#Summer@Highland
#Surge
#SynBio axlr8r
#TEB Incubation & Acceleration Center
#THRIVE Accelerator III
#THRIVE Open Innovation (DUPLICATE)
#TIM#WCAP Accelerator
#TLabs
#TMCx Accelerator Digital Health 2017
#Tallwave
#Tampa Bay Innovation Center
#Tampa Bay Wave
#Tandem
#Tandem Mobile Accelerator
#Target India Accelerator
#Tech Nexus
#Tech Wildcatters
#Tech Wildcatters Gauntlet
#Tech2020
#TechLaunch
#TechRanch
#TechSquareLabs
#Techstars
#Techstars Music
#Techstars Music Accelerator 17
#Telenet Idealabs
#Telluride Venture Accelerator
#Telluride Venture Accelerator 2017
#TenX
#The ARK Challenge
#The Alchemist Accelerator (DUPLICATE)
#The Ark
#The Bakery
#The Batchery
#The Brandery
#The Bridge
#The Center For Technology Enterprise & Development
#The Chaser
#The Company Lab (CO.LAB)
#The Draper FinTech Connection
#The Factory
#The Greatest Pitch
#The Harbor Accelerator
#The Incubator
#The Iron Yard
#The Mediapreneur Incubator
#The Morpheus
#The New York Venture Summit
#The Next Step: from idea to startup
#The Pool Co Working Space
#The Refiners Application
#The Refinery
#The Unilever Foundry, Pilot
#The Venture Center's Pre-Accelerator I
#The Vine OC
#The Vogt Awards
#The Yield Lab
#The eFactory Accelerator
#The eFactory Accelerator Spring 2017
#Think Big Partners 2013 Application Accelerator
#Think Big Partners Accelerator
#TiE Angels
#Tigerlabs Digital Health Accelerator
#Tigerlabs Health
#Tolstoy Summer Camp
#TopSeedsLab
#Travel Startups Incubator
#Travelport Labs Accelerator
#Travelport Labs Incubator
#Triangle Startup Factory
#Tumml
#Tune Labs
#Twin Cities Accelerator 2016
#UCIS B2B Matchmaking
#US Startups 2017
#UW-Whitewater Launch Pad Accelerator
#Umbono
#Unbank.ventures FinTech Incubator
#University Technology Park
#Unreasonable Institute
#UpTech
#Upstart Accelerator
#Upstart Accelerator 2017
#Upstart Labs
#Upstart Memphis
#Uptima Business Bootcamp
#Upwest Labs
#VANTEC
#VC FinTech Accelerator
#VSL FinTech Rolling Admission
#Velocity Indiana Accelerator
#Venture Catalyst
#Venture Hive
#Venture I
#VentureOut's Enterprise Tech Expedition
#VentureTech.net
#Venturegeeks
#Vet-Tech Accelerator
#VetTechTrek
#VictorySpark
#Village Capital
#Village Cultivators
#Village Member Discounts
#Village Verified
#Village88 Techlab
#Virtual Incubator & Crowdfunding Network
#Volkswagen ERL Technology Accelerator
#WHLabs
#Wasabi Ventures Academy
#Wayra
#Wellness Accelerator
#Wells Fargo Startup Accelerator
#Wireless IoT
#Women Innovate Mobile
#XLR8HI
#XLerateHealth
#XTRATOS
#Xcelerate
#Xlerate Health
#Y Combinator
#Y&R SparkPlug 2017
#YEurope
#YLE Media Startup Accelerator Program
#Yahoo Ad Tech Program
#Yangler (online accelerator)
#Year of the Startup
#Yetizen Accelerator
#You Is Now
#Z80 Labs
#ZIP Launchpad Admission
#ZeroTo510
#Zone Startups Calgary
#designX 2017
#eMerging Ventures
#ezone
#gener8tor
#i360accelerator
#iAccelerator
#iStart Jax (DUPLICATE)
#iStart Valley
#iVentures10
#ignite100
#innovyz start
#tekMountain Accelerator
==Project Summary==
This project aims to determine which accelerators are most effective at producing successful startups and which characteristics those accelerators share. First, we need to gather as much data as possible about as many accelerators as possible, so that we can examine the factors that differentiate successful from unsuccessful ventures. Next, we need to build a web crawler that gathers information about accelerators worldwide by visiting their websites and extracting the relevant data. The overall goal is to gain insight into the methods of successful accelerators and to identify what differentiates very successful accelerators from dead ones.
=Sources=
Summary: These are sources obtained from [[List of Accelerators]], Crunchbase, and other Google searches. We will evaluate these sources by looking at the number of accelerators they supply (most of them are lists) and at the type of information they provide about each accelerator. Key data points are cohort-related data, startup-related data, and the logistics of the accelerator. Better sources supply more information than the URL alone.
(Obtained from [[List of Accelerators]] and various Google searches)
*http://www.represent.la/
*http://www.launch.co/blog/complete-list-of-incubators-and-accelerators-like-y-combinat.html
*https://angel.co/accelerator-4 (Does not work; seems to be replaced by https://angel.co/companies?company_types[]=Incubator )
(Obtained from Google search: "Accelerator Database")
*Type in a specific state + "accelerator" + "list" (e.g. Texas accelerator list) to search for more relevant lists
:*Once again, looked at roughly the first 20 results
*Crunchbase has its own webpage with instructions for how we retrieve the data
=Source Evaluations=
Summary: Each evaluation corresponds to one of the sources above. The evaluations provide instructions for obtaining the information listed, as well as a general review of how useful the data seems. The review serves to determine whether a crawler would be suitable for obtaining information from the source automatically.
 
==SOURCE: Crunchbase==
*All of the Crunchbase documentation is located on the [[Crunchbase 2013 Snapshot]] page, along with the documentation for how we determined the accelerator information.
==Source: http://www.acceleratorinfo.com/see-all.html==
==Source: http://www.seed-db.com/accelerators/all==
#Copied "Seed Accelerators" table to TextPad, data sorted itself into lines. Returned 235 results.
#Clicking on the accelerator name itself links to a page with all of its associated startups, up until 6/2016 cohort
*Overall, the data is very extensive for the accelerators included on the list, but cross-referencing with other sources shows that seed-db lacks many newer accelerators; the list is not all-inclusive.
*Includes regional distributions for accelerator groups as well. For example, rather than just "Techstars", the group is broken into Austin, Berlin, Boston, Boulder, etc.
 
==Source: http://www.seed-db.com/accelerators==
*Examples of single accelerators found
:#TMCx: http://www.tmc.edu/innovation/innovation-programs/tmcx/
:#RED labs: http://redlabs.uh.edu/8
:#SURGE accelerator: https://kirkcoburn.com/
:#OwlSpark: http://owlspark.com/
:#NextHIT: http://www.houstonhealthventures.com/nexthit-accelerator-program-application/
 
===Los Angeles Accelerators===
:#Amplify: http://amplify.la/
*Many results returned nationally ranked lists of accelerators, such as the Forbes list of "Top Accelerators" or something along the lines of "Best Accelerators in the US". The connection is that perhaps one accelerator mentioned on the list may be located within the searched state.
*There are also a few results for actual particle accelerators that must be filtered out (e.g. the Superconducting Super Collider)
 
==Found through google searching accelerators found previously==
'''Found from googling YLE Media Startup Accelerator'''
*https://www.corporate-accelerators.net/database/index.html (DB of Corporate Accelerators 71-79 entries)
*http://startupaccelerator.vc/accelerator-corporate-innovation-sig/ (Database of Accelerators and Corporate Innovation 92 entries)
Neither of these has had its entries added to the list of accelerators.
=Individual Accelerator Evaluations=
F6S has an API, but we have had no success getting a key to the API. The link to get a key to the API is on [https://www.f6s.com/developers/apis/deal-feed this page].
I (Peter) have emailed F6S to ask for a key directly at support@f6s.com. As of the end of the Fall 2016 Semester, they have not responded.
FUN FACT (MASS-RENAME FILES USING WINDOWS POWER SHELL):
To change file extensions in bulk (here, .txt to .log), Microsoft suggests:
Get-ChildItem *.txt | Rename-Item -NewName { $_.name -Replace '\.txt', '.log'}
 
==Final Data==
The Parser for parsing the text files of accelerator data is located in:
E:\McNair\Projects\Accelerators\Code+Final_Data
 
The Parser for parsing the cohort files of accelerator data is also located in:
E:\McNair\Projects\Accelerators\Code+Final_Data
 
This folder contains the Python parsers. The Final_data folder contains the tab-delimited text files of parsed data: final_accelerator_data.txt holds the general accelerator data parsed from the .txt files, and final_cohort_data.txt holds the cohort data parsed from the .cohort.txt files.
 
All the files entitled accelerator_data are subsets of the final_accelerator_data.txt file, but each file contains only the accelerators that matched to the flag specified in the file title.
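The flag-based subsets described above could be reproduced along the following lines. This is only a minimal sketch, not the project's parser: it assumes the data is tab-delimited with a header row, and the column name <code>has_cohort</code> and the flag value <code>"1"</code> are hypothetical placeholders for whatever flag the file title names.

```python
import csv
import io


def filter_by_flag(lines, flag_column, flag_value="1"):
    """Return (header, rows) for rows whose flag column equals flag_value.

    `lines` is any iterable of tab-delimited text lines with a header row.
    The flag column name and value are assumptions for illustration.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    rows = [row for row in reader if row.get(flag_column) == flag_value]
    return reader.fieldnames, rows


# Made-up sample data; the real file would be opened with open(path, newline="").
sample = io.StringIO("name\thas_cohort\nAlpha Labs\t1\nBeta Hub\t0\n")
header, matched = filter_by_flag(sample, "has_cohort")
print([row["name"] for row in matched])
```

Writing the matched rows back out with <code>csv.DictWriter</code> (same delimiter, same header) would give one subset file per flag.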
 
find_headers.py finds the set of headers used across all the cohort files from the seed list project.
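For illustration, collecting a set of headers across many files might look like the sketch below. It is not the actual find_headers script: it assumes that each cohort file's first line is a tab-delimited header row and that the files match the hypothetical pattern <code>*.cohort.txt</code>; the function names are made up.

```python
from pathlib import Path


def merge_headers(first_lines, delimiter="\t"):
    """Union of the non-empty field names found in the given header lines."""
    headers = set()
    for line in first_lines:
        for field in line.rstrip("\r\n").split(delimiter):
            if field.strip():
                headers.add(field.strip())
    return headers


def read_first_lines(folder, pattern="*.cohort.txt"):
    """Yield the first line of every file in `folder` matching `pattern`."""
    for path in sorted(Path(folder).glob(pattern)):
        with path.open(encoding="utf-8", errors="replace") as handle:
            yield handle.readline()


# Usage (folder path is whatever holds the cohort files):
# all_headers = merge_headers(read_first_lines(r"E:\McNair\Projects\Accelerators\Data"))
```

Keeping the merge step separate from the file reading makes it easy to check the header logic on a few hand-written lines before running it over the whole folder.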
 
==Google SiteSearch==
E:\McNair\Projects\Accelerators\Google_SiteSearch
This folder contains code for a google search parser. The script sitesearch.py will search for a queried company and return a likely web address for that company.
 
==Way Back Machine Parser==
E:\McNair\Projects\Accelerators\Code+Final_Data\wayback_machine.py
This script takes URLs and returns a timestamp for the oldest documented webpage under that URL courtesy of the Way Back Machine Archive.
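One way to look up the oldest capture of a URL is the Wayback Machine's CDX search endpoint, which lists captures in ascending timestamp order. The sketch below is an assumption about how such a lookup could be done, not a description of wayback_machine.py; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_query(url):
    """Build a CDX query asking for only the first (oldest) capture of url."""
    params = {"url": url, "output": "json", "limit": 1, "fl": "timestamp,original"}
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_oldest_timestamp(payload):
    """Extract the timestamp from a CDX JSON response, or None if no capture.

    With output=json the first row is a header naming the fields.
    """
    rows = json.loads(payload) if payload.strip() else []
    if len(rows) < 2:
        return None
    header, first = rows[0], rows[1]
    return first[header.index("timestamp")]


def oldest_snapshot(url, timeout=30):
    """Query the archive over the network and return a YYYYMMDDhhmmss string."""
    with urlopen(build_cdx_query(url), timeout=timeout) as response:
        return parse_oldest_timestamp(response.read().decode("utf-8"))
```

Calling <code>oldest_snapshot("owlspark.com")</code> would return a 14-digit timestamp string if the archive holds any capture, and <code>None</code> otherwise.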
 
==Process Locations==
E:\McNair\Projects\Accelerators\Code+Final_Data\process_locations.py
This script takes a physical address and converts it into latitude and longitude coordinates. Should be used in conjunction with the Enclosing Circle program to find the concentration of accelerators.
E:\McNair\Software\CodeBase\EnclosingCircle.py
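Once addresses have been turned into coordinates, a rough measure of concentration is the number of accelerators within a given distance of a point. The sketch below is not the Enclosing Circle program; it only illustrates the distance step with the standard haversine formula, and the function names and sample coordinates are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    # min() guards against rounding pushing h slightly above 1
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def count_within(center, points, radius_km):
    """Count how many points lie within radius_km of center."""
    return sum(1 for point in points if haversine_km(center, point) <= radius_km)


# Illustrative coordinates only (roughly Houston and Los Angeles).
houston = (29.76, -95.37)
points = [(29.72, -95.40), (34.05, -118.24)]
print(count_within(houston, points, 50))
```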
 
=Kauffman Foundation Incubator Proposal Information=
 
==Institutions==
Summary: F6S, Crunchbase, seed-db
 
Tools: Matcher - used to match lists of potential accelerators with our current list to identify duplicates/new matches (E:\McNair\Projects\Accelerators)
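The idea behind matching a list of potential accelerators against the current list can be sketched as follows. This is not the Matcher tool itself; it assumes that names are compared after lowercasing, stripping punctuation, and dropping a leading "the", which is a deliberately simple normalization.

```python
import re


def normalize(name):
    """Lowercase, replace punctuation with spaces, and drop the word 'the'."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(word for word in cleaned.split() if word != "the")


def match_candidates(candidates, known):
    """Split candidates into (duplicates, new) relative to the known list."""
    known_keys = {normalize(name) for name in known}
    duplicates, new = [], []
    for name in candidates:
        if normalize(name) in known_keys:
            duplicates.append(name)
        else:
            new.append(name)
    return duplicates, new


duplicates, new = match_candidates(
    ["The Brandery", "Y-Combinator", "OwlSpark"],
    ["Brandery", "Y Combinator"],
)
print(duplicates, new)
```

Exact matching on normalized names will miss variants such as "Tigerlabs Health" versus "Tigerlabs Digital Health Accelerator", so anything reported as new still needs a manual check.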
 
===F6S===
F6S WebCrawler and F6S Parser - E:\McNair\Projects\Accelerators\F6S Accelerator HTMLs
 
===CrunchBase===
 
CrunchBase 2013 Snapshot '''(All Organizations)'''- E:\McNair\Projects\Accelerators\organizations.xls
 
CrunchBase 2013 Snapshot '''(Potential Accelerators)'''- E:\McNair\Projects\Accelerators\organizations.accdb under "Potential Accelerators query"
 
*Obtained using keyword matches in the descriptions of the potential accelerators.
 
CrunchBase 2013 Snapshot '''(New Verified Accelerators)''' - E:\McNair\Projects\Accelerators\New CrunchBase Accelerators.xls
 
We have the Crunchbase 2013 Snapshot, which provided a lot of new data on accelerators and incubators. We would also like to use the Crunchbase API to get a current database snapshot that we could use to cross-reference companies and add newly formed accelerators and incubators.
 
===AngelList===
 
===seed-db===
 
Obtained through www.seed-db.com/accelerators
 
===Global Accelerator Network (GAN)===
 
GAN Parser- E:\McNair\Projects\Accelerators\Web Scraping for Accelerators\scrapeaccel.py
 
GAN Data- E:\McNair\Projects\Accelerators\Web Scraping for Accelerators\GAN Accelerator Data
*Contains: Company Name, # of Companies Range, % of Companies Funded, Funding Raised by Companies, Employee Range, Exit Funding, Exit Date, Total Company Funding Raised, # of Mentors Range, % Equity, Location, Minimum Seed Capital Investment
 
==Cohorts==
 
*Cohorts obtained manually
*All cohort txt files are saved under "E:\McNair\Projects\Accelerators\Data"
*cohort file name = (accelerator name).cohort
*Most updated Accelerator cohort data: E:\McNair\Projects\Accelerators\Cleaned Cohort Data.xls
 
Open question: can the process of obtaining cohorts be automated?
 
==Other Information==
Summary: Whois Parser, Geocode, Tools to determine industry, etc
 
===Whois Parser===
 
*Retrieves and parses Whois information. Specifically, takes a file with a column of domain names and populates the corresponding columns with information from the WhoIs API.
 
*Often used to obtain locations.
 
===Geocode===
 
Input: Company address

Output: Latitude and longitude coordinates
 
*Used to obtain the locations of different Accelerators and Cohort companies.
 
===SDC Platinum Pull===
 
Used to obtain funding information and match companies that have gotten funding with companies that are Accelerator cohorts.
 
===Desired Information/Variables===
 
*Key People (founders, lead entrepreneurs, strategists, etc.)
*Total number of launched companies
*A FAQ for application details, accelerator vision, and
*Funds raised per company (average)
*Features offered by accelerator (perks, space, tools, etc)
 
==Desired Tools/Information==
 
===Automating the Process of Obtaining Cohorts===
*Automating this process would save a lot of time and move the project forward considerably.
 
===Obtaining More Details on Accelerators===
 
*Having the kind of thorough information on industry, companies, funding, location, exits, mentors, leadership, that we got for the GAN companies would be fantastic.
 
===List of Alive/Dead Accelerators===
 
This is a dream but would be very helpful
