Difference between revisions of "The Impact of Entrepreneurship Hubs on Urban Venture Capital Investment"

 
(48 intermediate revisions by 6 users not shown)
Line 1: Line 1:
{{McNair Projects
+
{{AcademicPaper
|Project Title=Hubs(Academic Paper)
+
|Has title=The Impact of Entrepreneurship Hubs on Urban Venture Capital Investment
|Topic Area=Entrepreneurship Ecosystems
+
|Has author=Ed Egan, Yael Hochberg
|Owner=Todd Rachowin, Ariel Sun
+
|Has RAs=Hira Farooqi,
|Start Term=Spring 2016
+
|Has paper status=Tabled
|Status=Active
 
|Deliverable=Academic Paper
 
|Audience=Academics
 
|Keywords=Hubs, Incubators, Accelerators, Venture, Capital, Angel, Investor, Startups
 
|Primary Billing=AccNBER01
 
 
}}
 
}}
=Abstract=
+
=Hubs Pages=
 +
*This page [[Hubs (Academic Paper)]] contains only the abstract and some useful refs
 +
*The main [[Hubs]] page is the place to go!
 +
*There is also [[Old Completed Work on Hubs]]
 +
*For a high-level overview of the variables for the scorecard go to [[Hubs Scorecard (Academic Paper)]]. This summarizes:
 +
**Current work in progress for building the Hubs scorecard: [[Hubs: Hubs Scorecard]]
 +
**Tracking of work in progress for the scorecard [[Hubs: Hubs Data Building]]
  
The Hubs Research Project is a full-length academic paper analyzing the effectiveness of "hubs", a component of the entrepreneurship ecosystem, in the advancement and growth of entrepreneurial success in a metropolitan area.
 
  
This research will focus primarily on large and mid-sized Metropolitan Statistical Areas (MSAs), as that is where the great majority of venture capital funding is located.
+
==Abstract==
  
A general overview of entrepreneurial ecosystems can be found here: [[Entrepreneurial Ecosystem]].
+
Entrepreneurship hubs have recently emerged as a stable institutional form and as popular and important components of entrepreneurship ecosystems. Hubs are membership-based co-working flex-spaces with specialized services and resources for nascent start-up firms. Examples of hubs include the Capital Factory in Austin, Texas, 1871 in Chicago, Illinois, and 1776 in Philadelphia, Pennsylvania. Each of these hubs has around 50,000sqft of workspace for almost a thousand members working at hundreds of start-ups. Each also includes an accelerator program, has daily events, classes and meetings related to entrepreneurship, and hosts venture capitalists, angel investors, and service firms.
 +
Hubs provide a very high degree of agglomeration. Agglomeration is particularly important in entrepreneurship because it facilitates learning and because failure is frequent. Entrepreneurs can learn from other entrepreneurs as well as from industry professionals, and when a start-up based in a hub fails, the firm’s human resources can be quickly and efficiently absorbed into another venture. We might therefore expect that the introduction of a hub will lead to a greater degree of entrepreneurial activity in a region.
 +
This paper will use a difference-in-differences approach to estimate the effect of the introduction of a hub on seed and early-stage venture capital investment in an area. The empirical methodology of the paper is closely aligned with the methodology in Fehder and Hochberg (2015). The decision of a hub to locate itself in an area is expected to be highly correlated with existing characteristics of the area that are unobserved in the data, which induces a significant endogeneity bias in the model. To rectify this issue, the methodology proceeds in two steps. In the first step, a hazard model is estimated that predicts the probability that a hub will come to an area. In the second step, these predicted probabilities are used to find a match for each treated region by finding the untreated region with the most similar probability of founding an accelerator in that year, when the treated region is on the common support.
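
The difference-in-differences comparison described in the abstract can be illustrated with a minimal two-group, two-period sketch. This is not the paper's estimator: the function name <code>did_estimate</code>, the row layout, and the investment figures are all hypothetical, and a real specification would add controls and fixed effects.

```python
from statistics import mean


def did_estimate(rows):
    """Two-by-two difference-in-differences.

    rows: iterable of (treated, post, outcome) tuples, where `treated` marks
    regions that receive a hub and `post` marks years after the hub opens.
    """
    def cell_mean(treated, post):
        return mean(y for t, p, y in rows if t == treated and p == post)

    treated_change = cell_mean(True, True) - cell_mean(True, False)
    control_change = cell_mean(False, True) - cell_mean(False, False)
    return treated_change - control_change


# Hypothetical seed and early-stage investment per region-year (made-up numbers).
rows = [
    (True, False, 10.0), (True, False, 14.0),   # hub regions, before
    (True, True, 20.0), (True, True, 24.0),     # hub regions, after
    (False, False, 8.0), (False, False, 12.0),  # matched regions, before
    (False, True, 11.0), (False, True, 13.0),   # matched regions, after
]
print(did_estimate(rows))  # 8.0
```

In this toy data the hub regions grow by 10 and the matched regions by 2, so the estimated effect is 8.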
  
 +
==Current Work==
  
=Current Work=
+
===General Overview===
==General Overview==
+
 
Currently there are two major tasks being performed (list to be updated):
+
Currently there are '''3''' major tasks being performed (list to be updated):
#'''Creation of VC data table''': '''UPDATE: Complete''' (see completed work)
+
#'''Creation of VC data table''': '''UPDATE: Complete''' (see completed work section below)
#'''Creation of Hubs Dataset''': '''UPDATE: See current work in progress for updates''' As of Spring 2016, a list of potential Hubs with a set of characteristics was created.  Many of these are not what will be defined as Hubs.  We will be creating a scorecard to help subjectively define Hubs based on certain characteristics. To do so:
+
#'''Creation of Hubs Dataset''': '''UPDATE: See current work in progress for updates''' We will collect key variables for potential Hubs.
##We will determine variables we would like to use for scorecard
 
##Create a process via Mechanical Turk to streamline the updating of the list
 
 
 
#'''Hazard Rate Model''': '''UPDATE: (7/11) Spoke to Xun Tang, econometrics professor in Rice's Economics Department, and now looking for appropriate proportional hazards models with time-varying covariates.''' To estimate our difference-in-differences model, we need to match MSAs. To do so, we will use a hazard rate model to produce the probability that an MSA gets a hub, and then compare MSAs that do and do not have hubs but have similar probabilities.
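
The matching step in the hazard rate task could look roughly like the sketch below, which takes already-predicted probabilities as given and does not estimate the hazard model itself. The function name <code>match_on_hazard</code>, the row fields, matching with replacement, and the min/max definition of common support are all assumptions made for illustration.

```python
def match_on_hazard(rows):
    """Nearest-neighbor match of treated MSAs to untreated MSAs within a year.

    rows: list of dicts with keys "msa", "year", "prob" (predicted probability
    of getting a hub) and "treated" (True if the MSA gets a hub that year).
    Returns {(treated_msa, year): matched_control_msa}.
    """
    matches = {}
    for year in sorted({r["year"] for r in rows}):
        treated = [r for r in rows if r["year"] == year and r["treated"]]
        controls = [r for r in rows if r["year"] == year and not r["treated"]]
        if not controls:
            continue
        low = min(c["prob"] for c in controls)
        high = max(c["prob"] for c in controls)
        for t in treated:
            if not low <= t["prob"] <= high:
                continue  # treated MSA is off the common support; leave unmatched
            best = min(controls, key=lambda c: abs(c["prob"] - t["prob"]))
            matches[(t["msa"], year)] = best["msa"]
    return matches


# Hypothetical MSAs and probabilities.
rows = [
    {"msa": "A", "year": 2012, "prob": 0.30, "treated": True},
    {"msa": "E", "year": 2012, "prob": 0.90, "treated": True},
    {"msa": "B", "year": 2012, "prob": 0.28, "treated": False},
    {"msa": "C", "year": 2012, "prob": 0.10, "treated": False},
    {"msa": "D", "year": 2012, "prob": 0.45, "treated": False},
]
print(match_on_hazard(rows))  # {('A', 2012): 'B'}
```

MSA "E" is dropped because its probability lies outside the range observed among untreated MSAs in that year.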
  
==Work In Progress==
 
===Hubs Data===
 
'''(Week of 7/11)'''
 
 
1) We published the Twitter count task on Mechanical Turk and received results.


2) We have audited the results and updated the task on Amazon.


3) We are creating additional potential tasks on the Amazon site (see [[Hubs: Mechanical turk]])
 
 
4) We are finding more potential hubs from members of the International Business Innovation Association
 
 
 
'''(Week of 7/4)'''
 
 
1) We have created the list and added our comments after each "---". To determine the variables, we have separated the list into two parts: desired variables and previously collected variables, many of which are also desired.


2) We have also created an example of how to write Mechanical Turk tasks for collecting certain variables
 
 
===Variables for Hubs===
 
We will be creating a "Hubs scorecard" to determine how hub-like potential spaces are.  In order to do so, we will evaluate the places based on certain variables.  Previous variables for potential hubs were collected.  Below, we list those as well as other variables we think might be helpful to build out the scorecard.
 
 
Ideally, we would have the following variables (not collected previously):
 
 
*Onsite VC/Angel/Investors (Count or binary)
 
*Onsite Mentors (binary) --- ''Are these the same as advisers?''
 
*"Office hours" with investors or mentors (binary) --- ''note: previously collected included number of events, but did not separate them into categories (e.g. networking events, workshops, etc.).  We view this separation as important''
 
*Onsite temporary workshops (binary or count)  *** '''see mechanical turk'''
 
*Networking Meetups (Binary or count) *** '''see mechanical turk'''
 
*Sponsors and Partners (binary and list) --- ''are these the same?''
 
*Alumni Network (binary)  --- ''not all potential hubs list this, and the fact that some do might indicate its importance''
 
*Num of Companies --- ''to help determine size as getting physical sqfootage is difficult''
 
*Nonprofit (binary)  --- ''helpful in determining goals of potential hubs''
 
*Mission Includes Key Buzzwords (e.g. "ecosystem", "community")  --- ''helps separate simple coworking spaces from hubs''
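
As a rough illustration of how binary variables like these might feed a scorecard, the sketch below tallies the indicators that were answered for one potential hub. The unweighted sum, the field names, and the choice of indicators are assumptions. The page does not specify how the scorecard will weight or combine variables.

```python
# Hypothetical field names for a subset of the binary variables listed above.
INDICATORS = [
    "onsite_investors",
    "onsite_mentors",
    "office_hours",
    "alumni_network",
    "mission_buzzwords",
]


def hub_score(record):
    """Return (score, answered) for one potential hub.

    Only values recorded as 0 or 1 are counted; anything else (for example
    "DNE" or a missing field) is treated as unanswered and skipped.
    """
    answered = [record[name] for name in INDICATORS if record.get(name) in (0, 1)]
    return sum(answered), len(answered)


record = {
    "onsite_investors": 1,
    "onsite_mentors": 1,
    "office_hours": 0,
    "alumni_network": "DNE",
    "mission_buzzwords": 1,
}
print(hub_score(record))  # (3, 4)
```

Reporting the number of answered indicators alongside the score keeps spaces with many "DNE" entries from looking less hub-like than they may be.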
 
 
 
 
'''Group by difficulty level for Turks:'''
 
*Easy
 
**Twitter Activity
 
**Non-profit
 
**Mission Includes Key Buzzwords
 
 
*Moderate
 
**Number of Companies
 
**Sponsors and Partners
 
**Number of Events (including onsite temporary workshops and networking meetups)
 
**Onsite Mentors
 
 
*Hard
 
**“Office hours” with investors or mentors
 
**Onsite VC/Angel/Investors
 
**Temporary Workshops
 
**Networking Meetups
 
 
 
Example of Prior Variables Collected:
 
*Specific Industry -- ''defined as the LinkedIn self-identifier; no categories, just plain text.  We think what we really want is to see if they have a specialty (e.g. healthcare)''
 
*Num of Events --- ''relatively complete inputs, but from March 2016 (see above as well)''
 
*Price for Single Space --- ''defined as price for flexible desk, relatively complete inputs''
 
*Price for Office --- ''no inputs''
 
*Twitter Activity (Multinomial or Count) --- ''High=2/Moderate=1/No=0, no explanations on how to categorize the activity. Also no handles''
 
*Size (sqft) --- ''no records for majority of the companies''
 
*Num Conference Rooms --- ''no records for majority of the companies''
 
*Onsite accelerator (binary) --- ''relatively complete inputs''
 
*Onsite code school (binary) --- ''relatively complete inputs''
 
*Community Membership (binary) --- ''relatively complete inputs''
 
 
===Mechanical Turk===
 
We have created a page documenting our actual Mechanical Turk work for this paper here: [[Hubs: Mechanical Turk]].
 
 
For more general information on Mechanical Turk, go here: [[Mechanical Turk (Tool)]].
 
 
*'''Twitter activity''':

'''UPDATE (7/14)''': Updated turk to reflect our desired formats


'''UPDATE (7/12)''': '''AUDIT RESULTS''': We noticed


'''UPDATE (7/11)''': Uploaded and published on Amazon's Mechanical Turk site.  Given the time cost of either recording the number of tweets in a month or looking up more than 10 tweets, we decided to record the date of the 10th most recent tweet.  Using a sample of ~10 companies, we noticed minimal differences in the data when using 10, 20, or 30 tweets.
 
#Copy the text in the Search Text into a search engine.
 
#Click on the result from twitter.com with the company name. If the link does not appear on the first 3 pages of results, record DNE for both outputs.

#Record the company's Twitter Handle into Twitter Handle

#Record the date (MM/DD/YY) of the 10th most recent tweet for Twitter Activity. If there are fewer than 10 tweets, record DNE.
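
The recording rule in the last step can be expressed as a small function, for example to audit turk results against a list of tweet dates collected by hand. The function name <code>twitter_activity</code> and the sample dates are hypothetical.

```python
from datetime import date


def twitter_activity(tweet_dates, n=10):
    """Return the date of the n-th most recent tweet as MM/DD/YY.

    Returns "DNE" when the account has fewer than n tweets, following the
    procedure above.
    """
    if len(tweet_dates) < n:
        return "DNE"
    nth_most_recent = sorted(tweet_dates, reverse=True)[n - 1]
    return nth_most_recent.strftime("%m/%d/%y")


# Hypothetical account that tweeted once a day from July 1 to July 12, 2016.
dates = [date(2016, 7, day) for day in range(1, 13)]
print(twitter_activity(dates))      # 07/03/16
print(twitter_activity(dates[:5]))  # DNE
```

An older date for the 10th most recent tweet indicates a less active account.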
 
 
 
*'''Nonprofit''': ''UPDATE: written, not published, on Amazon's Mechanical Turk site''
 
 
'''Considerations'''
 
*Difficulties Encountered:
 
*Expected Time to Complete:
 
*Expectation of Results (accuracy of turk, comprehensiveness):
 
*Other Comments:
 
 
'''Procedure'''
 
#Copy the text in the Search Text into a search engine.
 
#Click on the result that is the website of the company. If there does not exist a listing on the first three pages, mark as DNE.
 
#Go to links that describe the company; these are usually labelled 'About', 'Our Story', or 'Mission'.

#Look for the key word 'nonprofit' or 'non-profit'
 
#If 'nonprofit' is identified, mark as 1, otherwise 0.
 
 
 
*'''Key Buzzword'''
 
'''Considerations'''
 
*Difficulties Encountered:
 
*Expected Time to Complete:
 
*Expectation of Results (accuracy of turk, comprehensiveness):
 
*Other Comments: can be combined with nonprofit to the same turk assignment
 
 
'''Procedure'''
 
#Copy the text in the Search Text into a search engine.
 
#Click on the result that is the website of the company. If there does not exist a listing on the first three pages, mark as DNE.
 
#Go to links that describe the company; these are usually labelled 'About', 'Our Story', or 'Mission'.

#Look for the key words 'entrepreneurial ecosystem', 'startup/technology hub', or 'community'

#If any of the key words is identified, mark as 1, otherwise 0.
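
Since the Nonprofit and Key Buzzword tasks share the same marking rule, both could be checked with one helper when auditing results. The sketch below is an assumption-laden illustration: the name <code>flag</code>, the case-insensitive substring test, and the expansion of 'startup/technology hub' into two separate phrases are not specified on this page.

```python
NONPROFIT_TERMS = ("nonprofit", "non-profit")
# Assumption: 'startup/technology hub' is read as two separate phrases.
BUZZWORDS = ("entrepreneurial ecosystem", "startup hub", "technology hub", "community")


def flag(page_text, terms):
    """Return 1 if any term appears in the page text, 0 if none do.

    Returns "DNE" when no company website was found (page_text is None).
    Matching is a case-insensitive substring test.
    """
    if page_text is None:
        return "DNE"
    lowered = page_text.lower()
    return 1 if any(term in lowered for term in terms) else 0


about = "We are a non-profit building Austin's startup community."
print(flag(about, NONPROFIT_TERMS))       # 1
print(flag(about, BUZZWORDS))             # 1
print(flag("Desks for rent.", BUZZWORDS)) # 0
print(flag(None, BUZZWORDS))              # DNE
```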
 
 
 
*'''NUMBER OF EVENTS''':
 
 
'' '''UPDATE (7/13)''': written, not published, on Amazon's Mechanical Turk site''
 
 
'''Considerations'''
 
*Difficulties Encountered: It is hard to separate different types of events (workshops, info sessions, meetups, etc.). Most companies list all events in the same section and do not put event types in the event titles. We have to look into the details of each event to find its type, and even then some event descriptions do not allow us to determine the type easily. So we will count them all as 'events'.
 
*Expected Time to Complete:
 
*Expectation of Results (accuracy of turk, comprehensiveness):
 
*Other Comments:
 
 
'''Procedure'''
 
#Copy the text in the Search Text into a search engine.
 
#Click on the result that is the website of the company. If there does not exist a listing on the first three pages, mark as DNE.
 
#Look for links related to events, such as 'Events' or 'Calendar' on the homepage.
 
#If not found on the homepage, check 'About' and check 'Community'
 
#Count the number of events in July 2016 and record it. If there is no information about events on the website, record DNE.
 
 
Note***: ''Events include meetups, workshops, info sessions, etc. We do not want to count them separately since it is difficult to do so. Most companies list all events in the same section and do not put event types in the event titles. We would have to look into the details of each event to find its type, and even then some event descriptions do not allow us to determine the type easily. Differentiating event types demands more time and effort and is therefore not suitable for a Mechanical Turk project.''
 
 
 
 
 
 
 
 
 
 
 
*'''Onsite Mentors''': ''UPDATE: written, not published, on Amazon's Mechanical Turk site''
 
'''Considerations'''
 
*Difficulties Encountered: Companies put the information about mentors or mentoring programs in very different places. Some have a specific link or section for mentors/mentoring programs, some put them as a sub-section under 'About' or 'Our Team', and others put them under membership 'benefits' or 'perks'.
 
*Expected Time to Complete: 10 - 40 seconds
 
*Expectation of Results (accuracy of turk, comprehensiveness):
 
*Other Comments: Some companies give more detailed information about mentors or mentoring programs, but some only mention them in one line.  Do we need to treat them differently?
 
 
'''Procedure'''
 
#Copy the text in the Search Text into a search engine.
 
#Click on the result that is the website of the company. If there does not exist a listing on the first three pages, mark as DNE.
 
#Look for links related to mentorship such as 'mentors', 'mentorship' or 'mentoring programs'
 
#If the key words can be identified, mark as 1
 
#If there is no explicit 'mentoring' section, look for links related to a description of the company, such as: 'About,' 'Our Team,' 'Our Mission,' etc., look for a subsection or mention of mentor/mentorship/mentoring
 
#If these exist, mark as 1.
 
#If not, go to links related to membership 'benefits,' 'perks,' or related.
 
#Do same process as end of 4 and 5
 
#If there is no mention of mentorship in these sections, type the company, city, and 'mentoring' into a search engine.  If a link to a reliable website (such as Desktime) appears and mentorship can be found in the description, mark as 1.
 
#If none of these steps result in a mark of 1, mark as 0
 
 
 
 
 
 
*'''Number of Members''': ''UPDATE: written, not published, on Amazon's Mechanical Turk site''
 
 
'''Considerations'''
 
*Difficulties Encountered: Some companies list only selected members rather than all of them. Some companies do not separate current members from alumni and instead make statements such as "we have served more than 120 startups..."
 
*Expected Time to Complete:
 
*Expectation of Results (accuracy of turk, comprehensiveness):
 
*Other Comments:
 
 
'''Procedure'''
 
#Copy the text in the Search Text into a search engine.
 
#Click on the result that is the website of the company. If there does not exist a listing on the first three pages, mark as DNE.
 
#Look for the link 'Members' or 'Residents'; these are usually under the links 'Community', 'Membership', 'Our Space' or 'The Space'.

#Count the number of members

#If the link or section for 'Members' is not found, go to 'Community' and 'Coworking' and look for a description of the number of startups/founders/members in the community. Record the number.
 
#If number of members cannot be identified using above steps, record DNE.
 
  
 
+
==Resources==
*'''Sponsors and Partners''': ''UPDATE: written, not published, on Amazon's Mechanical Turk site''
 
'''Considerations'''
 
*Difficulties Encountered: Some companies display other companies' logos (such as Google and Amazon) on their webpage, usually at the bottom of the homepage, without saying whether they are sponsors or partners.
 
*Expected Time to Complete:
 
*Expectation of Results (accuracy of turk, comprehensiveness):
 
*Other Comments:
 
 
 
'''Procedure'''
 
#Copy the text in the Search Text into a search engine.
 
#Click on the result that is the website of the company. If there does not exist a listing on the first three pages, mark as DNE.
 
#Look for a link to or mention of 'Sponsors', 'Partners', or 'Supporters', which is often under 'About', 'Community', or a related section
 
#If sponsors or partners can be found mark as 1 and list them, otherwise mark as 0.
 
 
 
=Completed Work=
 
==Venture Capital Data General Overview==
 
The main goal of the data set is to aggregate company-, fund-, and round-level data so that it can be analyzed at a combined MSA and year level. The data set comprises two major parts: a granular company/fund/round table and an aggregated CMSA-Year table.  The data includes all United States venture capital transactions (moneytree) from 1990 through 2015.
 
 
 
The Hubs data set, from SDC Platinum, has been constructed in the server:
 
Data files are in 128.42.44.181/bulk/Hubs
 
All files are in 128.42.44.182/bulk/Projects/Hubs
 
psql Hubs2
 
 
 
See the server for the code and ~1st 5 rows of each table
 
 
 
===Procedure - Granular Table===
 
#Start with separate raw datasets for Companies, Funds, and Rounds
 
#Add Data to Each Individual dataset (e.g. add MSA code)
 
#Clean and standardize names (e.g. company or fund name) for each dataset
 
#Join the Datasets (here we need to exclude undisclosed companies)
 
 
 
===Procedure - CMSA-Year Table===
 
#Create a consistent CMSA-Year table to be used later
 
#Using the tables from the granular table, parse out the right data
 
#Join the parsed out data with the CMSA-Year Table
 
#Join these Tables
 
 
 
==VC Specific Tables and Procedure==
 
===Raw data tables===
 
#'''Funds''': fund name, first investment date, last investment date, fund closing date, address, known investment, average investment, number of companies invested, MSA, MSA code.
 
#'''Rounds''': round date, company name, state, round number, stage 1, stage 2, stage 3
 
#'''Combined Rounds''': company name, round date, disclosed amount, investor
 
#'''Companies''': company name, first investment, last investment, MSA, MSA code, address, state, date founded, known funding, industry
 
#'''MSA List''': MSA, MSA code, CMSA, CMSA code
 
#'''Industry List''': maps 6 industry categories to 4: ICT, Life Sciences, Semiconductors, Other
 
 
 
 
 
===Granular Table (Fund-Round-Company)===
 
The final table here contains all venture capital transactions by disclosed funds and portfolio companies, together with their CMSAs.
 
To get the table, we processed the raw data sets in the following steps:
 
#Clean '''Company''' data
 
##Import raw data companies
 
##Add variable 'CMSA' from data set MSA list, update variable 'industry' by joining data set industry list
 
##Remove duplicates and remove undisclosed companies
 
#Clean '''Fund''' data
 
##Import raw data funds
 
##Add variable 'CMSA'
 
##Remove duplicates and remove undisclosed funds
 
##Match fund names against themselves using [[The Matcher (Tool)|The Matcher]] to get the standard fund names
 
#Clean '''Round''' data
 
##Import raw data rounds and combined rounds
 
##Add variables 'number of investment', 'estimated investment' and 'year'
 
##Remove duplicates and remove undisclosed funds
 
#'''Combine''' '''Companies''' and '''Rounds'''
 
##Combine cleaned companies and rounds data table on company names
 
##Add variable 'round number' and 'stage'
 
##Remove duplicates
 
#'''Combine''' '''Funds''' and '''rounds-companies'''
 
##Match fund names in the rounds data table with the standard fund names using [[The Matcher (Tool)|The Matcher]] to standardize fund names in the rounds data table
 
##Join standard fund names to rounds-companies table
 
##Join cleaned funds table to rounds-companies table on standard fund names
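
The overall shape of these steps (standardize names, drop undisclosed entities, then join rounds to companies and funds) can be sketched in plain Python. This is only an illustration: the actual work was done in the database and with The Matcher, and the <code>standardize</code> rule, the "undisclosed" name test, the field names, and the sample records below are all hypothetical.

```python
SUFFIXES = {"inc", "llc", "lp", "ltd"}


def standardize(name):
    """Crude stand-in for name matching: lowercase, drop punctuation and suffixes."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(word for word in cleaned.split() if word not in SUFFIXES)


def is_disclosed(name):
    # Assumption: undisclosed entities carry the word "undisclosed" in their name.
    return "undisclosed" not in name.lower()


def build_granular(companies, funds, rounds):
    """Join rounds to companies and funds on standardized names."""
    company_by_name = {standardize(c["name"]): c for c in companies if is_disclosed(c["name"])}
    fund_by_name = {standardize(f["name"]): f for f in funds if is_disclosed(f["name"])}
    table = []
    for r in rounds:
        company = company_by_name.get(standardize(r["company"]))
        fund = fund_by_name.get(standardize(r["investor"]))
        if company is None or fund is None:
            continue  # undisclosed or unmatched, so excluded from the granular table
        table.append({
            "company": company["name"], "company_cmsa": company["cmsa"],
            "fund": fund["name"], "fund_cmsa": fund["cmsa"],
            "round_date": r["round_date"], "amount": r["amount"],
        })
    return table


companies = [{"name": "Acme Robotics, Inc.", "cmsa": "Austin"},
             {"name": "Undisclosed Company", "cmsa": "Houston"}]
funds = [{"name": "Lone Star Ventures LP", "cmsa": "Houston"}]
rounds = [
    {"company": "ACME Robotics Inc", "investor": "Lone Star Ventures, L.P.",
     "round_date": "2014-03-01", "amount": 2.5},
    {"company": "Undisclosed Company", "investor": "Lone Star Ventures LP",
     "round_date": "2014-05-01", "amount": 1.0},
]
granular = build_granular(companies, funds, rounds)
print(len(granular), granular[0]["company_cmsa"], granular[0]["fund_cmsa"])  # 1 Austin Houston
```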
 
 
 
 
 
===CMSA-Year Aggregated Table===
 
The final table contains the number of companies and the amount of investment for each CMSA and year, categorized by distance and stage.
 
 
 
We processed data as follows:
 
#Create the '''CMSA-Year''' Table
 
##Create single-variable tables: distinct CMSA, year, stage, founding year of fund, and founding year of company.

##Create the cross-product tables: CMSA-year, CMSA-year-fund year founded, and CMSA-year-company year founded
 
#Draw data from cleaned companies, funds and rounds tables
 
##Create a table with 'CMSA', 'number of companies' and 'year Founded' from cleaned companies table and join it to CMSA -year founded
 
##Create a table with 'Company CMSA', 'round year', 'disclosed amount' from rounds-companies combined table, and add stage binary variables. Join it to CMSA-year-company year founded
 
##Create a table with 'CMSA', 'fund year', 'number of investors' from cleaned funds table and join it to CMSA-year-fund year founded
 
#Create '''near-far''' and stages table
 
##Add fund data to rounds-companies
 
##Create near-far and stages binary variable
 
##Count investment and deals by CMSA and year, categorized by near-far and stages
 
#Combine all tables by CMSA and round-year
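
A compact sketch of the aggregation: build the full CMSA-year cross product first, so that CMSA-years with no deals still appear with zeros, and then count deals and investment into each cell. The function name, the field names, and the definition of "near" as a fund located in the same CMSA as the company are assumptions. This page does not define the near-far rule.

```python
from itertools import product


def cmsa_year_table(granular, cmsas, years):
    """Aggregate granular rows into one cell per (CMSA, year)."""
    table = {
        (cmsa, year): {"deals": 0, "amount": 0.0, "near_deals": 0, "far_deals": 0}
        for cmsa, year in product(cmsas, years)
    }
    for row in granular:
        key = (row["company_cmsa"], row["year"])
        if key not in table:
            continue  # outside the CMSA-year frame
        cell = table[key]
        cell["deals"] += 1
        cell["amount"] += row["amount"]
        # Assumption: a deal is "near" when fund and company share a CMSA.
        if row["fund_cmsa"] == row["company_cmsa"]:
            cell["near_deals"] += 1
        else:
            cell["far_deals"] += 1
    return table


# Hypothetical granular rows.
granular_rows = [
    {"company_cmsa": "Austin", "fund_cmsa": "Austin", "year": 2014, "amount": 2.0},
    {"company_cmsa": "Austin", "fund_cmsa": "Houston", "year": 2014, "amount": 3.0},
]
table = cmsa_year_table(granular_rows, ["Austin", "Houston"], [2014, 2015])
print(table[("Austin", 2014)])
# {'deals': 2, 'amount': 5.0, 'near_deals': 1, 'far_deals': 1}
```

Starting from the cross product rather than from the observed deals is what keeps the panel balanced for the later difference-in-differences step.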
 
 
 
==Supplementary Data Sets==
 
 
 
Supplementary data sets are cleaned and joined back to the CMSA-Year table on CMSA and year:
 
 
 
#Number of STEM graduate students, by university and year (2005 to 2014).

#University R&D spending, by university and year (2004 to 2014).

#Income per capita, by MSA and year (2000 to 2012)

#Wages and salaries, by MSA and year (2000 to 2012)
 
 
 
 
 
The datasets can respectively be found at:
 
E:\McNair\Projects\Hubs\STEM grads for upload v2.xls
 
E:\McNair\Projects\Hubs\NSF spending for upload.xls
 
E:\McNair\Projects\Hubs\Income per capita upload.xls
 
E:\McNair\Projects\Hubs\Wage for upload v2.xls
 
 
 
=Resources=
 
  
 
===Additional Resources===
 
 +
* A general overview of entrepreneurial ecosystems can be found here: [[Entrepreneurial Ecosystem]].
 
* Yael Hochberg and Fehder (2015), located in dropbox
 
 
** Use this paper as a guideline on how to conduct the analysis
 
Line 325: Line 39:
 
*USPTO utility patents by MSA: http://www.uspto.gov/web/offices/ac/ido/oeip/taf/cls_cbsa/allcbsa_gd.htm
 
 
*MSA level trends: http://www.metrotrends.org/data.cf
 
 
 
 
 
<includeonly>
 
[[Category: McNair Projects]]
 
</includeonly><!-- flush -->
 

Latest revision as of 10:56, 18 March 2019

Academic Paper
Title The Impact of Entrepreneurship Hubs on Urban Venture Capital Investment
Author Ed Egan, Yael Hochberg
RAs Hira Farooqi
Status Tabled
© edegan.com, 2016

Hubs Pages


Abstract

Entrepreneurship hubs have recently emerged as a stable institutional form and as popular and important components of entrepreneurship ecosystems. Hubs are membership-based co-working flex-spaces with specialized services and resources for nascent start-up firms. Examples of hubs include the Capital Factory in Austin, Texas, 1871 in Chicago, Illinois, and 1776 in Philadelphia, Pennsylvania. Each of these hubs has around 50,000sqft of workspace for almost a thousand members working at hundreds of start-ups. Each also includes an accelerator program, has daily events, classes and meetings related to entrepreneurship, and hosts venture capitalists, angel investors, and service firms.

Hubs provide a very high degree of agglomeration. Agglomeration is particularly important in entrepreneurship because it facilitates learning and because failure is frequent. Entrepreneurs can learn from other entrepreneurs as well as from industry professionals, and when a start-up based in a hub fails, the firm’s human resources can be quickly and efficiently absorbed into another venture. We might therefore expect that the introduction of a hub will lead to a greater degree of entrepreneurial activity in a region.

This paper will use a difference-in-differences approach to estimate the effect of the introduction of a hub on seed and early-stage venture capital investment in an area. The empirical methodology of the paper is closely aligned with the methodology in Fehder and Hochberg (2015). The decision of a hub to locate itself in an area is expected to be highly correlated with existing characteristics of the area that are unobserved in the data, which induces a significant endogeneity bias in the model. To rectify this issue, the methodology proceeds in two steps. In the first step, a hazard model is estimated that predicts the probability that a hub will come to an area. In the second step, these predicted probabilities are used to find a match for each treated region by finding the untreated region with the most similar probability of founding an accelerator in that year, when the treated region is on the common support.

Current Work

General Overview

Currently there are 3 major tasks being performed (list to be updated):

  1. Creation of VC data table: UPDATE: Complete (see completed work section below)
  2. Creation of Hubs Dataset: UPDATE: See current work in progress for updates We will collect key variables for potential Hubs.
  3. Hazard Rate Model: UPDATE: (7/11) Spoke to Xun Tang, econometrics professor in Rice's Economics Department, and now looking for appropriate proportional hazards models with time-varying covariates. To estimate our difference-in-differences model, we need to match MSAs. To do so, we will use a hazard rate model to produce the probability that an MSA gets a hub, and then compare MSAs that do and do not have hubs but have similar probabilities.


Resources

Additional Resources