===Heuristic Layer===
[[File:AgglomerationInflectionScatterPlotAllDataCircles.png|500px|right]] I had previously calculated the heuristic layer by taking the mean fracinhull (i.e., % of startups in economic clusters) for each percentage of the layer index (i.e., for 101 observations) and then fitting a cubic to those means. I did this because Excel can't handle fitting a cubic to the full data (i.e., all 148,556 city-year-layers). However, this approach is incorrect: I should have used the median fracinhull, and even that would have been slightly wrong because of orthogonality issues in calculating mean square distances (I'm also unsure that the mean would be the best measure of central tendency). So I redid the plot using all the data, and calculated the cubic in Stata instead. See: '''inflection.do''' and '''inflection.log'''.
The old result, x≈0.483879, is in [https://www.edegan.com/wiki/Urban_Start-up_Agglomeration_and_Venture_Capital_Investment#Fixing_an_issue Fixing an issue] below. The corrected result is x≈0.487717 (note that R2 has dropped to 92.43%).
One complaint made about the heuristic result is that it is near the middle (i.e., it's 48.7717%, which happens to be near 50%). Although the nature of any HCA on geographic coordinates implies that the result is unlikely to be close to the bounds (0 or 100%) and more likely to be near the middle (50%), it could be in an entirely different place. '''This result (i.e., the heuristic layer at 48.7717%) characterizes the agglomeration of venture-backed startup firms'''. You'd get a very different number if you studied gas stations, supermarkets, airports, or banana plantations!
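
As a rough illustration of the calculation (this is not the Stata code in '''inflection.do''', which isn't reproduced here), the sketch below fits a cubic y = ax³ + bx² + cx + d by ordinary least squares and then locates its inflection point at x = -b/(3a), where the second derivative 6ax + 2b is zero. The data are synthetic and the function names are made up for the example.

```python
def fit_cubic(xs, ys):
    """Ordinary least-squares fit of y = a*x^3 + b*x^2 + c*x + d.

    Solves the 4x4 normal equations by Gaussian elimination with
    partial pivoting and returns the coefficients (a, b, c, d).
    """
    # Power sums: S[k] = sum(x^k) for k = 0..6, T[k] = sum(y * x^k) for k = 0..3.
    S = [sum(x ** k for x in xs) for k in range(7)]
    T = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(4)]
    # Augmented matrix; row i is the normal equation for the x^(3-i) term.
    n = 4
    M = [[S[6 - i - j] for j in range(n)] + [T[3 - i]] for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (M[i][n] - s) / M[i][i]
    return tuple(coef)


def inflection_point(a, b):
    """x at which the cubic's second derivative, 6*a*x + 2*b, equals zero."""
    if a == 0:
        raise ValueError("not a cubic: the x^3 coefficient is zero")
    return -b / (3 * a)


# Synthetic stand-in for (layer index, fracinhull) observations.
# y = 3x^2 - 2x^3 has its inflection at x = 0.5 by construction.
xs = [i / 100 for i in range(101)]
ys = [3 * x ** 2 - 2 * x ** 3 for x in xs]

a, b, c, d = fit_cubic(xs, ys)
print(f"inflection at x = {inflection_point(a, b):.4f}")
```

The same two steps apply whether the cubic is fitted to 101 binned values or to every city-year-layer; only the input lists change, which is why the two approaches can give slightly different inflection points.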
 
====Comparing the Heuristic and R2 Layers====
{{Colored box|title=The Case for the Heuristic Method|content=The heuristic method (i.e., using the inflection in the plot from the population of city-year-layers) finds pretty much the same layer as the R2 method with almost no work, and it can be used in a within-city analysis without having to hold hull count constant.}}
numstartups | 15 38.74743 83.6814 3797 6 1317 7 83
----------------------------------------------------------------------------------------------
 
Analyzing layers:
{| class="wikitable"
|- style="font-weight:bold;"
! Method
! Avg. layer index
! Std. dev. layer index
|-
| Max R2
| 0.392473192
| 0.2380288695
|-
| Heuristic
| 0.43423652
| 0.0495630531
|}
 
'''The Max R2 and Heuristic layers are identical in 12.6% of cases!''' Some of these cases are found in city-years with a large number of layers. For instance, there are 90 city-years that have more than 20 startups and identical heuristic and max R2 layers. The table below shows city-years with 50 or more startups and identical heuristic and max R2 layers:
 
{| class="wikitable"
|- style="font-weight:bold;"
! place
! statecode
! year
! numstartups
! chosenhulllayer
! heurflhlayer
|-
| San Francisco
| CA
| 2009
| 503
| 175
| 175
|-
| Los Angeles
| CA
| 2012
| 213
| 93
| 93
|-
| Redwood City
| CA
| 2012
| 151
| 49
| 49
|-
| Redwood City
| CA
| 2013
| 151
| 49
| 49
|-
| Seattle
| WA
| 2000
| 113
| 48
| 48
|-
| Houston
| TX
| 2007
| 92
| 40
| 40
|-
| Waltham
| MA
| 2012
| 73
| 24
| 24
|-
| Pittsburgh
| PA
| 2008
| 70
| 25
| 25
|-
| Bellevue
| WA
| 2001
| 64
| 25
| 25
|-
| Bellevue
| WA
| 2003
| 61
| 23
| 23
|-
| Pleasanton
| CA
| 2004
| 54
| 20
| 20
|-
| Menlo Park
| CA
| 2004
| 52
| 22
| 22
|-
| Durham
| NC
| 2009
| 50
| 22
| 22
|}
 
In fact, among city-years that have both a heuristic and a max R2 layer, 84% have the two layers separated by 5 or fewer layers, and 59% by 2 or fewer! '''More than a third (36.3%) of city-years have their heuristic and max R2 layers separated by at most 1 layer.'''
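
The separation shares above come down to counting city-years whose two layers differ by at most some threshold. A minimal sketch of that count follows; the function name and the sample pairs are made up for illustration and are not the real data.

```python
def share_within(pairs, k):
    """Fraction of (max R2 layer, heuristic layer) pairs that differ by at most k layers."""
    if not pairs:
        raise ValueError("no city-years to compare")
    close = sum(1 for max_r2, heur in pairs if abs(max_r2 - heur) <= k)
    return close / len(pairs)


# Made-up (chosenhulllayer, heurflhlayer) pairs, one per city-year.
pairs = [(175, 175), (93, 93), (10, 11), (7, 9), (12, 18), (4, 4), (30, 25), (8, 20)]

for k in (0, 1, 2, 5):
    print(f"separated by {k} or fewer layers: {share_within(pairs, k):.1%}")
```

Setting k = 0 gives the share of identical layers, so one function covers both the 12.6% figure and the cumulative shares, provided it is run only on city-years that have both layers defined.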
===Another list of items===
