Changes

====Discarding Outliers====
We don't need to discard outliers, per se, just find a layer where outliers are singletons. A wrong approach is to take the highest layer with a single hull (or two hulls or three hulls, etc.). It is fair to say that if a layer never has a hull, then presumably it only has a single location or a line of locations (note that it is possible for a line to have more than 2 locations, both because of multitons and because of perfect alignment, given our Google Maps accuracy), so we can discard it. However, this approach finds the layer where there is just one hull left, rather than the last layer at which the decomposition still has one hull before it first breaks up. Possible options and issues:
*Find the first layer with two hulls and then step back a layer. But there might never be two hulls, so if there is only ever one hull then find the max layer.
*Form a chain from layer 1 on down (somehow) that breaks when there is no longer just one hull. Perhaps count the number of links, grouping by whether the chain has one hull or not, or require that the chain contain level 1... [https://jaxenter.com/10-sql-tricks-that-you-didnt-think-were-possible-125934.html]
The base table for this approach is '''hcllayer'''. The variables are '''highest1hulllayer''', highest2hulllayer, and highest3hulllayer in the '''highesthulllayer''' table.
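The chain idea above can be sketched in a few lines of Python. This is only an illustration, not the query actually run against '''hcllayer''': it assumes the per-layer hull counts have already been pulled into a dictionary keyed by layer number, and the function names <code>naive_highest_single_hull_layer</code> and <code>last_single_hull_layer</code> are hypothetical.

```python
from typing import Mapping, Optional


def naive_highest_single_hull_layer(hulls_by_layer: Mapping[int, int]) -> Optional[int]:
    """The 'wrong' approach: the highest layer that has exactly one hull."""
    single = [layer for layer, hulls in hulls_by_layer.items() if hulls == 1]
    return max(single) if single else None


def last_single_hull_layer(hulls_by_layer: Mapping[int, int]) -> Optional[int]:
    """Walk a chain from the lowest-numbered layer (assumed to be layer 1)
    and stop as soon as a layer does not have exactly one hull.

    Returns the last layer in that unbroken chain, or None if the first
    layer itself does not have exactly one hull.
    """
    last = None
    for layer in sorted(hulls_by_layer):
        if hulls_by_layer[layer] != 1:
            break  # the chain is broken; later single-hull layers do not count
        last = layer
    return last


# Hypothetical hull counts per layer for one city-year.
example = {1: 1, 2: 1, 3: 2, 4: 3, 5: 1}
naive = naive_highest_single_hull_layer(example)
chained = last_single_hull_layer(example)
```

In this made-up example the naive approach picks layer 5, where only one hull is left, while the chain stops at layer 2. If there are never two hulls, the chain simply runs to the max layer, which matches the fallback in the first option.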
A cubic was a mediocre fit to this data, giving an R<sup>2</sup> of 83% but with lots of deviation concentrated right around the local minimum ({-0.0224722, {x -> 0.446655}} [https://www.wolframalpha.com/input/?i=minimum+-2.3595x%5E3+%2B+4.3803x%5E2+-+2.5008x+%2B+0.4309]), point of inflection and local maximum. A quartic had an R<sup>2</sup> of 90%, with an inflection point at around x=0.44 (6.408 x^4 - 15.176 x^3 + 12.592 x^2 - 4.3046 x + 0.517≈0.00825284 at x≈0.440275). I tried a quintic and it had inflection points at x=0.33, 0.55, and 0.82, as well as local maxima at 0.39 and 0.90. Visually there seems to be something going on in the 20% to 40% uncovered range too, perhaps a bifurcation of results, which might be due to rounding issues.
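The cubic's local minimum, inflection point and local maximum can be checked without Wolfram Alpha. The sketch below uses only the coefficients quoted in the link above; the helper name <code>cubic_features</code> is hypothetical, and the closed-form approach (quadratic formula on the first derivative) is just one convenient way to do it.

```python
import math

# Coefficients of the fitted cubic a*x^3 + b*x^2 + c*x + d, as quoted above.
A, B, C, D = -2.3595, 4.3803, -2.5008, 0.4309


def cubic(x: float) -> float:
    """Evaluate the fitted cubic using Horner's rule."""
    return ((A * x + B) * x + C) * x + D


def cubic_features(a: float, b: float, c: float) -> dict:
    """Locate the stationary points and the inflection point of a cubic.

    f'(x)  = 3a x^2 + 2b x + c   (roots are the stationary points)
    f''(x) = 6a x + 2b           (its sign classifies them; its root is the inflection)
    """
    features = {"inflection": -b / (3 * a)}
    disc = (2 * b) ** 2 - 4 * (3 * a) * c
    if disc > 0:
        s = math.sqrt(disc)
        for x in ((-2 * b - s) / (6 * a), (-2 * b + s) / (6 * a)):
            kind = "local_min" if 6 * a * x + 2 * b > 0 else "local_max"
            features[kind] = x
    return features


features = cubic_features(A, B, C)
min_value = cubic(features["local_min"])
```

Here <code>features["local_min"]</code> comes out near x=0.4467 with <code>min_value</code> near -0.0225, agreeing with the figures quoted above.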
 
====Reasonable Exclusions====
 
We started by including all U.S. cities that received at least $10m of growth venture capital in a year between 1980 and 2017 inclusive. This gave us a list of 200 cities. However, we still have a lot of city-years with a low number of startups. What is a reasonable number of startups for analyzing agglomeration? Three locations (which is at least three startups) is the bare minimum required for one hull without excluding outliers.
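As a rough illustration of the three-location minimum, the check below tests whether a set of startup locations can form a hull at all: it needs at least three distinct locations that are not all on one line. This is a sketch under stated assumptions, not the project's code. The function name <code>can_form_hull</code> is hypothetical, coordinates are treated as planar (longitude, latitude) pairs, and the tolerance is an arbitrary choice.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def can_form_hull(locations: Iterable[Point], tol: float = 1e-12) -> bool:
    """Return True if the locations span an area, i.e. a hull can be drawn.

    Multitons (several startups at one location) are collapsed first, and
    perfectly aligned locations are rejected because they only form a line.
    """
    distinct = sorted(set(locations))
    if len(distinct) < 3:
        return False
    (ax, ay), (bx, by) = distinct[0], distinct[1]
    for cx, cy in distinct[2:]:
        # Cross product of (b - a) and (c - a); zero means a, b, c are collinear.
        cross = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
        if abs(cross) > tol:
            return True
    return False


triangle_ok = can_form_hull([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
line_ok = can_form_hull([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)])
multiton_ok = can_form_hull([(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (1.0, 0.0)])
```

Only the first call qualifies: the second set is a line of three locations and the third has four startups but just two distinct locations.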
===Image Analysis===
