This paper is published as:
 
[[Delineating Spatial Agglomerations|Egan, Edward J. and James A. Brander (2022), "New Method for Identifying and Delineating Spatial Agglomerations with Application to Clusters of Venture-Backed Startups.", Journal of Economic Geography, Manuscript: JOEG-2020-449.R2, forthcoming.]]
 
{{AcademicPaper
|Has title=Urban Start-up Agglomeration and Venture Capital Investment
|Has author=Ed Egan,Jim Brander
|Has RAs=Peter Jalbert, Jake Silberman, Christy Warden, Jeemin Sim
|Has paper status=Published
}}
=New Submission=
A revised version of the paper, now co-authored with [[Jim Brander]] and based on the version 3 rebuild, was submitted to the Journal of Economic Geography. This is solely a methods paper, and is titled: '''A New Method for Identifying and Delineating Spatial Agglomerations with Application to Clusters of Venture-Backed Startups'''. The policy application would need to be written up as a separate paper.
==Acceptance==

On July 5th 2022, the paper was accepted to the Journal of Economic Geography:
* Manuscript ID: JOEG-2020-449.R2
* Title: A New Method for Identifying and Delineating Spatial Agglomerations with Application to Clusters of Venture-Backed Startups
* Author(s): Edward J. Egan and James A. Brander.
* Editor: Bill Kerr, HBS: wkerr@hbs.edu
* Abstract: This paper advances a new approach using hierarchical cluster analysis (HCA) for identifying and delineating spatial agglomerations and applies it to venture-backed startups. HCA identifies nested clusters at varying aggregation levels. We describe two methods for selecting a particular aggregation level and the associated agglomerations. The “elbow method” relies entirely on geographic information. Our preferred method, the “regression method”, also uses venture capital investment data and identifies finer agglomerations, often the size of a small neighborhood. We use heat maps to illustrate how agglomerations evolve and indicate how our methods can aid in evaluating agglomeration support policies.
* Permanent link for code/data: https://www.edegan.com/wiki/Delineating_Spatial_Agglomerations

The paper is now in production. I will build a wiki page called [[Delineating_Spatial_Agglomerations]] that structures the documentation of the build process and shares code and some data or artifacts. Currently, that page redirects here.
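The nested clusters that HCA produces can be illustrated with a naive Ward-linkage merge loop (Ward is the linkage named in the technical notes on this page). This is a pure-Python sketch for illustration only, not the code used in the build: the function name <code>ward_hca</code> and the toy coordinates are hypothetical, and it assumes planar coordinates with Euclidean distance.

```python
from itertools import combinations


def ward_hca(points):
    """Naive Ward-linkage HCA on 2-D points (illustrative, O(n^3)).

    Returns a list of layers. Layer i holds the clusters (lists of
    point indices) that exist after i merges, so layer 0 has one
    cluster per point and the last layer has a single cluster.
    """
    # Track each cluster as (member indices, centroid).
    clusters = [([i], (float(x), float(y))) for i, (x, y) in enumerate(points)]
    layers = [[list(m) for m, _ in clusters]]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            (ma, ca), (mb, cb) = clusters[a], clusters[b]
            na, nb = len(ma), len(mb)
            d2 = (ca[0] - cb[0]) ** 2 + (ca[1] - cb[1]) ** 2
            # Ward criterion: increase in within-cluster sum of squares.
            cost = na * nb / (na + nb) * d2
            if best is None or cost < best[0]:
                best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        centroid = (
            (na * ca[0] + nb * cb[0]) / (na + nb),
            (na * ca[1] + nb * cb[1]) / (na + nb),
        )
        merged = (sorted(ma + mb), centroid)
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
        layers.append([list(m) for m, _ in clusters])
    return layers


# Hypothetical toy data: two well-separated groups of three points.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_layers = ward_hca(toy_points)
```

Each entry of <code>toy_layers</code> is one aggregation level. Choosing among these levels is the job of a selection rule such as the elbow or regression method.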
== R&R ==
 
Files:
*Pdf: [[File:Egan Brander (2020) - A New Method for Identifying and Delineating Spatial Agglomerations (Submitted to JEG).pdf]]
* In E:\projects\agglomeration
** Last document was Agglomeration Dec 15.docx
** Build is Version 3-6-2-2.
** SQL file is: AgglomerationVcdb4.sql
After some inquiries, we heard from Bill Kerr, the associate editor, that the paper had new reviews on Aug 11th. On Aug 23rd, we received an email titled "Journal of Economic Geography - Decision on Manuscript ID JOEG-2020-449" giving us an R&R. Overall, the R&R is very positive.
* Starting Units. Suggests MSA.
* Explain R2 method better. He didn't say try cluster-level but that might be helpful to him too.
* Change language (back) to microgeographies! (or startup neighborhoods).
* Tighter connection to lit. He gives papers to start.
* Discuss overlap of clusters (a la patent clustering). Check findings in Kerr and Kominers!!!
<pdf>File:JOEG1RndReviews.pdf</pdf>
==Notes for further improvement==

We might want to add some things in/back in. These include technical notes:
*To do the HCA we used the AgglomerativeClustering method from the sklearn.cluster library (version 0.20.1) in python 3.7.1, with Ward linkage and connectivity set to none. This method is documented here: https://scikit-learn.org/stable/modules/clustering.html. I checked some of the early results against an implementation of Ward's method using the agnes function, available through the cluster package, in R. https://www.rdocumentation.org/packages/cluster/versions/2.1.0/topics/agnes
*The data was assembled and processed in a Postgresql (version 10) database using PostGIS (version 2.4). We used World Geodetic System revision 84, known as WGS1984 (see https://en.wikipedia.org/wiki/World_Geodetic_System), as a coordinate system with an ellipsoidal earth, to calculate distances and areas (see https://postgis.net/docs/manual-2.4/using_postgis_dbmanagement.html). Shapefiles for Census Places were retrieved from the U.S. Census TIGER (Topologically Integrated Geographic Encoding and Referencing) database (see https://www.census.gov/programs-surveys/geography.html).
*The statistical analysis was done in STATA/MP version 15.
*All maps were made using QGIS v3.8.3. The base map is from Google Maps. City areas are highlighted using U.S. Census TIGER/Line Shapefiles.

The methodology has other applications:
*Food deserts - one could study the agglomerations of restaurants and other food providers in urban environments.
*Airports, cement factories, banana plantations, police/fire stations, hospitals/drug stores, etc.
*We could think about commercial applications. Perhaps locating plants/facilities that are/aren't in clusters with a view to buying or selling them?

=SSRN version of the paper (uses v2 build)=
There are two 'final' papers based on the version 2 build. The one with Houston narrative as the motivation is available from SSRN: https://papers.ssrn.com/abstract=3537162
===Another round of refinements===
#The elbow method has issues in its current form, so we are going to try using the elbow in the curvature (degree of concavity) instead.
#We might also try using elasticities...
#Rerun the distance calculations -- avghulldisthm and avgdisthm are only computed for layers that we select with some method (like max r2). However, this table hadn't been updated for the elbow method, and perhaps not for some other methods either, so some distances would have been missing (and replaced with zeros in the STATA script).
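One possible reading of "the elbow in the curvature" is to pick the layer where the discrete second difference (a simple measure of concavity) of a per-layer statistic is largest in absolute value. The sketch below follows that reading only. It assumes evenly spaced layers, and the function name <code>elbow_by_curvature</code> and the example series are hypothetical, not taken from the build.

```python
def elbow_by_curvature(values):
    """Return the index of the interior point with the largest
    absolute discrete second difference.

    `values` is a per-layer statistic (for example, some measure of
    within-cluster dispersion) ordered by layer. Layers are assumed
    to be evenly spaced, so the second difference
    values[i-1] - 2*values[i] + values[i+1] stands in for curvature.
    """
    if len(values) < 3:
        raise ValueError("need at least three layers to measure curvature")
    best_index, best_curvature = None, -1.0
    for i in range(1, len(values) - 1):
        curvature = abs(values[i - 1] - 2 * values[i] + values[i + 1])
        if curvature > best_curvature:
            best_index, best_curvature = i, curvature
    return best_index


# Hypothetical series that drops steeply and then flattens out.
example_series = [100, 60, 30, 12, 10, 9, 8.5]
example_elbow = elbow_by_curvature(example_series)
```

On real data the statistic may be noisy, so smoothing the series before taking differences may be needed; that choice is left open here.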
=Old Work Using Circles=
 
See: [[Enclosing Circle Algorithm]]
==Very Old Summary==
