
[https://www2.census.gov/geo/tiger/TIGER2020/CBSA/tl_2020_us_cbsa.zip Shapefiles from the 2020 U.S. Census TIGER/Line data series] provide the boundaries and names of the MSAs, and a Python script (Geocode.py), used in conjunction with a [https://developers.google.com/maps/documentation/distance-matrix Google Maps API], provides the longitude and latitude of each startup. We limit the coordinates in Google’s results to four decimal places, which corresponds to [http://wiki.gis.com/wiki/index.php/Decimal_degrees approximately 10 m of precision].
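The four-decimal-place limit can be illustrated with a minimal sketch. It is not taken from Geocode.py: the function names are hypothetical, rounding (rather than truncation) is an assumption, and the figure of about 111.32 km per degree of latitude is an approximation.

```python
import math

# Approximate length of one degree of latitude in meters (an assumption;
# the exact value varies slightly with latitude).
METERS_PER_DEGREE_LAT = 111_320.0


def limit_coordinate(lat, lon, places=4):
    """Round a (lat, lon) pair to a fixed number of decimal places.

    Rounding is assumed here; the original script may truncate instead.
    """
    return round(lat, places), round(lon, places)


def decimal_degree_resolution_m(places, latitude_deg=0.0):
    """Ground distance in meters covered by one unit of the last decimal place.

    Returns (north_south, east_west). The east-west figure shrinks with the
    cosine of the latitude because meridians converge toward the poles.
    """
    step = 10 ** -places
    north_south = step * METERS_PER_DEGREE_LAT
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


if __name__ == "__main__":
    print(limit_coordinate(29.7604267, -95.3698028))
    print(decimal_degree_resolution_m(4, latitude_deg=29.76))
```

At four decimal places the north-south step is roughly 11 m everywhere, and the east-west step is somewhat smaller away from the equator, which is consistent with the "approximately 10 m" figure.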
All of our data assembly, and much of our data processing and analysis, is done in a [https://www.postgresql.org/ PostgreSQL] [https://postgis.net/ PostGIS] database. See our [[Research Computing Infrastructure]] page for more information.
However, we rely on [https://www.python.org/ Python] scripts to retrieve address coordinates from Google Maps, to compute the [https://en.wikipedia.org/wiki/Hierarchical_clustering Hierarchical Cluster Analysis (HCA)] itself, and to estimate the cubic that determines the HCA-regression method's agglomeration count for an [https://en.wikipedia.org/wiki/Metropolitan_statistical_area MSA]. We also use two [https://www.stata.com/ Stata] scripts: one to compute the HCA-regressions, and another to estimate the paper's summary statistics and regression specifications. Finally, we use QGIS to construct the map images from queries to our database. These images use a [https://maps.google.com Google Maps] base layer.
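To make the HCA step concrete, the following is a minimal sketch of agglomerative clustering on geocoded points. It is not the project's script: the function names are hypothetical, and both the single-linkage rule and the haversine distance are assumptions chosen for illustration, since the linkage and distance measure are not stated here. The naive pairwise search is only practical for small inputs.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Cluster distance is the minimum pairwise point distance (single linkage,
    an assumption). Returns a list of clusters, each a sorted list of indices
    into `points`.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index of first cluster, index of second)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(haversine_km(points[p], points[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Hypothetical startup locations, already limited to four decimal places.
    startups = [(29.7604, -95.3698), (29.7610, -95.3690),
                (29.7200, -95.3400), (29.7205, -95.3410)]
    print(agglomerate(startups, n_clusters=2))
```

Running the merge loop down to a single cluster and recording each step gives the full hierarchy, from which a layer with any chosen number of agglomerations can be read off.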
