Changes

A visual inspection suggests that Stamford and Norwalk might be better combined, although the choice makes little difference. Minneapolis and St. Paul are fairly separate, and clearly separate once outliers are removed. Raleigh and Durham are completely separate (Cary is more of an issue), as are Dallas and Fort Worth, and San Francisco and Oakland.
 
=====Encapsulation=====
 
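The data suggests that there are 12 places that are encapsulated by 7 other places, in the sense that the convex hull of the containing place contains the convex hull of the contained place: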
SELECT A.place, A.statecode, B.place AS ContainedPlace, B.statecode AS ContainedStatecode
FROM placetigerarea AS A
JOIN placetigerarea AS B ON st_contains(ST_ConvexHull(A.placegeog::geometry),ST_ConvexHull(B.placegeog::geometry))
WHERE NOT (A.place=B.place AND A.statecode=B.statecode);
--12
 
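{| class="wikitable"
! place !! statecode !! containedplace !! containedstatecode
|-
| Los Angeles || CA || Culver City || CA
|-
| Los Angeles || CA || Torrance || CA
|-
| Los Angeles || CA || El Segundo || CA
|-
| Los Angeles || CA || Santa Monica || CA
|-
| San Jose || CA || Santa Clara || CA
|-
| Fremont || CA || Newark || CA
|-
| Oakland || CA || Emeryville || CA
|-
| Cary || NC || Morrisville || NC
|-
| New York || NY || Jersey City || NJ
|-
| Dallas || TX || Richardson || TX
|-
| Dallas || TX || Addison || TX
|-
| Dallas || TX || Farmers Branch || TX
|}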
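
The query compares convex hulls rather than the places' actual boundaries, so a place can be reported as contained even when it only sits in a notch of a larger, irregularly shaped neighbor. The following is a minimal planar sketch of that effect in plain Python, not PostGIS: the function names and the two toy outlines (<code>city_a</code>, <code>city_b</code>) are hypothetical, and the containment test only approximates <code>ST_Contains</code> by checking that every hull vertex of the inner shape lies inside or on the outer hull.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def hull_contains(outer: List[Point], inner: List[Point]) -> bool:
    """True if every hull vertex of `inner` is inside or on the hull of `outer`."""
    hull_outer = convex_hull(outer)
    hull_inner = convex_hull(inner)
    n = len(hull_outer)
    if n < 3:
        return False  # a degenerate outer hull has no interior
    for p in hull_inner:
        for i in range(n):
            # For a counter-clockwise hull, interior points are left of every edge.
            if cross(hull_outer[i], hull_outer[(i + 1) % n], p) < 0:
                return False
    return True


# Hypothetical L-shaped outline: covers x <= 1 or y <= 1 within a 4 x 4 box.
city_a = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
# Hypothetical small square sitting in the notch of the L, outside city_a itself.
city_b = [(1.5, 1.5), (2, 1.5), (2, 2), (1.5, 2)]

a_contains_b = hull_contains(city_a, city_b)
b_contains_a = hull_contains(city_b, city_a)
print(a_contains_b, b_contains_a)
```

Here <code>city_b</code> lies entirely outside <code>city_a</code>, yet the hull-based test reports it as contained, which is why the flagged pairs are worth inspecting visually.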
 
We could ignore, flag, or discard these cities. A visual inspection suggests that Culver City, Torrance, El Segundo, Jersey City, and probably Richardson, Newark, and maybe Cary don't have any issues. Santa Monica, Santa Clara, Emeryville, Farmers Branch, and Addison do look like they have issues, but with the exception of Farmers Branch and Addison, these are big cities with lots of locations, so the issue should be washed out by removing outliers or by otherwise choosing the clustering layer appropriately.
===First Estimation(s)===
