Changes

Jump to navigation Jump to search
1,225 bytes added ,  13:41, 21 September 2020
no edit summary
{{Project
|Has project output=Data
|Has title=Restoring vcdb3
|Has owner=Ed Egan,
==Fix the Geocoding==
The longitudes and latitudes were stored as numeric, rather than real, but this doesn't seem to the source of the problem as it has suitable precision[https://www.postgresql.org/docs/current/datatype-numeric.html]. Instead the problem seems to be in the loading and unloading of tables through the \COPY command.  Regardless, we have decimal degrees to six decimal places from Google Maps, which is equivalent to a staggering 11cm of accuracy at the equator. See http://wiki.gis.com/wiki/index.php/Decimal_degrees.
There are some notes on [[VentureXpert_Data#GeoCoding_Companies.2C_Firms.2C_and_Branch_Offices]] and the SQL appears to be in:
E:\mcnair\Projects\VentureXpert Database\vcdb3\LoadingScripts\GeoLoad.sql
===Company Process=== The company process began with a load of oldgeocords 44740 from companybasegeomaster.txt (which came from vcdb2). The problem is that companybasegeomaster has low precision data in it. This was documented in [[VC_Database_Rebuild#Re-Fixing_erroneous_geo-coordinates]]. There's good data, but keyed solely by companyname, in vcdb2:*Geocore 43628 (with nulls), keyed by coname (colevelsimple)*Geoallcoords 44999 (with nulls), keyed by coname, datefirst, statecode*Geoallcoords1 44999 (with nulls), keyed by coname, datefirst, statecode, with exclude flag
The next step in the company geo process was to load remaining from RemainingLatLong.txt. This is high precision data.
 
The fix to the geo data is at the top of:
E:\projects\vcdb3\RevisedDbaseCode.sql
 
Essentially this fix:
*DROPs portcogeo; oldgeocords; geoallcoords; geoallcoords1; goodgeoold; geoallcoords2; geoallcoords3; goodgeonew; geocodesportco;
*Load geoallcoords1fromvcdb2 (which comes from vcdb2!)
*Merge the results of '''geoallcoords1fromvcdb2''' with the resutls of '''remaining'''
*Mark the out-of-bounds exclusions
*Produce '''portcogeo''' 47715
 
===Firm Process===
 
For the firms, there are three sources, all of which are high precision:
*bogeo 2046 from BranchOfficesGeo.txt
*oldfirmcoords 5556 from FirmCoords.txt
*firmremainingcoords 706 from FirmRemainingCoords.txt
 
These produce:
*bogeo 2046
*firmgeocoords 6049
*firmbasecore 15437, which is large because it contains non-US

Navigation menu