Changes

=== <code>ptoassigneend_us_cleaned.postcode_*_cleaned</code> is total garbage ===
This garbage stretches to include
* <code>ptoassigneend_us_cleaned.postcode_f5_cleaned</code>
* <code>ptoassigneend_us_cleaned.postcode_cleaned</code>

==== Only 0.15% of records have their postcodes extracted correctly ====
 <nowiki>patent=# select count(*) from ptoassigneend_us_cleaned;
  count
---------
 3572605
(1 row)

patent=# select count(*) from ptoassigneend_us_cleaned where postcode_cleaned is not null;
  count
---------
 3376480
(1 row)

patent=# select count(*) from ptoassigneend_us_cleaned where postcode_f5_cleaned is not null;
 count
-------
  5344
(1 row)

patent=# select postcode_cleaned, postcode_f5_cleaned from ptoassigneend_us_cleaned where (postcode_cleaned is not null and postcode_f5_cleaned is null) limit 12;
             postcode_cleaned              | postcode_f5_cleaned
-------------------------------------------+---------------------
 £\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082 |
 £\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082 |
 £\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082 |
 £\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082 |
 £\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082 |
 £\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082 |
 £\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082 |
 £\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082 |
 £\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082 |
 £\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082 |
 £\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082 |
 £\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082 |
(12 rows)

patent=# select count(*) from ptoassigneend_us_cleaned where postcode_cleaned = E'£\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082';
  count
---------
 3371136
(1 row)
</nowiki>

==== The underlying data is fine though... ====
 <nowiki>patent=# select postcode, postcode_addr1, postcode_addr2, postcode_city from ptoassigneend_us_cleaned where postcode_cleaned = E'£\u009B\u0084Ê\u0082Ò£\u009B\u0084Ë\u0082' limit 5;
 postcode | postcode_addr1 | postcode_addr2 | postcode_city
----------+----------------+----------------+---------------
 75024    |                |                |
 55379    |                |                |
 94538    |                |                |
 23219    |                |                |
 73114    |                |                |
(5 rows)</nowiki>

Moreover, selecting from the tables that <code>ptoassigneend_us_cleaned</code> is derived from did not yield this string. The string therefore seems to be introduced when the cleaned table is built, so there is likely an error in the SQL script, perhaps from some wonky copy-pasting from the internet.
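As a quick sanity check on the headline figure, the counts reported by the queries above can be plugged into a few lines of Python. This is only arithmetic on the numbers already shown; the variable names are made up for illustration and nothing here touches the database.

```python
# Counts copied from the psql output above; variable names are illustrative only.
total_rows = 3572605        # all rows in ptoassigneend_us_cleaned
cleaned_non_null = 3376480  # rows where postcode_cleaned is not null
f5_non_null = 5344          # rows where postcode_f5_cleaned is not null
garbage_rows = 3371136      # rows where postcode_cleaned equals the garbage string


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


f5_share = pct(f5_non_null, total_rows)
garbage_share = pct(garbage_rows, total_rows)

# Non-null postcode_cleaned rows that are neither the garbage string
# nor matched in number by a non-null postcode_f5_cleaned.
unexplained = cleaned_non_null - garbage_rows - f5_non_null

print(f"non-null postcode_f5_cleaned: {f5_share:.2f}% of all rows")
print(f"garbage postcode_cleaned:     {garbage_share:.2f}% of all rows")
print(f"non-null postcode_cleaned rows left over: {unexplained}")
```

The garbage count and the non-null <code>postcode_f5_cleaned</code> count add up exactly to the non-null <code>postcode_cleaned</code> count. That is consistent with <code>postcode_f5_cleaned</code> only being filled where <code>postcode_cleaned</code> escaped the garbage string, though the counts alone do not prove it.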
== The Ugly ==
