Changes

::*'CITY NAME[,] STATE POSTCODE'
:::The state and postcode always appear together, separated by a space, so state information can also be extracted with the regular expression:
'([,]|[.])\s\w{2,}\s{0,}\w{0,}\s{1,}\d{5}[-]\d{4}'
:::For example,
city | state_city
NEW YORK, NY 10022-3201 | NY
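:::As a rough illustration of the pattern above, here is a minimal Python sketch. The named group <code>state</code> and the function name <code>extract_state_after_punct</code> are assumptions added for this example and are not part of the project's SQL code.

```python
import re

# Pattern from the text, with a named group wrapped around the state part.
# The group "state" is an illustrative addition, not part of the original regex.
PUNCT_STATE_POSTCODE = re.compile(
    r"([,]|[.])\s(?P<state>\w{2,}\s{0,}\w{0,})\s{1,}\d{5}[-]\d{4}"
)

def extract_state_after_punct(city):
    """Return the state found after a comma or dot, or None if no match."""
    match = PUNCT_STATE_POSTCODE.search(city)
    if match is None:
        return None
    return match.group("state").strip()

print(extract_state_after_punct("NEW YORK, NY 10022-3201"))
```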
'(^|\s)\w{2}\s{1}\d{5}[-]\d{4}'
::: For example:
NEW YORK NY 10022-3201
WAUKEGAN IL 60085-2195
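:::The two rows above can be checked with a short Python sketch of the same pattern. The named group <code>state</code> and the function name <code>extract_state_no_comma</code> are hypothetical, introduced only to make the match easy to read.

```python
import re

# Pattern from the text; the named group "state" is an illustrative addition.
BARE_STATE_POSTCODE = re.compile(r"(^|\s)(?P<state>\w{2})\s{1}\d{5}[-]\d{4}")

def extract_state_no_comma(city):
    """Return the two-character state that precedes a ZIP+4 code, or None."""
    match = BARE_STATE_POSTCODE.search(city)
    return match.group("state") if match else None

for value in ["NEW YORK NY 10022-3201", "WAUKEGAN IL 60085-2195"]:
    print(value, "|", extract_state_no_comma(value))
```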
::* 'D.C.' - the state abbreviation is written with dots between the letters.
'D[.]C[.]\s\d{5}-\d{4}'
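:::A small Python sketch of how this pattern behaves. The function name <code>is_dc_address</code> and the sample address with "D.C." are made up for illustration and do not come from the data.

```python
import re

# Pattern from the text: "D.C." followed by whitespace and a ZIP+4 code.
DC_POSTCODE = re.compile(r"D[.]C[.]\s\d{5}-\d{4}")

def is_dc_address(city):
    """Return True when the value contains 'D.C.' followed by a ZIP+4 code."""
    return DC_POSTCODE.search(city) is not None

# Hypothetical sample value, not taken from the table.
print(is_dc_address("WASHINGTON, D.C. 20001-1234"))
```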
BOX 87703CHICAGO, IL 60680-0703
:::SQL code is in:
E:/McNair/Projects/PatentAddress/CityPatterns.sql
postcode | character varying(80) |
:::The ptoassigneend_allus table may be missing some U.S. patents.
::*ptoassigneend_missus_final
:::State and postcode information is extracted from the addrline1, addrline2 and city columns and stored in the ptoassigneend_missus_final table. See sections 3 and 4.
:::This table is a subset of the ptoassigneend_allus table.
Table "public.ptoassigneend_missus_final"
state_addr2 | text |
:::postcode_city is the postcode extracted from 'city'. postcode_addr1 is the postcode extracted from 'addrline1'. postcode_addr2 is the postcode extracted from 'addrline2'.
:::state_city is the state name extracted from 'city'. state_addr1 is the state name extracted from 'addrline1'. state_addr2 is the state name extracted from 'addrline2'.
::*ptoassigneend_city
:::City information is extracted from the addrline1, addrline2 and city columns and stored in the ptoassigneend_city table. See section 5.
:::This table is a subset of the ptoassigneend_allus table.
:'''6. Issues'''
http://stackoverflow.com/questions/578406/what-is-the-ultimate-postal-code-and-zip-regex
::: For example,
"US", "\d{5}([ \-]\d{4})?"
::* Some state and country values don't match.
:::For example:
addrline2 | city | country
