The database is '''patent'''.
 
The table is '''ptoassigneend'''.
SQL code and other things are in:
Z:/PatentAddress
Introduction:
*Five features in the table contain address information: addrline1, addrline2, city, country and postcode.
*The features addrline1, addrline2 and city are not cleaned. Their values can hold city, country and postcode information mixed together, while the city, country and postcode columns may have missing values.
*The objective is to extract city, country and postcode information from the three uncleaned features above.
*For now, we focus only on American patents.
Methods:
*The basic idea for extracting information from addrline1 and addrline2 is to find the post code by matching a specific pattern with a regular expression. The state information always comes ahead of the post code.
A U.S. postcode looks like [five digits - four digits]. Based on this pattern, I created a table named 'ptoassigneend_missus' to store the records containing [five digits - four digits], and then used the method above to extract the useful address information.
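The pattern search described above could be sketched roughly as follows. This is a minimal illustration in Python, not the SQL stored in Z:/PatentAddress. The function name, the regular expression and the sample address strings are assumptions made for the example. It also assumes the state appears as a two-letter abbreviation separated from the post code by spaces or commas, which may not hold for every record.

```python
import re

# Assumed pattern: a two-letter state abbreviation, then spaces or commas,
# then a post code of the form [five digits - four digits].
STATE_THEN_POSTCODE = re.compile(r"\b([A-Z]{2})[\s,]+(\d{5}-\d{4})\b")


def extract_state_postcode(address):
    """Return (state, postcode) found in a free-text address line, or None.

    Hypothetical helper for illustration only; it is not part of the
    project's SQL code.
    """
    if not address:
        return None
    # Upper-case the text so that lower-case state abbreviations also match.
    match = STATE_THEN_POSTCODE.search(address.upper())
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    # Made-up sample values standing in for addrline1/addrline2 contents.
    samples = [
        "1 MAIN STREET, HOUSTON, TX 77005-1827",
        "SUITE 200, PALO ALTO, CA",
    ]
    for line in samples:
        print(line, "->", extract_state_postcode(line))
```

Records for which the function returns a value correspond to the rows that would be stored in 'ptoassigneend_missus'. Records with only a five-digit post code are deliberately not matched here, since the notes describe only the [five digits - four digits] form.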
