<onlyinclude>[[Sonia Zhang]] [[Work Logs]] [[Sonia Zhang (Work Log)|(log page)]]</onlyinclude>
[[Category:Work Log]][[Category:Internal]]
===Summer Work 2017===
02/23/2017 - Set up the user page and the work log page. Got an overview of the patent data.
03/13/2017 - Applied similar methods to extract address information from Japanese patents. The results are stored in 'ptoassigneend_missjapan'. Matched the post code pattern to the 200 distinct countries that exist in the patent table.
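A minimal sketch of per-country post code matching, assuming one regular expression per country. The pattern table, the function name <code>extract_postcode</code>, and the sample addresses are hypothetical illustrations, not the patterns or queries actually used to build 'ptoassigneend_missjapan'.

```python
import re

# Hypothetical pattern table: three illustrative countries only, not the
# full set of countries in the patent table.
POSTCODE_PATTERNS = {
    "JAPAN": re.compile(r"\b\d{3}-\d{4}\b"),        # e.g. 105-8001
    "SOUTH KOREA": re.compile(r"\b\d{3}-\d{3}\b"),  # older six-digit format
    "GERMANY": re.compile(r"\b\d{5}\b"),            # e.g. 80333
}


def extract_postcode(address, country):
    """Return the first token matching the country's pattern, or None."""
    pattern = POSTCODE_PATTERNS.get(country.strip().upper())
    if pattern is None:
        return None  # no pattern known for this country
    match = pattern.search(address)
    return match.group(0) if match else None


# Made-up sample addresses for illustration.
print(extract_postcode("1-1 Example 1-chome, Minato-ku, Tokyo 105-8001", "Japan"))
print(extract_postcode("Beispielstr. 4, 80333 Muenchen", "Germany"))
```

Keying the patterns by country keeps each expression simple, at the cost of needing a reliable country value before the post code can be extracted.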
Three kinds of information can be extracted from the address columns: city, country and post code (plus state for the U.S.). The extracted post code is quite accurate for almost all countries, and so is the country information (and the state for the U.S.).
03/14/2017 - Focused on the country and post code. Extracted country and post code information from the addrline1 and addrline2 columns for patents from Japan and South Korea. Cleaned the country names (not finished yet). The problem still to be solved is that the post code from some countries follows the same pattern as the street code. For example, the pattern [three digits-three digits] extracts records from South Korea and also some records from Germany.
03/15/2017 - Cleaned the country column.
03/16/2017 - 03/27/2017 - Restructured the addrline1, addrline2 and city features. See [[Patent Data Restructure]].
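The pattern collision noted in the 03/14/2017 entry can be sketched as follows. The sample addresses are made up, and gating the match on the record's country is an assumed workaround shown for illustration, not the method recorded in this log.

```python
import re

# Bare pattern from the log entry: three digits, a hyphen, three digits.
THREE_THREE = re.compile(r"\b\d{3}-\d{3}\b")


def postcode_if_korean(address, country):
    """Accept a three-three match as a post code only for South Korean records.

    The country gate is an assumption made for this sketch.
    """
    match = THREE_THREE.search(address)
    if match and country.strip().upper() == "SOUTH KOREA":
        return match.group(0)
    return None


# Made-up addresses: the German one contains a house-number range that
# happens to look like a three-three post code.
korean = "12 Example-ro, Yeongtong-gu, Suwon 443-742"
german = "Musterstrasse 120-122, 80333 Muenchen"

# Both addresses match the bare pattern, which is the collision.
print(THREE_THREE.search(korean).group(0))
print(THREE_THREE.search(german).group(0))

# With the country gate, only the South Korean record yields a post code.
print(postcode_if_korean(korean, "South Korea"))
print(postcode_if_korean(german, "Germany"))
```

The gate only helps once the country column is clean, so it depends on the country cleaning being done first.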
The problem is that the extracted city information is not very accurate: it gets confused with street names. One approach to increase the accuracy is to list all possible cities in each country and then match the address columns against these cities, which is time-consuming.
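The city-list approach could look roughly like the sketch below. The gazetteer contents, the names <code>CITIES_BY_COUNTRY</code> and <code>extract_city</code>, and the rule of preferring the last match are all assumptions for illustration. A real city list would have to come from an external source.

```python
import re

# Hypothetical gazetteer: a tiny per-country city list for illustration.
CITIES_BY_COUNTRY = {
    "JAPAN": ["Tokyo", "Osaka", "Kyoto"],
    "SOUTH KOREA": ["Seoul", "Suwon", "Busan"],
}


def build_city_regex(cities):
    """Compile one alternation per country, longest names first."""
    ordered = sorted(cities, key=len, reverse=True)
    alternation = "|".join(re.escape(city) for city in ordered)
    return re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)


CITY_REGEX = {
    country: build_city_regex(cities)
    for country, cities in CITIES_BY_COUNTRY.items()
}


def extract_city(address, country):
    """Return the last known city name found in the address, or None."""
    pattern = CITY_REGEX.get(country.strip().upper())
    if pattern is None:
        return None
    matches = pattern.findall(address)
    # Assumption: the city tends to follow the street in an address line,
    # so the last match is preferred over a city name inside a street name.
    return matches[-1] if matches else None


# Made-up address whose street name contains another city's name.
print(extract_city("2-3 Kyoto-dori, Chuo-ku, Osaka 540-0001", "Japan"))
```

Compiling one expression per country means each address is scanned once rather than once per city, which may reduce some of the time cost, though building complete city lists remains the expensive part.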
