We would like to download and load data from this location on the USPTO website into our tables. The objective is to determine whether this dataset is better than the current version of our patent data (a combination of the data in the patent_2015 and patentdata databases; see [[Patent Data]]).
<section begin=bulk />The USPTO makes bulk data on patent transactions, applications, properties, reassignments, and history available to the general public as XML files. These files have been downloaded and the data compiled into PostgreSQL tables. The objective of processing the bulk data is to enhance the McNair Center's historical datasets ([[Patent Data Processing - SQL Steps|patent_2015 and patentdata]]) and to track the entirety of US patent activity, specifically concerning utility patents. <section end=bulk />
== Steps Followed to Extract the USPTO Assignees Data ==
=== Extracting Data from XML Files ===
Patent properties have a many-to-one relationship with patents: one patent can have more than one property.
Note: We are not sure what documents with kind 'X0' represent.
==== Patent Assignment ====
Every downloaded XML file has some fields associated with it, in addition to a number of patent assignment nodes.
Here are the columns in the table:
* reel_no
* frame_no
* action_key_code
* USPTO_Transaction_Date
* USPTO_Date_Produced
* version
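
As a rough illustration of how these columns could be filled, the sketch below parses a small hand-written XML sample with Python's standard library and emits one row per patent assignment node. Everything in it is an assumption made for illustration: the element and attribute names (<code>dtd-version</code>, <code>date-produced</code>, <code>action-key-code</code>, <code>transaction-date</code>, <code>reel-no</code>, <code>frame-no</code>), the sample values, the split between file-level and per-assignment fields, and the helper name <code>extract_assignment_rows</code>. It is not the script that was actually used to build the table.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: tag names, attribute names, and values are assumptions
# for illustration only, not copied from a real USPTO bulk file.
SAMPLE_XML = """
<us-patent-assignments dtd-version="0.3" date-produced="20150101">
  <action-key-code>DA</action-key-code>
  <transaction-date><date>20141231</date></transaction-date>
  <patent-assignments>
    <patent-assignment>
      <assignment-record>
        <reel-no>12345</reel-no>
        <frame-no>678</frame-no>
      </assignment-record>
    </patent-assignment>
    <patent-assignment>
      <assignment-record>
        <reel-no>12345</reel-no>
        <frame-no>690</frame-no>
      </assignment-record>
    </patent-assignment>
  </patent-assignments>
</us-patent-assignments>
"""


def extract_assignment_rows(xml_text):
    """Return one dict per patent assignment node, keyed by table column."""
    root = ET.fromstring(xml_text.strip())

    # Fields assumed to be attached to the file as a whole; they are
    # repeated on every row produced from that file.
    file_fields = {
        "action_key_code": root.findtext("action-key-code"),
        "USPTO_Transaction_Date": root.findtext("transaction-date/date"),
        "USPTO_Date_Produced": root.get("date-produced"),
        "version": root.get("dtd-version"),
    }

    rows = []
    for node in root.iter("patent-assignment"):
        # Fields assumed to be specific to each assignment node.
        row = {
            "reel_no": node.findtext("assignment-record/reel-no"),
            "frame_no": node.findtext("assignment-record/frame-no"),
        }
        row.update(file_fields)
        rows.append(row)
    return rows


rows = extract_assignment_rows(SAMPLE_XML)
for row in rows:
    print(row)
```

Repeating the file-level values on each row keeps the table flat, so every assignment can be traced back to the file it came from without a separate lookup table. Values are kept as strings here; converting the dates to a PostgreSQL date type could happen at load time.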
Here is what the XML in a downloaded file looks like: