We would like to download and absorb the data from this location on the USPTO website into our tables. The objective is to determine whether this dataset is better than the current version of our patent data (a combination of the data in the patent_2015 and patentdata databases; see [[Patent Data]]).
<section begin=bulk />The USPTO provides bulk data on patent transactions, applications, properties, reassignments, and history to the general public through XML files. These files have been downloaded, and the data has been compiled into tables using PostgreSQL. The objective of processing the bulk data is to enhance the McNair Center's historical datasets ([[Patent Data Processing - SQL Steps|patent_2015 and patentdata]]) and to track the entirety of US patent activity, specifically utility patents. <section end=bulk />

== Steps Followed to Extract the USPTO Assignees Data ==
Each of the above internal nodes is mandatory and represents a logical grouping of information fields. Each node has a corresponding table with largely the same fields as the XML elements.
Corresponding tables are:
*assignment-records : assignment
*patent-assignors : assignors
*patent-assignees : assignees
*patent-properties : properties
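The node-to-table mapping above can be sketched in Python with the standard-library XML parser. The sample XML fragment and its element names are simplified assumptions for illustration, not the full USPTO schema:

```python
# Sketch: map each internal node of a patent assignment to its destination
# table. Element names in the sample are assumed from the mapping above.
import xml.etree.ElementTree as ET

NODE_TO_TABLE = {
    "assignment-records": "assignment",
    "patent-assignors": "assignors",
    "patent-assignees": "assignees",
    "patent-properties": "properties",
}

sample = """
<patent-assignment>
  <assignment-records><recorded-date>2016-01-15</recorded-date></assignment-records>
  <patent-assignors><patent-assignor/></patent-assignors>
  <patent-assignees><patent-assignee/></patent-assignees>
  <patent-properties><patent-property/></patent-properties>
</patent-assignment>
"""

def nodes_to_tables(xml_text):
    """Return the destination table for each recognized internal node."""
    root = ET.fromstring(xml_text)
    return {child.tag: NODE_TO_TABLE[child.tag]
            for child in root if child.tag in NODE_TO_TABLE}

print(nodes_to_tables(sample))
```

In a full loader, each matched node would be flattened into a row and inserted into its table; this sketch only shows the routing step.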
Additionally, for each file that is downloaded, there are associated file specifications. All of these are stored in the PatentAssignment table. Here is the data model diagram.
==== Assignment Records ====
The fields in the assignment record are:
* last_update_date
* purge_indicator
* recorded_date
* correspondent_name
* correspondent_address_1
* correspondent_address_2
* correspondent_address_3
* correspondent_address_4
* conveyance_text
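The fields above can be pulled out of an assignment record with a small flattening pass. This is a hedged sketch: the hyphenated XML tag names and the nested correspondent block are assumptions for illustration, not the verified USPTO schema.

```python
# Sketch: flatten an assignment-record element into a row dict keyed by the
# column names listed above. Tag names in the sample are assumed.
import xml.etree.ElementTree as ET

sample = """
<assignment-record>
  <last-update-date>2016-02-01</last-update-date>
  <purge-indicator>N</purge-indicator>
  <recorded-date>2016-01-15</recorded-date>
  <correspondent>
    <name>ACME IP LAW</name>
    <address-1>100 MAIN ST</address-1>
  </correspondent>
  <conveyance-text>ASSIGNMENT OF ASSIGNORS INTEREST</conveyance-text>
</assignment-record>
"""

def record_to_row(xml_text):
    """Flatten leaf elements into {column_name: text}."""
    root = ET.fromstring(xml_text)
    row = {}
    for el in root.iter():
        if el is root or len(el):  # skip the root and container elements
            continue
        col = el.tag.replace("-", "_")
        # fields nested under <correspondent> map to correspondent_* columns
        if col in ("name", "address_1", "address_2", "address_3", "address_4"):
            col = "correspondent_" + col
        row[col] = el.text
    return row

print(record_to_row(sample))
```

Each resulting dict corresponds to one row of the assignment table.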
Here is the corresponding XML that we are mapping: