Difference between revisions of "Redesign Assignment and Patent Database"

McNair Project
Redesign Assignment and Patent Database
Project Information
Project Title	Redesign Assignment and Patent Database
Owner	Shelby Bice
Start Date	9/2017
Deadline
Keywords	patent
Primary Billing
Notes
Has project status
Is dependent on	Reproducible Patent Data
	Copyright © 2016 edegan.com. All Rights Reserved.

Revision as of 10:36, 26 September 2017

This is an extension of the work I did last semester under "Redesigning Patent Database". Instead of simply reconfiguring the existing database, this project encompasses and full redesign and creation of a new Patent database and a new Assignment database that will be joined together.

Assignment Database Structure

After reading through the paper "The USPTO Patent Assignment Dataset: Descriptions and Analysis"(https://www.uspto.gov/sites/default/files/documents/USPTO_Patents_Assignment_Dataset_WP.pdf) and taking into account are needs for the Assignment database (the ability to connect the two databases together, the ability to trace conveyance of a patent over time) I decided on the following structure for the Assignment database.

The foreign key that connects every table in the database is rf_id, which stands for reel frame id. It is unique for each entry in the main table, assignment and is included in every table.

Conveyance is the only table that cannot be fully constructed as data is inserted into the database. Some of the fields depend on other tables being constructed, so it may be either partially populated while the other tables are being populated or is the last table to be populated.

Full list of tables: assignee, assignor, document_info, assignment, conveyance

ASSIGNEE

Purpose: store all the assignees from all the assignments

Each entry represents: the recipient of a patent in an entry in the ASSIGNMENT table

Primary key: rf_id, unless it is possible to have multiple assignees, then rf_id and assignee name

All columns (variables) in the table: rf_id, name, assignee, address_line_1, address_line_2, city, state, country, postal_code

ASSIGNOR

Purpose: store all the assignors from all the assignments

Each entry represents: the one assigning/granting the license for the entry in the ASSIGNMENT table

Primary key: rf_id and assignor_name

All columns (variables): rf_id, assignor_name

CONVEYANCE

Purpose: represent the conveyance of a particular patent overtime

Each entry represents: an individual transaction for a patent (not unlike assignment, except this table has information about the assignment in relation to other assignments for the patent

Primary key: rf_id

All columns (variables): Two possibilities, depending on whether we want data to be repeated or not.

       Not-repeated (with exception of primary/foreign key): rf_id, next_transaction (also rf_id), num_conveyance (number in order of conveyance by date), acknowledgment_date (in ASSIGNOR for the assignment data paper but would be moved here. Could originally be stored in ASSIGNOR until CONVEYANCE is generated), execution_date (same deal as acknowledgment date, should probably be moved here; if there are multiple execution dates, we will store the last one) conveyance_type

       Repeated, also include: assignee, assignor, patent_id

DOCUMENT_INFO

Purpose: store extra information relevant to the patents represented in ASSIGNMENT

Each entry represents: information about the documentation and the patent for a particular entry in ASSIGNMENT

Primary key: rf_id

All columns (variables) when created: rf_id, app_num (application document USPTO number), app_date, app_country, pgpub_num (pre-grant publication document USPTO number), pgpub_date, pgpub_country, grant_num (granted patent document USPTO number), grant_date, grant_country, invention_title, language (of invention title – could be potentially useful and interesting to investigate), reel_num, frame_num (reel_num and frame_num were in ASSIGNMENT for the uspto data paper, but I think they would be more useful here)

ASSIGNMENT

Purpose: represent assignment transactions – central table for the database

Each entry represents: an individual patent assignment transaction

Primary key: rf_id

All columns (variables): rf_id, assignment_id (US patent assignment id), correspondent_name, correspondent_address1, correspondent_address_2, correspondent_address3, correspondent_address4, record_date (date recorded with USPTO), last_update_date, page_cnt (page count of assignment record)

More extensive notes exist under :E/McNair/Projects/Redesigning Patent Database/New Patent Database Project/Notes on USPTO Assignment Data Paper

Difference between revisions of "Redesign Assignment and Patent Database"

Revision as of 10:36, 26 September 2017

Contents

Assignment Database Structure

ASSIGNEE

ASSIGNOR

CONVEYANCE

DOCUMENT_INFO

ASSIGNMENT

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Sites

Sections

Organizations

Help

Tools

@@ Line 10: / Line 10: @@
 ==Assignment Database Structure==
-After reading through the paper ["The USPTO Patent Assignment Dataset:
+After reading through the paper "The USPTO Patent Assignment Dataset:
-Descriptions and Analysis"](https://www.uspto.gov/sites/default/files/documents/USPTO_Patents_Assignment_Dataset_WP.pdf) and taking into account are needs for the Assignment database (the ability to connect the two databases together, the ability to trace conveyance of a patent over time) I decided on the following structure for the Assignment database.
+Descriptions and Analysis"(https://www.uspto.gov/sites/default/files/documents/USPTO_Patents_Assignment_Dataset_WP.pdf) and taking into account are needs for the Assignment database (the ability to connect the two databases together, the ability to trace conveyance of a patent over time) I decided on the following structure for the Assignment database.
 The foreign key that connects every table in the database is rf_id, which stands for reel frame id. It is unique for each entry in the main table, assignment and is included in every table.
@@ Line 21: / Line 21: @@
 ===ASSIGNEE===
 Purpose: store all the assignees from all the assignments
 Each entry represents: the recipient of a patent in an entry in the ASSIGNMENT table
 Primary key: rf_id, unless it is possible to have multiple assignees, then rf_id and assignee name
 All columns (variables) in the table: rf_id, name, assignee, address_line_1, address_line_2, city, state, country, postal_code
 ===ASSIGNOR===
-	Purpose: store all the assignors from all the assignments
-	Each entry represents: the one assigning/granting the license for the entry in the ===ASSIGNMENT===
+Purpose: store all the assignors from all the assignments
-	Primary key: rf_id and assignor_name
-	All columns (variables): rf_id, assignor_name
+Each entry represents: the one assigning/granting the license for the entry in the ASSIGNMENT table
+Primary key: rf_id and assignor_name
+All columns (variables): rf_id, assignor_name
 ===CONVEYANCE===
-	Purpose: represent the conveyance of a particular patent overtime
+Purpose: represent the conveyance of a particular patent overtime
-	Each entry represents: an individual transaction for a patent (not unlike assignment, except this table has information about the assignment in relation to other assignments for the patent
-	Primary key: rf_id
+Each entry represents: an individual transaction for a patent (not unlike assignment, except this table has information about the assignment in relation to other assignments for the patent
-	All columns (variables): Two possibilities, depending on whether we want data to be repeated or not.
-	Not-repeated (with exception of primary/foreign key): rf_id, next_transaction (also rf_id), num_conveyance (number in order of conveyance by date), acknowledgment_date (in ASSIGNOR for the assignment data paper but would be moved here. Could originally be stored in ASSIGNOR until CONVEYANCE is generated), execution_date (same deal as acknowledgment date, should probably be moved here; if there are multiple execution dates, we will store the last one) conveyance_type
+Primary key: rf_id
-	Repeated, also include: assignee, assignor, patent_id
+All columns (variables): Two possibilities, depending on whether we want data to be repeated or not.
+        Not-repeated (with exception of primary/foreign key): rf_id, next_transaction (also rf_id), num_conveyance (number in order of conveyance by date), acknowledgment_date (in ASSIGNOR for the assignment data paper but would be moved here. Could originally be stored in ASSIGNOR until CONVEYANCE is generated), execution_date (same deal as acknowledgment date, should probably be moved here; if there are multiple execution dates, we will store the last one) conveyance_type
+        Repeated, also include: assignee, assignor, patent_id
 ===DOCUMENT_INFO===
-	Purpose: store extra information relevant to the patents represented in ASSIGNMENT
+Purpose: store extra information relevant to the patents represented in ASSIGNMENT
-	Each entry represents: information about the documentation and the patent for a particular entry in ASSIGNMENT
-	Primary key: rf_id
+Each entry represents: information about the documentation and the patent for a particular entry in ASSIGNMENT
-	All columns (variables) when created: rf_id, app_num (application document USPTO number), app_date, app_country, pgpub_num (pre-grant publication document USPTO number), pgpub_date, pgpub_country, grant_num (granted patent document USPTO number), grant_date, grant_country, invention_title, language (of invention title – could be potentially useful and interesting to investigate), reel_num, frame_num (reel_num and frame_num were in ASSIGNMENT for the uspto data paper, but I think they would be more useful here)
+Primary key: rf_id
+All columns (variables) when created: rf_id, app_num (application document USPTO number), app_date, app_country, pgpub_num (pre-grant publication document USPTO number), pgpub_date, pgpub_country, grant_num (granted patent document USPTO number), grant_date, grant_country, invention_title, language (of invention title – could be potentially useful and interesting to investigate), reel_num, frame_num (reel_num and frame_num were in ASSIGNMENT for the uspto data paper, but I think they would be more useful here)
 ===ASSIGNMENT===
-	Purpose: represent assignment transactions – central table for the database
+Purpose: represent assignment transactions – central table for the database
-	Each entry represents: an individual patent assignment transaction
-	Primary key: rf_id
+Each entry represents: an individual patent assignment transaction
-	All columns (variables): rf_id, assignment_id (US patent assignment id), correspondent_name, correspondent_address1, correspondent_address_2, correspondent_address3, correspondent_address4, record_date (date recorded with USPTO), last_update_date, page_cnt (page count of assignment record)
+Primary key: rf_id
+All columns (variables): rf_id, assignment_id (US patent assignment id), correspondent_name, correspondent_address1, correspondent_address_2, correspondent_address3, correspondent_address4, record_date (date recorded with USPTO), last_update_date, page_cnt (page count of assignment record)
 More extensive notes exist under :E/McNair/Projects/Redesigning Patent Database/New Patent Database Project/Notes on USPTO Assignment Data Paper