Difference between revisions of "Patent Assignment Data Restructure"

From edegan.com

Revision as of 16:01, 2 March 2017

Before the current patent dataset can be restructured, the data requires rigorous cleaning. The primary areas for improvement are:

1. Clean the ptoassignment table so that its keys are unique.
2. Clean ptoproperties to remove non-utility patents (including patent numbers, application numbers, and something else that we haven't matched yet).
3. Clean ptoassignee to extract and clean up the address components.
4. Check that all patent numbers are accounted for in ptoassignee_currentusa.
5. Clean up the correspondence addresses.
6. Transform the structure.
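
As a rough illustration of steps 1, 2 and 4, the checks could be sketched in Python as below. This is a minimal sketch under stated assumptions, not the procedure actually used on the tables: the field name `reel_frame`, the sample rows, and the helper names are all hypothetical. It also assumes that utility patent numbers are purely numeric, while non-utility patents carry a letter prefix (for example `D` for design, `PP` for plant, `RE` for reissue).

```python
import re

# Assumption: a utility patent number is all digits once commas are stripped.
UTILITY_RE = re.compile(r"\d+")


def dedupe_by_key(rows, key):
    """Step 1 (sketch): keep the first row seen for each value of `key`."""
    seen = set()
    unique = []
    for row in rows:
        value = row[key]
        if value not in seen:
            seen.add(value)
            unique.append(row)
    return unique


def normalize(patent_number):
    """Strip thousands separators and surrounding whitespace."""
    return patent_number.replace(",", "").strip()


def is_utility(patent_number):
    """Step 2 (sketch): True if the normalized number is purely numeric."""
    return UTILITY_RE.fullmatch(normalize(patent_number)) is not None


def missing_patents(expected, present):
    """Step 4 (sketch): patent numbers in `expected` that `present` lacks."""
    return sorted(set(expected) - set(present))


# Hypothetical sample data; the real tables will have different columns.
assignments = [
    {"reel_frame": "1000-1", "conveyance": "ASSIGNMENT"},
    {"reel_frame": "1000-1", "conveyance": "ASSIGNMENT"},
    {"reel_frame": "1000-2", "conveyance": "SECURITY"},
]
properties = ["7123456", "D512345", "PP12345", "RE40000", "8,000,001"]
current_usa = ["7123456"]

unique_assignments = dedupe_by_key(assignments, "reel_frame")
utility = [normalize(p) for p in properties if is_utility(p)]
missing = missing_patents(utility, current_usa)
```

Keeping the first row per key is only one possible policy. If duplicate rows differ in their other fields, they would need to be inspected before any are dropped.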