{{Project|Has project output=Data|Has sponsor=McNair Center
|Has title=Patent Data Restructure
|Has owner=Marcela Interiano, Sonia Zhang,
|Has start date=201701
|Has deadline=201705
|Has keywords=Patent,Data
|Has project status=Subsume
|Does subsume=Patent Data (Wiki Page), Patent Data Cleanup - June 2016, Patent Data Extraction Scripts (Tool), USPTO Bulk Data Processing,
}}
Restructuring the current patent dataset requires rigorous cleaning of the data. The primary areas for improvement are:
===Matching Application and Publication Numbers===
The documentids in ptoproperty_cleaned were checked again to see which numbers would match and to verify the patent kind of the different patents, as specified in the ptoproperty tables. First, the table ptopropertynd was made, including only the distinct documentids in ptoproperty_cleaned.
 DROP TABLE ptopropertynd;
 CREATE TABLE ptopropertynd AS SELECT DISTINCT * FROM ptoproperty;
 --27266638
By creating this table, I also address the duplications caused by the kind XO.
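The deduplication step described above can be tried on a small scale before running it against the full table. The following is only a sketch: it uses Python's built-in sqlite3 module with an in-memory database rather than the project's actual database, and the two columns (documentid, kind), the sample rows, and the helper name build_distinct_table are hypothetical stand-ins, not the real ptoproperty schema.

```python
import sqlite3


def build_distinct_table(conn, source="ptoproperty", target="ptopropertynd"):
    """Rebuild `target` from the distinct rows of `source`.

    Returns (row count of source, row count of target) so the effect of
    the deduplication can be checked.
    """
    cur = conn.cursor()
    # Drop any earlier copy so the script can be rerun safely.
    cur.execute(f"DROP TABLE IF EXISTS {target}")
    # SELECT DISTINCT * removes rows that are identical in every column.
    cur.execute(f"CREATE TABLE {target} AS SELECT DISTINCT * FROM {source}")
    before = cur.execute(f"SELECT COUNT(*) FROM {source}").fetchone()[0]
    after = cur.execute(f"SELECT COUNT(*) FROM {target}").fetchone()[0]
    conn.commit()
    return before, after


# Hypothetical sample data: two of the five rows are exact duplicates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ptoproperty (documentid TEXT, kind TEXT)")
conn.executemany(
    "INSERT INTO ptoproperty VALUES (?, ?)",
    [
        ("10000001", "B1"),
        ("10000001", "B1"),  # duplicate row
        ("10000002", "A1"),
        ("10000003", "XO"),
        ("10000003", "XO"),  # duplicate row
    ],
)

before, after = build_distinct_table(conn)
print(f"{before} rows in ptoproperty, {after} distinct rows in ptopropertynd")
```

Comparing the two counts is a quick check on how many duplicate rows the step removed. Note that SELECT DISTINCT * only collapses rows that match in every column, so rows sharing a documentid but differing elsewhere would remain.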
===Final Table (name TBD)===
