{{Project
|Has project output=Data
|Has sponsor=McNair ProjectsCenter
|Has title=Patent Data Restructure
|Has owner=Marcela Interiano, Sonia Zhang,
|Has start date=201701
|Has deadline=201705
|Has keywords=Patent,Data
|Has project status=ActiveSubsume
|Does subsume=Patent Data (Wiki Page), Patent Data Cleanup - June 2016, Patent Data Extraction Scripts (Tool), USPTO Bulk Data Processing,
}}
Restructuring the current patent dataset requires rigorous cleaning of the data. The primary areas for improvement are:
===Matching Application and Publication Numbers===
The documentids in ptoproperty_cleaned are used to verify the kinds of the different patents as specified in the ptoproperty tables. First, the table ptopropertynd was made, including only the distinct documentids in ptoproperty_cleaned.

 DROP TABLE ptopropertynd;
 CREATE TABLE ptopropertynd AS SELECT DISTINCT * FROM ptoproperty;
 --27266638

By creating this table, I also address the duplications caused when the documentids were matched to kind XO.
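A quick sanity check on the deduplicated table can be sketched as follows. Note that SELECT DISTINCT * removes only rows that are identical in every column, so a documentid could still appear more than once if any other column differs. The queries below assume the identifier column is named documentid, which this page does not confirm; adjust the name to match the actual schema.

<syntaxhighlight lang="sql">
-- Assumption: ptopropertynd has a column named documentid (hypothetical name).
-- If the two counts are equal, every documentid appears exactly once.
SELECT COUNT(*) AS total_rows,
       COUNT(DISTINCT documentid) AS distinct_documentids
FROM ptopropertynd;

-- List any documentid that still appears more than once after deduplication.
SELECT documentid, COUNT(*) AS n
FROM ptopropertynd
GROUP BY documentid
HAVING COUNT(*) > 1
ORDER BY n DESC;
</syntaxhighlight>

If the second query returns rows, the remaining duplicates differ in at least one other column and would need a rule for choosing which row to keep.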
===Final Table (name TBD)===
