Changes

Jump to navigation Jump to search
no edit summary
{{Project|Has project output=Data|Has sponsor=McNair ProjectsCenter
|Has title=Patent Data Restructure
|Has owner=Marcela Interiano, Sonia Zhang,
|Has start date=201701
|Has deadline=201705
||Has keywords=Patent,Data|Has project status=ActiveSubsume|Does subsume=Patent Data (Wiki Page), Patent Data Cleanup - June 2016, Patent Data Extraction Scripts (Tool), USPTO Bulk Data Processing,
}}
In order to restructure the current patent dataset, the data requires rigorous cleaning. The primary areas for improvement are:
SELECT documentid, max(recorded_date) as recorded_date FROM ptoassigneev2 GROUP BY documentid;
--4374885
 
DROP TABLE ptoassignee_patent;
CREATE TABLE ptoassignee_patent AS
SELECT M1.reelno, M1.frameno, M1.documentid, M1.name, M1.last_update_date, M1.recorded_date, M2.addrline1,
M2.addrline2, M2.city, M2.state, M2.country, M2.postcode
FROM ptotracking2 M1, ptoassigneend M2 WHERE (M1.reelno = M2.reelno) AND (M1.frameno = M2.frameno);
--16581236
 
DROP TABLE ptoassignee_current;
CREATE TABLE ptoassignee_current AS
SELECT M1.reelno, M1.frameno, M2.documentid, M2.recorded_date FROM ptoassignee_patent M1, datecheck M2
WHERE (M1.documentid = M2.documentid)
AND (M1.recorded_date = M2.recorded_date);
--6729698
 
A final version of the ptoassignee_current table was made using ptoassigneev2 given the larger pool of documentids included in the table by matching using documentid and recorded dates from datecheck.
DROP TABLE ptoassignee_current;
FROM ptoassigneev2 M1, datecheck M2
WHERE (M1.documentid = M2.documentid) AND (M1.recorded_date = M2.recorded_date);
--50040384994869 These codes should be used to recreate this table using Sonia's updated address information. ===Matching Application and Publication Numbers===The ptoproperty_cleaned documentids to verify the kind of different patents as specified in the ptoproperty tables.  First the table ptopropertynd was made, including only the distinct documentids in ptoproperty_cleaned.   DROP ptopropertynd; CREATE TABLE ptopropertynd AS SELECT DISTINCT * FROM ptoproperty; --27266638
ADD IN THE HISTPATENT STUFF FOR THE APPNUMBy creating this table, I also address the duplications caused by the kind XO.
===Final Table (name TBD)===

Navigation menu