Difference between revisions of "Patent Schema Reconciliation"

Latest revision as of 18:56, 14 November 2017

Example files

E:\McNair\Projects\SimplerPatentData\data\examples

There are two sets:

  • Granted
  • Applications

The Applications set contains only utility and some plant patents, whereas the Granted set contains design, plant, reissue, and utility patents (i.e., all four types of patents). Both sets come in multiple schema versions (e.g., v4.5, v4.4, v4.3, etc.).

The Task

For both sets (starting with granted), and for all patent types and all versions, we need to identify the XPath (or APS equivalent, see below) for each node.

A node is something like:

  • patent number (it shows up as document_id)
  • filing number (it also shows up as a document_id in another place)
  • grant date
  • kind
  • type
  • applicationnumber
  • filingdate

Some nodes are lists of other nodes; for example, the assignees node contains multiple assignment records.
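As a rough illustration of the task, the sketch below keeps one XPath per node per schema version and applies them to a tiny made-up grant record. Everything in it is an assumption for illustration: the sample XML, its element names, the XPATHS table, and the extract helper are hypothetical and have not been checked against the example files or the reconciliation text file. Every node is returned as a list, so that list-type nodes such as the assignees are handled the same way as single-valued ones.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only: the element names are assumed, not copied
# from the files in the examples directory.
SAMPLE = """<us-patent-grant dtd-version="v4.5">
  <us-bibliographic-data-grant>
    <publication-reference>
      <document-id>
        <country>US</country>
        <doc-number>1234567</doc-number>
        <kind>B2</kind>
        <date>20170101</date>
      </document-id>
    </publication-reference>
    <application-reference>
      <document-id>
        <doc-number>15000000</doc-number>
        <date>20150101</date>
      </document-id>
    </application-reference>
    <assignees>
      <assignee><orgname>Example Corp</orgname></assignee>
      <assignee><orgname>Sample LLC</orgname></assignee>
    </assignees>
  </us-bibliographic-data-grant>
</us-patent-grant>"""

BIB = "./us-bibliographic-data-grant"

# Hypothetical reconciliation table: schema version -> node name -> XPath.
# Other versions would get their own entry once their paths are verified.
XPATHS = {
    "v4.5": {
        "PATENT_NUMBER": BIB + "/publication-reference/document-id/doc-number",
        "PATENT_KIND": BIB + "/publication-reference/document-id/kind",
        "PATENT_GRANT_DATE": BIB + "/publication-reference/document-id/date",
        "APPLICATION_NUMBER": BIB + "/application-reference/document-id/doc-number",
        "APPLICATION_FILING_DATE": BIB + "/application-reference/document-id/date",
        "ORG_NAME": BIB + "/assignees/assignee/orgname",
    },
}


def extract(root, version):
    """Return {node name: list of text values} for one patent record."""
    record = {}
    for node, xpath in XPATHS[version].items():
        # findall returns every match, so list nodes yield several values.
        record[node] = [element.text for element in root.findall(xpath)]
    return record


root = ET.fromstring(SAMPLE)
record = extract(root, root.get("dtd-version"))
for node, values in record.items():
    print(node, values)
```

Note that the patent number and the application number are both read from a doc-number element inside a document-id; only the parent element in the path distinguishes them, which is why the full path has to be recorded for each node.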

Task Notes

Details from Joe Reilly Work Logs (log page)

Added XPaths for the nodes below to both the Equivalent XPath and APS Queries page and the file Patent Schema Reconciliation text.txt in E:\McNair\Projects\SimplerPatentData\data\examples\Patent Schema Reconciliation. Examples are also listed on the Equivalent XPath and APS Queries page. The nodes covered are:

Granted


strings


PATENT_TYPE

TITLE

PCT_DOCUMENT_NUMBER

PATENT_COUNTRY

PATENT_NUMBER

PATENT_KIND

PATENT_GRANT_DATE

APPLICATION_NUMBER

APPLICATION_FILING_DATE

PRIORITY_CLAIMS_DATE

PRIORITY_CLAIMS_COUNTRY

PRIORITY_CLAIMS_PATENT_NUMBER

IPCR_SUBCLASS

IPCR_MAIN_GROUP

IPCR_SUB_GROUP

CPC_SUBCLASS

CPC_MAIN_GROUP

CPC_SUB_GROUP

CLASSIFICATION_NATIONAL_COUNTRY

CLASSIFICATION_NATIONAL_CLASS

PRIMARY_EXAMINER_FIRST_NAME

PRIMARY_EXAMINER_LAST_NAME

PRIMARY_EXAMINER_DEPARTMENT


numbers


NUMBER_OF_CLAIMS


applicants


SEQUENCE

LAST_NAME

FIRST_NAME

ORG_NAME

CITY

COUNTRY

STATE

ADDRESS

POSTCODE


citations


CITATION_DESCRIPTION

CITATION NUMBER

NPL CITATION NUMBER

COUNTRY

CITATIONS DOC NUMBER

CITATIONS KIND

CITATIONS NAME

CITATIONS DATE

SEQUENCE

LAST_NAME

FIRST_NAME

CITY

COUNTRY

STATE

ADDRESS

LAST_NAME

FIRST_NAME

ORG_NAME

CITY

COUNTRY

STATE

ADDRESS


lawyers

SEQUENCE

FIRST_NAME

ORG_NAME

Useful links

The Equivalent_XPath_and_APS_Queries#Query_Equivalences page has example XPath statements

The Reproducible_Patent_Data#Schema_Reconciliation page shows which schemas are associated with which year

The Patent_Data_Extraction_Scripts_(Tool)#Utility_patent_grants_fields page has examples of nodes and where to find them for utility patents (XML version 4.4, I think).