Patent Schema Reconciliation
Example files
E:\McNair\Projects\SimplerPatentData\data\examples
There are two sets:
- Granted
- Applications
Applications contains only utility and some plant patents, whereas Granted contains all four patent types: design, plant, reissue, and utility. Both sets come in multiple versions (e.g., v4.5, v4.4, v4.3).
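Because each file has to be matched to a set and a version before its XPaths can be recorded, a small helper that reads both from the root element may be useful. This is a minimal sketch, not a tested tool: the root element names (us-patent-grant, us-patent-application) and the dtd-version attribute are assumptions modelled on v4.x files and should be checked against the example files; describe and SAMPLE are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-line sample. The root tag and the dtd-version attribute
# are assumptions modelled on v4.x files; verify against the example files.
SAMPLE = '<us-patent-grant lang="EN" dtd-version="v4.5 2014-04-03" country="US"></us-patent-grant>'

# Assumed mapping from root element name to the set it belongs to.
ROOT_TO_SET = {
    "us-patent-grant": "granted",
    "us-patent-application": "applications",
}


def describe(xml_text):
    """Return (set_name, version) for one XML document given as a string."""
    root = ET.fromstring(xml_text)
    set_name = ROOT_TO_SET.get(root.tag, "unknown")
    # Assumed attribute format: version label followed by a date, e.g. "v4.5 2014-04-03".
    parts = root.get("dtd-version", "").split()
    version = parts[0] if parts else "unknown"
    return set_name, version


print(describe(SAMPLE))
```

Older files (including the APS-format ones) will not match these assumptions and would come back as "unknown", which at least flags them for manual inspection.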
The Task
For both sets (starting with granted), all types, and all versions, we need to identify the XPath (or APS equivalent, see below) for each node.
A node is something like:
- patent number (it shows up as document_id)
- filing number (it also shows up as a document_id in another place)
- grant date
- kind
- type
- applicationnumber
- filingdate
Some nodes are lists of other nodes; for example, the assignees node contains multiple assignment records.
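One way to record the result of the task is a per-version table that maps each node name to its XPath, which can then be run against a file to check that every path resolves. The sketch below assumes element names modelled on v4.x grant XML (us-bibliographic-data-grant, publication-reference, application-reference, document-id) and uses made-up sample values; the paths must be verified against the example files. XPATHS_BY_VERSION, extract, and SAMPLE_GRANT are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Made-up sample record. Element names are assumptions modelled on v4.x
# grant XML and must be verified against the real example files.
SAMPLE_GRANT = """<us-patent-grant dtd-version="v4.5 2014-04-03">
  <us-bibliographic-data-grant>
    <publication-reference>
      <document-id>
        <country>US</country>
        <doc-number>D0123456</doc-number>
        <kind>S1</kind>
        <date>20150106</date>
      </document-id>
    </publication-reference>
    <application-reference appl-type="design">
      <document-id>
        <country>US</country>
        <doc-number>29456789</doc-number>
        <date>20130531</date>
      </document-id>
    </application-reference>
  </us-bibliographic-data-grant>
</us-patent-grant>"""

BIB = "us-bibliographic-data-grant"
PUB = BIB + "/publication-reference/document-id"
APP = BIB + "/application-reference/document-id"

# node name -> (element path relative to the root, attribute name or None)
XPATHS_BY_VERSION = {
    "v4.5": {
        "PATENT_COUNTRY": (PUB + "/country", None),
        "PATENT_NUMBER": (PUB + "/doc-number", None),
        "PATENT_KIND": (PUB + "/kind", None),
        "PATENT_GRANT_DATE": (PUB + "/date", None),
        "APPLICATION_NUMBER": (APP + "/doc-number", None),
        "APPLICATION_FILING_DATE": (APP + "/date", None),
        "PATENT_TYPE": (BIB + "/application-reference", "appl-type"),
    },
}


def extract(root, paths):
    """Resolve every node path against root; unresolved paths map to None."""
    record = {}
    for node, (path, attr) in paths.items():
        element = root.find(path)
        if element is None:
            record[node] = None
        elif attr is not None:
            record[node] = element.get(attr)
        else:
            record[node] = (element.text or "").strip()
    return record


record = extract(ET.fromstring(SAMPLE_GRANT), XPATHS_BY_VERSION["v4.5"])
for node, value in record.items():
    print(node, value)
```

The patent number and the application number both sit in a doc-number element under a document-id, so only the parent (publication-reference versus application-reference) tells them apart. Nodes that come back as None for a given version are the ones whose XPath still needs to be found.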
Task Notes
Details from Joe Reilly Work Logs (log page)
Added XPaths for the following nodes to both the Equivalent XPath and APS Queries page and Patent Schema Reconciliation text.txt (in E:\McNair\Projects\SimplerPatentData\data\examples\Patent Schema Reconciliation). Examples are listed on the Equivalent XPath and APS Queries page:
- PATENT_TYPE
- TITLE
- PCT_DOCUMENT_NUMBER
- PATENT_COUNTRY
- PATENT_NUMBER
- PATENT_KIND
- PATENT_GRANT_DATE
- APPLICATION_NUMBER
- APPLICATION_FILING_DATE
- PRIORITY_CLAIMS_DATE
- PRIORITY_CLAIMS_COUNTRY
- PRIORITY_CLAIMS_PATENT_NUMBER
- IPCR_SUBCLASS
- IPCR_MAIN_GROUP
- IPCR_SUB_GROUP
- CPC_SUBCLASS
- CPC_MAIN_GROUP
- CPC_SUB_GROUP
- CLASSIFICATION_NATIONAL_COUNTRY
- CLASSIFICATION_NATIONAL_CLASS
- PRIMARY_EXAMINER_FIRST_NAME
- PRIMARY_EXAMINER_LAST_NAME
- PRIMARY_EXAMINER_DEPARTMENT
The following nodes were checked and updated only on the Equivalent XPath and APS Queries page, with examples listed there. THEY STILL HAVE TO BE ADDED TO Patent Schema Reconciliation text.txt in E:\McNair\Projects\SimplerPatentData\data\examples\Patent Schema Reconciliation:
- NUMBER_OF_CLAIMS
- SEQUENCE
- LAST_NAME
- FIRST_NAME
I haven't looked for or verified XPaths for any of the remaining nodes, which include:
- Under "Citations": ORG_NAME, CITY, COUNTRY, STATE, ADDRESS, POSTCODE, CITATION_DESCRIPTION
- All listed nodes under "inventors", "assignments", and "lawyers"
Useful links
The Equivalent_XPath_and_APS_Queries#Query_Equivalences page has example XPath statements
The Reproducible_Patent_Data#Schema_Reconciliation page shows which schemas are associated with which year
The Patent_Data_Extraction_Scripts_(Tool)#Utility_patent_grants_fields page has examples of nodes and where to find them for utility patents (XML version 4.4, I think).