[[Reproducible_Patent_Data#Schema_Reconciliation|Current Status]] for Granted Patents implementation
== How it works currently ==
The current implementation only carries over the parsing logic from <code>E:\McNair\PatentData\Processed\xmlparser_4.5_4.4_4.3.pl</code>.
That is to say, it only explicitly covers versions 4.3, 4.4, and 4.5 utility patents.
Plant, reissue, and design patents share some of these attributes but have their own quirks.
The code that does this XML parsing is at <code>E:\McNair\Projects\SimplerPatentData\src\main\java\org\bakerinstitute\mcnair\uspto_granted\XmlParser.java</code>. Its end goal is to create an in-memory representation of a granted patent as a <code>GrantedPatent</code> data structure, defined in <code>E:\McNair\Projects\SimplerPatentData\src\main\java\org\bakerinstitute\mcnair\models\GrantedPatent.java</code>.
To learn which fields a model contains, look at the class <code>Model.Metadata</code>, which should implement the interface <code>TableMetadata</code>. In particular, this set of constants describes how enum fields map to table column names, as well as the type of each column.
For example, <code>GrantedPatent</code> is a struct that contains the following data:
<nowiki>
private final Map<GrantedPatent.Fields, String> strings;
private final Map<GrantedPatent.Fields, Double> numbers;
private final List<Citation> citations;
private final List<Sciref> scirefs;
private final List<Inventor> inventors;
private final List<AssignmentSummary> assignments;
private final List<Lawyer> lawyers;</nowiki>
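To illustrate the enum-keyed layout described above, here is a minimal sketch of a model whose enum constants carry their own column name and column type. It is not the project's actual code: the class <code>PatentSketch</code>, the field names, the column names, and the <code>put</code> helper are all assumptions made up for illustration.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for a model such as GrantedPatent; every name here is illustrative.
final class PatentSketch {
    // Each enum constant records an assumed table column name and whether the column is numeric.
    enum Fields {
        PATENT_NUMBER("patent_number", false),
        TITLE("title", false),
        NUM_CLAIMS("num_claims", true);

        final String column;
        final boolean numeric;

        Fields(String column, boolean numeric) {
            this.column = column;
            this.numeric = numeric;
        }
    }

    // Mirrors the two maps shown above: one for string columns, one for numeric columns.
    private final Map<Fields, String> strings = new EnumMap<>(Fields.class);
    private final Map<Fields, Double> numbers = new EnumMap<>(Fields.class);

    // Routes a raw value to the map that matches the field's declared type.
    void put(Fields field, String raw) {
        if (field.numeric) {
            numbers.put(field, Double.parseDouble(raw));
        } else {
            strings.put(field, raw);
        }
    }

    String getString(Fields field) {
        return strings.get(field);
    }

    Double getNumber(Fields field) {
        return numbers.get(field);
    }
}
```

Keeping the column name and type on the enum constant means a table writer can iterate over <code>Fields.values()</code> without any per-field special cases.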
== Query Equivalences ==
Note: xpaths are for all patent types (utility, reissue, plant, design) unless otherwise noted. Only granted patents have been checked so far.
Notes to remember:
In every text file in <code>E:\McNair\Projects\SimplerPatentData\data\examples\granted</code>, the content appears under the following lines of text, which suggests that every XML should perhaps include these headers. The headers are not closed anywhere in those text files; for example, there is no <code><nowiki></us-patent-grant></nowiki></code> towards the end of the text file.
<nowiki>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE us-patent-grant SYSTEM "us-patent-grant-v45-2014-04-03.dtd" [ ]>
<us-patent-grant</nowiki>
Ignoring the above three lines in every text file (changes to the XMLs can be made later with a regex), the XMLs for ''sequence'' and ''last-name'', respectively, from the lines of code below them are:
Also note: the lack of an entry for a given version and type simply means that it was missing from the single example located in <code>E:\McNair\Projects\SimplerPatentData\data\examples\Patent Schema Reconciliation\granted</code>. A missing entry could mean either that it is missing systematically (and would presumably be missing from all other examples of that version and type), or simply that this one example, unlike other examples of that version and type, lacked it.
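The regex cleanup of the header lines mentioned above could be sketched as follows. This is only one possible approach, not existing project code: the class <code>GrantHeaderSketch</code> and its <code>normalize</code> method are hypothetical, and appending the missing closing tag is an assumption about how the unclosed root element might be handled.

```java
import java.util.regex.Pattern;

// Hypothetical helper; not part of the existing XmlParser code.
final class GrantHeaderSketch {
    // Matches the leading XML declaration, e.g. <?xml version="1.0" encoding="UTF-8"?>
    private static final Pattern XML_DECL = Pattern.compile("<\\?xml[^>]*\\?>\\s*");
    // Matches a DOCTYPE line of the form shown above (no '>' before its final '>').
    private static final Pattern DOCTYPE = Pattern.compile("<!DOCTYPE[^>]*>\\s*");
    private static final String ROOT_CLOSE = "</us-patent-grant>";

    // Drops the declaration and DOCTYPE, then appends the closing root tag if it is absent.
    static String normalize(String raw) {
        String body = XML_DECL.matcher(raw).replaceFirst("");
        body = DOCTYPE.matcher(body).replaceFirst("");
        body = body.trim();
        if (!body.endsWith(ROOT_CLOSE)) {
            body = body + "\n" + ROOT_CLOSE;
        }
        return body;
    }
}
```

The method is idempotent, so it can be applied to files that have already been cleaned without changing them further.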
***examples: W. Kordes' Söhne Rosenschulen GmbH & Co KG [this is one example, included to show the variety of characters/symbols included in orgnames]
*** <code>INVT:NAM</code> (whole name, might be of an individual)
****Photograph that is representative of an angle-cut tip for a continuous geared hinge that was publicly used or on sale more than one (1) year prior to the filing date of U.S. Appl. No. 61/231,249, filed Aug. 4, 2009.
** XML 4.0 reissue type, 4.1 plant type, 4.2 all types
Note: foreign countries had no listed state; territories or states were included in the city listing. For the sake of time these haven't been checked for being domestic; presumably some of these versions lack ''state'' entries.
** XML v4.5 types utility, plant, reissue; v4.4 plant, reissue, design types; v4.3 all types; v4.2 reissue, plant types; v4.1 all types; v4.0 utility and reissue types.
No address data exists for any versions or examples:
* '''CITY'''
** XML 4.3, 4.4, 4.5
*** ?
** APS
*** <code>LREP:CTY</code>
* '''COUNTRY'''
** XML 4.3, 4.4, 4.5
*** ?
** APS
*** <code>LREP:CNT</code>
* '''STATE'''
** XML 4.3, 4.4, 4.5
*** ?
** APS
*** <code>LREP:STA</code>
* '''ADDRESS'''
** XML 4.3, 4.4, 4.5
*** ?
** APS
*** <code>LREP:STR</code>
* '''POSTCODE'''
** XML 4.3, 4.4, 4.5
*** ?
** APS
*** <code>LREP:ZIP</code>
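The APS side of the list above can be captured as a small lookup table. The sketch below is an illustration only: the class <code>LawyerAddressTags</code>, the <code>Field</code> enum, and the <code>fieldFor</code> method are hypothetical names, and only the five <code>LREP</code> codes come from the list. The XML 4.3 to 4.5 equivalents are still unknown, so they are left out.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup from APS tag codes to lawyer address fields.
final class LawyerAddressTags {
    enum Field { CITY, COUNTRY, STATE, ADDRESS, POSTCODE }

    // APS codes exactly as listed above; no XML xpaths are known yet.
    private static final Map<String, Field> APS = Map.of(
            "LREP:CTY", Field.CITY,
            "LREP:CNT", Field.COUNTRY,
            "LREP:STA", Field.STATE,
            "LREP:STR", Field.ADDRESS,
            "LREP:ZIP", Field.POSTCODE);

    // Returns the field for an APS code, or empty when the code is null or unmapped.
    static Optional<Field> fieldFor(String apsTag) {
        if (apsTag == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(APS.get(apsTag));
    }
}
```

For example, <code>LawyerAddressTags.fieldFor("LREP:CTY")</code> yields <code>CITY</code>, while an unlisted code yields an empty result instead of throwing.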
===citations header===
==Applications==
*The only examples in the applications folder were plant type v4.2-v4.4, and utility type v4.0-v4.4. Two files with v1.5 and v1.6 (of unknown type) were ignored.
== Examples ==
* APS
The first APS entry of <code>E:\McNair\Projects\SimplerPatentData\data\extracts\granted\pftaps19760106_wk01.txt</code> as a GrantedPatent is
extras: {class org.bakerinstitute.mcnair.models.Citation=[], class org.bakerinstitute.mcnair.models.Sciref=[], class org.bakerinstitute.mcnair.models.Inventor=[strings: {STATE=MI, ORG_NAME=Widenhofer; James W., COUNTRY=null, CITY=Jackson, ADDRESS=null, POSTCODE=null, CITING_PATENT=RE0286710}
integers: {}, strings: {CITING_PATENT=D0774723, CITATION_DESCRIPTION=Lazure | 2014: A bleeding edge effort in chips . . . , posted on Dec. 11, 2014, no copyright date posted [online], [site visited Aug. 17, 2016]. Available from Internet, <URL: https://lazure2.wordpress.com>.}