{{Project
|Has project output=Tool
|Has sponsor=McNair Center
|Has title=Patent Data Extraction Scripts (Tool)
|Has owner=Marcela Interiano,
|Has project status=Subsume
|Has keywords=Tool
}}
 
===Patent applications===
 
Note that our application data appears to contain only utility patents, apart from a few plant patents.
 
At the top level, an application document in spec v4.0 (and presumably in other versions) has the following elements:
<us-patent-application lang="EN" dtd-version="v4.0 2004-12-02" file="US20050000001A1-20050106.XML"
status="PARALLEL-RUN" id="us-patent-application" country="US" date-produced="20041222" date-publ="20050106">
<us-bibliographic-data-application lang="EN" country="US">
...
</us-bibliographic-data-application>
<abstract id="abstract">
</abstract>
<drawings id="DRAWINGS">
</drawings>
<description id="description">
<?summary-of-invention description="Summary of Invention" end="lead"?>
<?summary-of-invention description="Summary of Invention" end="tail"?>
<?brief-description-of-drawings description="Brief Description of Drawings" end="lead"?>
<?brief-description-of-drawings description="Brief Description of Drawings" end="tail"?>
<?detailed-description description="Detailed Description" end="lead"?>
<?detailed-description description="Detailed Description" end="tail"?>
</description>
<claims id="claims">
</claims>
</us-patent-application>
 
We are currently processing only:
<us-bibliographic-data-application lang="EN" country="US">
...
</us-bibliographic-data-application>
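
A minimal sketch of isolating that element with Python's standard library, assuming each application document has already been separated into its own string. The function name <code>extract_bibliographic_data</code> and the trimmed sample document are illustrative only and are not taken from the project's scripts.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down application document for illustration only.
SAMPLE = """<us-patent-application lang="EN" dtd-version="v4.0 2004-12-02" file="US20050000001A1-20050106.XML" status="PARALLEL-RUN" id="us-patent-application" country="US" date-produced="20041222" date-publ="20050106">
<us-bibliographic-data-application lang="EN" country="US">
</us-bibliographic-data-application>
<abstract id="abstract">
</abstract>
<claims id="claims">
</claims>
</us-patent-application>"""


def extract_bibliographic_data(document_xml):
    """Return (root attributes, bibliographic element) for one application."""
    root = ET.fromstring(document_xml)
    if root.tag != "us-patent-application":
        raise ValueError("unexpected root element: " + root.tag)
    # Only the bibliographic block is kept; abstract, drawings,
    # description and claims are ignored in this sketch.
    biblio = root.find("us-bibliographic-data-application")
    return dict(root.attrib), biblio


attrs, biblio = extract_bibliographic_data(SAMPLE)
print(attrs["file"], attrs["date-publ"], biblio.attrib)
```

The root attributes (<code>file</code>, <code>date-publ</code>, <code>dtd-version</code>) are returned alongside the element because they identify the document and the spec version it follows.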
 
===Utility patent grants fields===
The XML files for patent data are available at
*https://bulkdata.uspto.gov/
*http://patents.reedtech.com/patent-products.php
 
Patent data up to 2015 can also be obtained from https://www.google.com/googlebooks/uspto-patents.html, but that repository is no longer updated.
 
Each XML file contains, in order, sorted by document ID:
#Design patents
#Plant patents
#Reissues
#Utility patents
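
The sketch below shows one way to split such a file into individual documents and tally them by type. It assumes that every document starts with its own <code><?xml ...?></code> declaration and that the type can be read from the prefix of the root element's <code>file</code> attribute (<code>USD</code> for design, as in the sample on this page; <code>USPP</code> and <code>USRE</code> for plant and reissue are assumptions). The function names and the second sample file name are hypothetical.

```python
import re
from collections import Counter

# Assumption: each document in a bulk file begins with an XML declaration.
XML_DECLARATION = re.compile(r"<\?xml\s")
FILE_ATTRIBUTE = re.compile(r'\bfile="([^"]+)"')

# Assumed prefixes of the root element's file attribute.
PREFIX_TO_KIND = (("USD", "design"), ("USPP", "plant"), ("USRE", "reissue"))


def split_documents(bulk_text):
    """Split a concatenated bulk file into one string per document."""
    starts = [m.start() for m in XML_DECLARATION.finditer(bulk_text)]
    ends = starts[1:] + [len(bulk_text)]
    return [bulk_text[s:e] for s, e in zip(starts, ends)]


def classify(document_xml):
    """Guess the patent type from the first file="..." attribute."""
    match = FILE_ATTRIBUTE.search(document_xml)
    if match is None:
        return "unknown"
    name = match.group(1)
    for prefix, kind in PREFIX_TO_KIND:
        if name.startswith(prefix):
            return kind
    # "US" followed directly by a digit is treated as a utility patent.
    return "utility" if name[2:3].isdigit() else "other"


# Two stub documents; the second file name is made up for illustration.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<us-patent-grant file="USD0774273-20161220.XML"></us-patent-grant>\n'
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<us-patent-grant file="US09524321-20161220.XML"></us-patent-grant>\n'
)

counts = Counter(classify(doc) for doc in split_documents(SAMPLE))
print(dict(counts))
```

Splitting on the declaration rather than parsing the whole file is deliberate: a file holding several root elements is not a single well-formed XML document, so a standard parser would reject it.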
 
====Overview====
 
Design patents:
 
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE us-patent-grant SYSTEM "us-patent-grant-v45-2014-04-03.dtd" [ ]>
<us-patent-grant lang="EN" dtd-version="v4.5 2014-04-03" file="USD0774273-20161220.XML"
status="PRODUCTION" id="us-patent-grant" country="US" date-produced="20161205" date-publ="20161220">
<us-bibliographic-data-grant>
</us-bibliographic-data-grant>
<drawings id="DRAWINGS">
</drawings>
<description id="description">
<?brief-description-of-drawings description="Brief Description of Drawings" end="lead"?>
<description-of-drawings>
</description-of-drawings>
<?brief-description-of-drawings description="Brief Description of Drawings" end="tail"?>
</description>
<us-claim-statement>CLAIM</us-claim-statement>
<claims id="claims">
</claims>
</us-patent-grant>
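
To compare this layout with the other patent types, it can help to list the top-level elements of a grant programmatically. A minimal sketch, assuming a single grant document held in memory as bytes; the helper name <code>top_level_elements</code> is hypothetical and the sample simply repeats the skeleton above.

```python
import xml.etree.ElementTree as ET

# The design patent skeleton shown above, as bytes so that the
# encoding declaration is honoured by the parser.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE us-patent-grant SYSTEM "us-patent-grant-v45-2014-04-03.dtd" [ ]>
<us-patent-grant lang="EN" dtd-version="v4.5 2014-04-03" file="USD0774273-20161220.XML" status="PRODUCTION" id="us-patent-grant" country="US" date-produced="20161205" date-publ="20161220">
<us-bibliographic-data-grant>
</us-bibliographic-data-grant>
<drawings id="DRAWINGS">
</drawings>
<description id="description">
<?brief-description-of-drawings description="Brief Description of Drawings" end="lead"?>
<description-of-drawings>
</description-of-drawings>
<?brief-description-of-drawings description="Brief Description of Drawings" end="tail"?>
</description>
<us-claim-statement>CLAIM</us-claim-statement>
<claims id="claims">
</claims>
</us-patent-grant>"""


def top_level_elements(document_bytes):
    """Return the DTD version and the tags of the root's direct children."""
    root = ET.fromstring(document_bytes)
    return root.attrib.get("dtd-version"), [child.tag for child in root]


version, tags = top_level_elements(SAMPLE)
print(version)
for tag in tags:
    print(" ", tag)
```

Running the same helper on one document of each type gives a quick side-by-side view of which sections each type carries.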
 
====Patent====
</assignees>
<onlyinclude>
For further information on Assignee data from the USPTO, see [[USPTO Assignees Data]].

====Other things we might want (fields with potential)====
*Abstract
*Claims (other than their count)
</onlyinclude>
====Things we don't need====
*SymbolPosition, ClassificationValue - we likely don't need these
*Classification status and data source - their purpose is unclear
====About the scripts====
I have also downloaded all of them onto the database server, where they can be found with:
cd /bulk/patent
 
[[Category:Patent]]
