where arg1 and arg2 are arguments passed to the script either directly or by name (e.g., argument1=)
====For the Main Patent Data====
You'll need to run a series of scripts:
#splitter.pl to break the files into individual xml files
=====USPTO_Parser.pl=====
USPTO_Parser.pl can be found under
Note that the zip files should appear briefly (sequentially) in E:/McNair/Software/Scripts/Patent before disappearing and reappearing unzipped in E:/McNair/PatentData.
=====splitter.pl=====
Now we need to split the files into individual, valid xml files. To do this:
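For illustration only (this is not splitter.pl, and the actual invocation is not reproduced here), the idea of splitting can be sketched in Python. The sketch assumes that each bulk file is a concatenation of XML documents and that every document begins with a line starting with an XML declaration (<code><?xml</code>). The function names and the output naming scheme are hypothetical.

```python
import os

# Assumption: every document in the bulk file starts with a line
# that begins with an XML declaration.
XML_DECL = "<?xml"


def split_concatenated_xml(lines):
    """Yield each XML document found in an iterable of text lines."""
    current = []
    for line in lines:
        # A new declaration closes the document collected so far.
        if line.startswith(XML_DECL) and current:
            yield "".join(current)
            current = []
        current.append(line)
    if current:
        yield "".join(current)


def write_documents(source_path, out_dir):
    """Write each document to its own file (hypothetical naming scheme)."""
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.splitext(os.path.basename(source_path))[0]
    paths = []
    with open(source_path, encoding="utf-8") as handle:
        for index, doc in enumerate(split_concatenated_xml(handle), start=1):
            path = os.path.join(out_dir, f"{base}_{index}.xml")
            with open(path, "w", encoding="utf-8") as out:
                out.write(doc)
            paths.append(path)
    return paths
```

Each output file then holds exactly one declaration followed by one document, which is what a standard XML parser expects.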
=====xmlparser_4.5_4.4_4.3.pl=====
The next step is to parse the actual files. '''Do not use the perl script PatentParser.pl'''. This script is out of date.
*I also confirmed that this was the highest patent number, with the latest grant date, in the PatentDB database
*I therefore put every folder from ipg160329 to ipg161227 into the E:\McNair\PatentData\Processed\2016 folder
*The script (xmlparser_4.5_4.4_4.3.pl) required substantial modification. In particular, the data structure implied by E:\McNair\Software\Scripts\Patent\createTables.sql was inadequate. As a consequence, I created a new database called PatentUpdate and modified its structure. All of the additional data is now loaded there.
====Next steps====
We now need to:
#Retrieve the data out of PatentUpdate and reprocess it so that it will fit into ''patent'' on the main db server
===For the USPTO Assignment Data===
====Downloading the data====
For USPTO Assignment Data, there is a script under McNair/usptoAssignment called USPTO_Assignee_Download. It takes a text file (a file ending in .txt) containing the URL(s) of the assignment data to download, and then downloads all of the zip files available at each URL. An example called BaseUrls.txt can be found in McNair/usptoAssignment. It contains the URL that you will probably use to download the assignment data, unless you are downloading data from the current year, which is under a different link. The script places the downloaded zip files in "E:/McNair/usptoAssigneeData/name", where "name" is the name of the file. To check which files have already been processed, look in "McNair/usptoAssigneeData/Finished" for the finished zip files. (In the future, the script should be updated, if possible, to let the user specify which years to download, since all assignment data that is not from the current year is under one URL, and we've already downloaded most of it.)
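The download flow described above can be sketched roughly as follows. This is a minimal Python illustration, not the USPTO_Assignee_Download script itself: it assumes that each URL in the text file points to an HTML index page whose links end in .zip, and all function names are hypothetical. Skipping files that already appear in the Finished folder is an assumed convenience, not a documented feature of the real script.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen, urlretrieve


class ZipLinkParser(HTMLParser):
    """Collect absolute URLs of links whose target ends in .zip."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".zip"):
            self.links.append(urljoin(self.base_url, href))


def read_url_list(path):
    """Read one URL per non-empty line, as in a file like BaseUrls.txt."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def find_zip_links(base_url, html):
    parser = ZipLinkParser(base_url)
    parser.feed(html)
    return parser.links


def pending_downloads(links, dest_dir, finished_dir):
    """Pair each link with its target path, skipping finished files (assumption)."""
    pending = []
    for link in links:
        name = link.rsplit("/", 1)[-1]
        if os.path.exists(os.path.join(finished_dir, name)):
            continue
        pending.append((link, os.path.join(dest_dir, name)))
    return pending


def download_all(url_list_path, dest_dir, finished_dir):
    os.makedirs(dest_dir, exist_ok=True)
    for base_url in read_url_list(url_list_path):
        with urlopen(base_url) as response:
            html = response.read().decode("utf-8", errors="replace")
        links = find_zip_links(base_url, html)
        for link, target in pending_downloads(links, dest_dir, finished_dir):
            urlretrieve(link, target)
```

Separating link discovery from the download step makes it easy to check which files would be fetched before any network transfer starts.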
while it parses the file.
====For the USPTO Maintenance Fee Data====
Download the file manually from https://bulkdata.uspto.gov/data2/patent/maintenancefee/ and replace the existing file in McNair/Patent Data/Maintenance Fee Data.