Changes

#Log on to the RDP
#Open powershell
#Change directory to wherever the script is located by running two commands in turn: cd e: and then cd /mcnair/whatever
#Run the script by doing:
perl scriptname arg1 arg2 ...
====For the Main Patent Data====
You'll need to run a series of scripts:
#USPTO_Parser.pl to get the zip files from the USPTO and unzip them
#splitter.pl to break the files into individual xml files
=====USPTO_Parser.pl=====
USPTO_Parser.pl can be found under
E:/McNair/Software/Scripts/Patent
Note: the zip files should appear briefly (one after another) in E:/McNair/Software/Scripts/Patent before disappearing and reappearing, unzipped, in E:/McNair/PatentData.
=====splitter.pl=====
Now we need to split the files into individual, valid xml files. To do this:
Move the files to be split into E:/McNair/PatentData/Queue
Run the command:
Each file will then be blown out into a directory of xml files in E:/McNair/PatentData/Processed
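The "move the files into the Queue" step can be scripted if there are many files. The following is only a minimal sketch in Python, not part of the existing Perl tooling: the function name queue_files and the "*.xml" pattern are assumptions made for illustration, so adjust the pattern to whatever extension the unzipped files actually have.

```python
import shutil
from pathlib import Path


def queue_files(source_dir, queue_dir, pattern="*.xml"):
    """Move every file in source_dir that matches pattern into queue_dir.

    Hypothetical helper: the pattern is an assumption, not something
    splitter.pl is documented to require.
    """
    source = Path(source_dir)
    queue = Path(queue_dir)
    queue.mkdir(parents=True, exist_ok=True)  # create the Queue folder if missing

    moved = []
    # glob() is non-recursive, so files already in a subfolder are left alone
    for path in sorted(source.glob(pattern)):
        if path.is_file():
            target = queue / path.name
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved


# Example call, using the directories named on this page:
# queue_files("E:/McNair/PatentData", "E:/McNair/PatentData/Queue")
```

The function returns the list of moved paths, so it is easy to confirm what ended up in the Queue before running splitter.pl.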
=====xmlparser_4.5_4.4_4.3.pl=====
The next step is to parse the actual files.
For the patent data files, based on the existing documentation, it looks like PatentParser, found in McNair/Software/Scripts/Patent, has to be run on each xml file that was downloaded and unzipped during the previous step.
 
(For future updates to the perl files, we should update this script so that it can be run on a directory of files like the parser for USPTO assignment data).
*It then stores the parsed output for all of the xml files in a single text file called "Results.txt" (which I assume will have to be deleted afterward). This script uses the Claim.pm, Inventor.pm, PatentApplication.pm, and Loader.pm modules.
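Until the parser itself accepts a directory, running it once per xml file can be automated with a small wrapper. The sketch below is in Python and rests on an assumption that is not confirmed anywhere on this page: that the parser script takes the path of one xml file as its only argument. The function names build_commands and run_all are hypothetical.

```python
import subprocess
from pathlib import Path


def build_commands(script, xml_dir):
    """Return one 'perl <script> <file>' command per xml file in xml_dir.

    Assumption: the parser takes a single xml file path as its argument.
    """
    xml_files = sorted(Path(xml_dir).glob("*.xml"))
    return [["perl", str(script), str(f)] for f in xml_files]


def run_all(commands, dry_run=True):
    """Print the commands (dry run) or execute them one after another."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the batch as soon as one parse fails
            subprocess.run(cmd, check=True)


# Example call; the directory argument is a placeholder for one of the
# folders that splitter.pl produced:
# cmds = build_commands("xmlparser_4.5_4.4_4.3.pl", "path/to/one/split/directory")
# run_all(cmds)                 # prints the commands only
# run_all(cmds, dry_run=False)  # actually runs perl
```

The dry run is the default so the generated commands can be checked first. It is also worth confirming whether each run appends to or overwrites Results.txt before using this on a whole directory.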
