===How to run Perl Scripts to extract Patent Data===
1) If you're on a Mac and you'd like to run this from your terminal, you can map the network drive to your computer (instructions can be found under Help for New Staff: scroll down to "Working with the infrastructure" and click on the link to "How to Map the network drive").

2) If you're really curious about how to install and run Perl programs, see this link (https://www.thoughtco.com/how-to-install-and-run-perl-2641103). However, Perl is already installed for you to use. Just open a command line and type perl scriptname.pl to run a Perl program. You first need to get onto the bulk drive and change to the directory that holds the Perl scripts for downloading patent-related data (see below for how to do that).

3) Now it's time to download the data so it can be parsed.
 
====For the Main Patent Data====
5) This parser will open an ODBC (or similar) connection to a database on the RDP's installation of postgres and put the data directly into that database. Once it completes, we manually move the tables to the dbase server's database (i.e., patent).
 
===For the USPTO Assignment Data===
 
For USPTO Assignment Data, there is a script under McNair/usptoAssignment called USPTO_Assignee_Download. You pass it a text file (a file ending in .txt) containing the URL(s) of the assignment data to download, and the script downloads every zip file available at each URL. An example, BaseUrls.txt, can be found in McNair/usptoAssignment; it contains the URL you will probably use, unless you are downloading data from the current year, which is at a different link. The script places the downloaded zip files in "E:/McNair/usptoAssigneeData/name", where "name" is the name of the file. To check which files have already been processed, look in "McNair/usptoAssigneeData/Finished" for the finished zip files. (In the future, the script should be updated, if possible, to let the user specify which years to download, since all assignment data not from the current year is under one URL and we've already downloaded most of it.)
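
The bookkeeping described above can be sketched roughly as follows. This is a Python illustration only, not the USPTO_Assignee_Download Perl script itself. The function names, the example.org URL, and the zip file names are all hypothetical, and the assumption that the download page lists its zip files as ordinary HTML links is ours. The sketch reads a BaseUrls.txt-style list, finds the zip links on a page, maps each one to its destination under E:/McNair/usptoAssigneeData, and skips names already present in the Finished folder. The network download step is left out.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urljoin, urlparse


class ZipLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at a .zip file."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().endswith(".zip"):
                self.links.append(value)


def read_base_urls(text):
    """Return the non-empty lines of a BaseUrls.txt-style file."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def zip_urls(base_url, html):
    """Return absolute URLs for all zip links found in the page HTML."""
    parser = ZipLinkParser()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.links]


def file_name(zip_url):
    """Return the last path component of a URL, e.g. 'a1.zip'."""
    return PurePosixPath(urlparse(zip_url).path).name


def destination(zip_url, root="E:/McNair/usptoAssigneeData"):
    """Return where the zip would be placed: root plus the file's name."""
    return f"{root}/{file_name(zip_url)}"


def pending(urls, finished_names):
    """Drop URLs whose file name already appears in the Finished folder."""
    return [u for u in urls if file_name(u) not in finished_names]


# Hypothetical page content and Finished listing, for illustration only.
page = '<a href="a1.zip">a1</a> <a href="readme.html">notes</a> <a href="a2.zip">a2</a>'
for base in read_base_urls("https://example.org/assignments/\n"):
    for url in pending(zip_urls(base, page), {"a1.zip"}):
        print(url, "->", destination(url))
```

Filtering against the Finished folder before downloading is one way to avoid fetching files that have already been processed; whether the actual script does this is not stated here.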
== Specifications of USPTO Data To Extract ==
