Changes

114 bytes added ,  18:45, 13 May 2016
Splitter.pl splits those concatenated XML files into individual XML files and writes them to:
\\father\bulk\PatentData\Processed
Note: The ByYear (2010-2016) folders are for convenience; the XML files inside them are post-processed to deal with genome sequences.
xmlparser_4.5_4.4_4.3.pl is the script that processes the XML files, given the path to the directory where they are stored. The script is located at:
\\father\bulk\PatentData\Processed
It should be run as:
xmlparser_4.5_4.4_4.3.pl '\\father\bulk\PatentData\Processed\2010'
This will process all the XML files in the 2010 directory and store them in the database. The database connection string is currently hard-coded inside the script. The database name is patentDB, and it is located in the PostgreSQL installation on the RDP server.
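Since the parser takes one directory per run, covering all of the year folders means repeating the invocation once per year. The following is a minimal Python sketch that only builds those command lines; it does not run them. It assumes that the 2011-2016 folders sit directly under the Processed directory in the same way the 2010 folder does, and that the script is launched through a perl executable on the PATH. Neither assumption is confirmed above, and the function name build_parser_commands is hypothetical.

```python
from pathlib import PureWindowsPath

# Both values are taken from the text above.
PROCESSED_DIR = PureWindowsPath(r"\\father\bulk\PatentData\Processed")
PARSER_SCRIPT = PROCESSED_DIR / "xmlparser_4.5_4.4_4.3.pl"


def build_parser_commands(first_year=2010, last_year=2016):
    """Return one argument list per year folder, in ascending year order.

    Assumption: every year folder sits directly under PROCESSED_DIR,
    as the 2010 folder does in the documented invocation.
    Assumption: the script is started via a "perl" executable on the PATH.
    """
    commands = []
    for year in range(first_year, last_year + 1):
        year_dir = PROCESSED_DIR / str(year)
        commands.append(["perl", str(PARSER_SCRIPT), str(year_dir)])
    return commands


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for command in build_parser_commands():
        print(" ".join(command))
```

Each argument list can be passed to subprocess.run once the paths have been checked. Running the years one at a time keeps a failure in one folder from hiding behind the others.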
==Fields of Interest==