This project is a continuation of [[Redesigning Patent Database]]. It aims to provide faster, more centralized code for handling data from the United States Patent and Trademark Office (USPTO). With an end-to-end pipeline, we can reproduce or update the data without worrying about unintended side effects or missing data.
 
== Quickstart ==
 
To get up and running with the code, do the following:
 
# Clone the git project (link at end of page) to your user directory
# Launch IntelliJ with Java 8 or later and Maven configured (the default installed version on the RDP works)
# Open project in IntelliJ
# Create an empty database (see [[#Database]])
# Run the table creation scripts in <code>src/db/schemas/</code> in your new database
# Run the Driver scripts in IntelliJ with the correct value for <code>DATA_DIRECTORY</code>
# Take a really, really long lunch (loading the data should take no more than five hours in total on the RDP)
# Run scripts in <code>src/db/constraints</code> to check data assumptions
# That's it!
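
Before starting the long data load, it can help to confirm that the directories the steps above rely on actually exist. The following is a minimal sketch and not part of the project: the class name <code>QuickstartPreflight</code>, its helper methods, the command-line argument, and the <code>.sql</code> suffix are all assumptions made for illustration. Only the <code>src/db/schemas</code> and <code>src/db/constraints</code> paths come from the steps above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the project: checks paths before a long load.
class QuickstartPreflight {

    // Returns true when the path exists, is a directory, and is readable.
    static boolean isUsableDirectory(Path dir) {
        return Files.isDirectory(dir) && Files.isReadable(dir);
    }

    // Lists regular files directly inside dir whose names end with suffix,
    // sorted by path so they can be reviewed in a stable order.
    static List<Path> listScripts(Path dir, String suffix) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(suffix))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: QuickstartPreflight <data directory>");
            return;
        }

        // Assumption: the same value you plan to use for DATA_DIRECTORY.
        Path dataDirectory = Paths.get(args[0]);
        System.out.println("Data directory usable: " + isUsableDirectory(dataDirectory));

        // Paths taken from the quickstart; run this from the project root.
        // Assumption: the scripts carry a .sql suffix.
        for (String name : new String[] {"src/db/schemas", "src/db/constraints"}) {
            Path dir = Paths.get(name);
            if (!isUsableDirectory(dir)) {
                System.out.println(name + ": missing or unreadable");
                continue;
            }
            for (Path script : listScripts(dir, ".sql")) {
                System.out.println(name + ": " + script.getFileName());
            }
        }
    }
}
```

Run from the project root, this prints whether the data directory is usable and which script files were found, so a wrong path shows up in seconds rather than partway through the load.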
== Directory Layout ==