# <del>Create tooling for minions</del> ''skipped''
# <del>Investigate parallel speedup (e.g. multithread, mmap)</del> ''done''
# <del>Remove duplicate code through the addition of more abstract classes</del> ''done''
# <del>first 5 zipcode; centroid?</del> ''hackily done''
# <del>patent id</del> ''doneish''
# <del>Create XPath queries for reissue, design patents (only utility right now)</del> ''split off'' (see [[http://mcnair.bakerinstitute.org/wiki/Equivalent_XPath_and_APS_Queries]])
# <del>Create semantic parser for APS files</del> ''see above''
# Data Cleanup (reference [[Patent_Assignment_Data_Restructure|Marcela and Sonia's work]])
# Data Source Merger (''currently only USPTO granted, maintfee, and assignment data''; USPTO applications, Harvard Dataverse, and Lex Machina are not included currently)
# Add constraints to database tables, e.g. correct types, foreign keys, not null, lookup tables
# Add deduplication
# Remove duplicate code through the addition of more abstract classes
# first 5 zipcode; centroid
# patent id
== Directory Layout ==