Changes

Reproducible Patent Data

Revision as of 13:44, 20 June 2017

641 bytes added, 13:44, 20 June 2017

#* Aim to create a completely naive schema with as few constraints as possible, then add more constraints iteratively in the future

Since writing raw SQL is cumbersome and error-prone, I have added some abstraction layers that make it much easier to add bulk data quickly. Using the <code>CopyManager</code> class from the Postgres JDBC driver, we buffer as many rows as possible in memory and then flush them to the database with a SQL <code>COPY</code>. To understand how the abstraction layers work, see the code in <code>E:\McNair\Projects\SimplerPatentData\src\main\java\org\bakerinstitute\mcnair\postgres</code>. For a simple, self-contained example, see <code>E:\McNair\Projects\SimplerPatentData\src\main\java\org\bakerinstitute\mcnair\uspto_assignments\GeonamesZips.java</code>; for an example of how to extend the abstraction layer to deal with more complex scenarios, see <code>E:\McNair\Projects\SimplerPatentData\src\main\java\org\bakerinstitute\mcnair\models\GrantedPatent.java</code>.
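The buffer-then-flush idea described above can be sketched roughly as follows. This is not the project's actual abstraction layer: the class name <code>CopyBuffer</code>, its methods, and the row threshold are hypothetical and only illustrate the pattern. The sketch uses the standard library alone, so the database call is represented by a generic sink; with a real connection, the sink would presumably pass each batch to <code>CopyManager.copyIn(String sql, Reader from)</code> from the Postgres JDBC driver.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not the project's actual class): buffers rows in
 * PostgreSQL COPY text format and hands them to a sink in batches.
 */
class CopyBuffer {
    private final StringBuilder buffer = new StringBuilder();
    private final int maxRows;            // assumed flush threshold
    private final Consumer<String> sink;  // receives one batch of COPY text
    private int rows = 0;

    CopyBuffer(int maxRows, Consumer<String> sink) {
        if (maxRows < 1) {
            throw new IllegalArgumentException("maxRows must be >= 1");
        }
        this.maxRows = maxRows;
        this.sink = sink;
    }

    /** Escapes one field for COPY text format; null becomes \N. */
    static String escape(String field) {
        if (field == null) {
            return "\\N";
        }
        // Backslashes must be escaped first so later escapes are not doubled.
        return field.replace("\\", "\\\\")
                    .replace("\t", "\\t")
                    .replace("\n", "\\n")
                    .replace("\r", "\\r");
    }

    /** Appends one tab-delimited row and flushes once the threshold is hit. */
    void addRow(String... fields) {
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                buffer.append('\t');
            }
            buffer.append(escape(fields[i]));
        }
        buffer.append('\n');
        rows++;
        if (rows >= maxRows) {
            flush();
        }
    }

    /** Sends any buffered rows to the sink and clears the buffer. */
    void flush() {
        if (rows == 0) {
            return;
        }
        // With a live connection the sink could be, as an assumption:
        //   batch -> copyManager.copyIn("COPY some_table FROM STDIN",
        //                               new java.io.StringReader(batch))
        sink.accept(buffer.toString());
        buffer.setLength(0);
        rows = 0;
    }

    public static void main(String[] args) {
        List<String> batches = new ArrayList<>();
        CopyBuffer copy = new CopyBuffer(2, batches::add);
        copy.addRow("77002", "Houston");
        copy.addRow("77005", null);
        copy.addRow("10001", "New York");
        copy.flush(); // push the final partial batch
        System.out.println("batches sent: " + batches.size());
    }
}
```

Keeping the sink separate from the buffering logic means the escaping and batching can be tested without a database, and the final <code>flush()</code> call matters because the last batch is usually smaller than the threshold.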

== Address Data ==

OliverC

