Changes

Reproducible Patent Data (view source)

Revision as of 12:19, 9 June 2017


}}

This project is a continuation of [[Redesigning Patent Database]]. It aims to provide faster, more centralized code for handling data from the United States Patent and Trademark Office (USPTO). With an end-to-end pipeline, the data can be reproduced or updated without worrying about unintentional side effects or missing data. Currently, the pipeline supports bulk downloading from the USPTO; streaming file splitting, that is, splitting large concatenated files into their component parts in memory; and parsing XML into Java objects, APS into Java Maps, and maintenance fee data into Java objects.
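As an illustration of the streaming file splitting step, the following is a minimal sketch and not the project's actual code. It assumes that each component document in a concatenated file starts with an XML declaration (<code>&lt;?xml</code>) at the beginning of a line. The class name <code>XmlStreamSplitter</code> and the method <code>split</code> are hypothetical names chosen for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: splits a concatenated stream into its component documents.
public class XmlStreamSplitter {
    // Assumption: every component document begins with an XML declaration
    // at the start of a line.
    private static final String DOC_START = "<?xml";

    // Reads the source line by line and hands each complete document to the
    // handler, so only one document is buffered in memory at a time.
    public static void split(Reader source, Consumer<String> handler) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        StringBuilder current = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            // A new declaration marks the end of the previous document.
            if (line.startsWith(DOC_START) && current.length() > 0) {
                handler.accept(current.toString());
                current.setLength(0);
            }
            current.append(line).append('\n');
        }
        // Emit the final document, which has no following declaration.
        if (current.length() > 0) {
            handler.accept(current.toString());
        }
    }

    public static void main(String[] args) throws IOException {
        // Two tiny made-up documents concatenated into one input.
        String concatenated =
            "<?xml version=\"1.0\"?>\n<doc id=\"1\"/>\n"
          + "<?xml version=\"1.0\"?>\n<doc id=\"2\"/>\n";
        List<String> docs = new ArrayList<>();
        split(new StringReader(concatenated), docs::add);
        System.out.println(docs.size() + " documents found");
    }
}
```

Passing each document to a callback rather than collecting them all in a list keeps memory use bounded by the size of the largest single document, which is the point of splitting in a streaming fashion. In a real pipeline the handler would presumably forward each document to the XML parser.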

== Progress ==

OliverC


