A lot of work has gone into storing patent information in databases, including methods for cleaning the data and decisions about which data to include. Some of it is obsolete and some of it is incorrect. The newer pages are generally the most relevant, but it can be helpful to see what was done in the past, especially since some of the methodology (such as data cleaning) has not changed much.
==Joe's Work==
Work (likely finished): Identified XPaths within the XML examples for utility, reissue, plant, and design patents, for versions 4.0-4.5, using the files in E:\McNair\Projects\SimplerPatentData\data\examples\granted. Only the granted folder was processed. Initially, some XPaths were also saved as a text file in E:\McNair\Projects\SimplerPatentData\data\examples\Patent Schema Reconciliation. XPaths were identified for the following nodes, taken from http://mcnair.bakerinstitute.org/wiki/Equivalent_XPath_and_APS_Queries, for all types and versions: