Changes

423 bytes added, 13:41, 21 September 2020

no edit summary

{{Project|Has project output=Data|Has sponsor=McNair Center

|Has title=USPTO Patent Assignment Dataset

|Has owner=Ed Egan,

LoadUSPTOPAD.sql

To get the data into UTF-8 or ASCII, move it to the dbase server, then:

*Check its encoding using:

file -i Car.java

*Convert it to UTF-8 using the following (the TRANSLIT option approximates characters that can't be directly encoded):

iconv -f oldformat -t UTF-8//TRANSLIT file -o outfile

*The -s and -c options make iconv suppress warnings, drop characters it can't convert, and move on:

iconv -sc -f oldformat -t UTF-8//TRANSLIT file -o outfile

*Bash scripts to do all of the csvs are in Z:\USPTO_assigneesdata; make them executable and then run whichever you need

chmod +x encoding.sh

./encoding.sh

*Note that the final source encoding was Win1252 and the final target encoding was ASCII

*All bar three of the files had to be manually fixed to remove errors. Final files are in E:\McNair\Projects\USPTO Patent Assignment Dataset
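The contents of encoding.sh are not shown on this page, so the following is only a minimal sketch of what a batch conversion along these lines might look like. The function name convert_csvs and the directory arguments are hypothetical, the defaults assume the Win1252 source and ASCII target noted above, and output is written with a shell redirect rather than -o.

```shell
#!/usr/bin/env bash
# Hypothetical batch converter; NOT the project's actual encoding.sh.
# Usage: convert_csvs SRC_DIR OUT_DIR [FROM_ENCODING] [TO_ENCODING]
convert_csvs() {
    local src_dir="$1" out_dir="$2"
    # Defaults are assumptions based on the note above (Win1252 in, ASCII out).
    local from="${3:-WINDOWS-1252}" to="${4:-ASCII//TRANSLIT}"
    local f
    mkdir -p "$out_dir"
    for f in "$src_dir"/*.csv; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$f" ] || continue
        # -s suppresses warnings; -c drops characters that cannot be converted.
        if ! iconv -sc -f "$from" -t "$to" "$f" > "$out_dir/$(basename "$f")"; then
            # A non-zero exit suggests the file may need a manual fix.
            echo "needs manual review: $f" >&2
        fi
    done
}
```

For example, `convert_csvs ./raw ./converted` would write one converted copy per csv into ./converted and report on stderr any file for which iconv exited with an error. Files flagged this way are candidates for the kind of manual fix described above.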

Ed


