==General==
<onlyinclude>The [[Federal Grant Data]] project collects and processes [[NIHData]], [[NSFData]], and other federal grant data from structured government sources and imports it into a relational database. See also: the [[Trial Data Project]] and the [[FDA Trials Data]] project.</onlyinclude>

*Retrieve 2017, 2018, and 2019 zips to E:\projects\grants\NSF
*Extract them to E:\projects\grants\NSF\XML
*Remove 3 bad XML files that cause the parser to crash
*Move the whole thing back to E:\mcnair\Projects\Federal Grant Data\NSF\NSF Extracted Data
*Fix up Jeemin_NSF_XML_Parser.py so that it also takes ProgramElement (see below), then run it
*Add new data to old, removing dups and creating three files:
**general.txt
**institutions.txt
**investigators.txt
*Run LoadNSF.sql to produce tables NSFGeneral, NSFInvestigator and NSFInstitution in dbase '''grants'''
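
The ProgramElement step in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in Jeemin_NSF_XML_Parser.py: the function name <code>program_elements</code>, the sample XML, and its placeholder text values are made up for the example, and the assumption that each award file nests <code>ProgramElement</code> under a root element is inferred from the schema fragment on this page. Codes are kept as strings because several of the codes listed on this page contain letters. Returning <code>None</code> for malformed XML is one way to skip bad files instead of letting the parser crash.

```python
import xml.etree.ElementTree as ET

# STTR codes copied from the list on this page.
STTR_CODES = {"5370", "168E", "1591", "1505", "Z408"}

# Hypothetical sample for illustration only; real award files hold many
# more fields, and the Text values here are placeholders.
SAMPLE_AWARD = """<rootTag>
  <Award>
    <ProgramElement>
      <Code>1505</Code>
      <Text>EXAMPLE TEXT A</Text>
    </ProgramElement>
    <ProgramElement>
      <Code>5371</Code>
      <Text>EXAMPLE TEXT B</Text>
    </ProgramElement>
  </Award>
</rootTag>"""


def program_elements(xml_text):
    """Return a list of (code, text) pairs, or None if the XML is malformed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Malformed file: report it to the caller instead of crashing.
        return None
    pairs = []
    # ProgramElement may repeat (maxOccurs="unbounded"), so collect all of them.
    for element in root.iter("ProgramElement"):
        code = (element.findtext("Code") or "").strip()
        text = (element.findtext("Text") or "").strip()
        pairs.append((code, text))
    return pairs


if __name__ == "__main__":
    for code, text in program_elements(SAMPLE_AWARD):
        label = "STTR" if code in STTR_CODES else "other"
        print(code, text, label)
```

Because an award can carry several ProgramElement entries, the loader would need to decide whether to store one row per code or to pick a single code per award. The sketch leaves that choice open.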

Note that '''program_code'''s are available from:
*STTR: https://www.nsf.gov/awardsearch/lookup?type=program&qrytxt=*STTR*
*SBIR: https://www.nsf.gov/awardsearch/lookup?type=program&qrytxt=*SBIR*

Combined SBIR/STTR codes:
 '168E','5370','169E','166E','165E','164E','5371','163E','167E','5151','5727','4804','Y052','1532','Y813','2282','6537','Y350','4645','9311','2266','5370','168E','1591','1505','Z408'

SBIR codes:
 '168E','5370','169E','166E','165E','164E','5371','163E','167E','5151','5727','4804','Y052','1532','Y813','2282','6537','Y350','4645','9311','2266'

STTR codes:
 '5370','168E','1591','1505','Z408'

Unfortunately, we need the Program Element code, shown below, but our extractor doesn't currently pull it.
 <xsd:element maxOccurs="unbounded" name="ProgramElement">
   <xsd:complexType>
     <xsd:sequence>
       <xsd:element name="Code" type="xsd:int"/>
       <xsd:element name="Text" type="xsd:string"/>
     </xsd:sequence>
   </xsd:complexType>
 </xsd:element>

Correction: This appears fixed. There is a programelementcode in nsfgeneral. However, the code is XXXX in 42.6% of cases.

===NIH===
The 2018 update process was: