Changes

Jump to navigation Jump to search
3,821 bytes added ,  17:58, 10 April 2022
{{Project|Has project output=Data|Has sponsor=McNair ProjectsCenter
|Has title=Federal Grant Data
|Has owner=Catherine Kirby, Jeemin Sim, Avesh Krishna,
|Has keywords=Data
|Has project status=Active
|Does subsume=NSF Data, NIH Data,
}}
==General==<onlyinclude>The [[Federal Grant Data]] project collects and processes [[NIH Data]], [[NSF Data]], and other federal grant information from structured government sources and imports it into a relational database for use. See also: The [[Trial Data Project]] and the [[FDA Trials Data]] project.</onlyinclude>
Finding ==Latest Update== In April 2019, Ed updated the NSF and NIHdata. ===NSF=== The update process was:*Visit https://www.nsf.gov/awardsearch/download.jsp*Retrieve 2017, 2018, and 2019 zips to E:\projects\grants\NSF*Extract them to E:\projects\grants\NSF\XML*Remove 3 bad XML files that cause the parser to crash*Move the whole thing back to E:\mcnair\Projects\Federal Grant Data\NSF\NSF Extracted Data*Fix up and run Jeemin_NSF_XML_Parser.py, so that it also takes ProgramElement (see below) and make:***Run LoadNSF.sql to produce tables NSFGeneral, NSFInvestigator and NSFInstitution in dbase '''grants''' Note that '''program_code'''s are available from *STTR: https://www.nsf.gov/awardsearch/lookup?type=program&qrytxt=*STTR**SBIR: https://www.nsf.gov/awardsearch/lookup?type=program&qrytxt=*SBIR*  Combined SBIR/STTR codes '168E','5370','169E','166E','165E','164E','5371','163E','167E','5151','5727','4804','Y052','1532','Y813','2282','6537','Y350','4645','9311','2266','5370','168E','1591','1505','Z408'  SBIR Codes: '168E','5370','169E','166E','165E','164E','5371','163E','167E','5151','5727','4804','Y052','1532','Y813','2282','6537','Y350','4645','9311','2266' STTR Codes: '5370','168E','1591','1505','Z408' Unfortunately, we need the Program Element code, shown below, but our extractor doesn't currently pull it.  <xsd:element maxOccurs="unbounded" name="ProgramElement"> <xsd:complexType> <xsd:sequence> <xsd:element name="Code" type="xsd:int"/> <xsd:element name="Text" type="xsd:string"/> </xsd:sequence> </xsd:complexType> </xsd:element> Correction: This appears fixed. There is a programelementcode in nsfgeneral. However, the code is XXXX in 42.6% of cases. ===NIH=== The 2018 update process was:*Visit https://exporter.nih.gov/ExPORTER_Catalog.aspx*Retrieve 2016, 2017, and other federal grant 2018 csvs to E:\projects\grants\NIH\Source Files*Combine them into RePORTER_PRJ_C_FY2016-17-18.csv, open in excel and save as tdt*Add old data files to same directory*Run LoadNIH.sql to product table nih in dbase '''grants''' Supporting info for NIH grants:*A datadictionary for the NIH grants is here: https://exporter.nih.gov/about.aspx*'''Activity''' Codes are here: https://grants.nih.gov/grants/funding/ac_search_results.htm*STTR codes are: R41,R42,UT1,UT2*SBIR codes are: R43,R44,U43,U44*SBIR/STTR Joint codes: RC3,SB1,UB1,UC3*'''Total_Cost''' is only available in certain instances but is the best cost variable. The 2022 update process was:*Visit https://exporter.nih.gov/ExPORTER_Catalog.aspx*Retrieve 2019, 2020, and 2021 csvs to E:\projects\grants\NIH\Source Files*Open each in excel, copy and paste to a tab-delimited text file.*Remove quotes from the file using regex (in textpad) and save as UTF8*Run clean.py to fix the bad records*Run LoadNIH.sql to load to the grants database (put input files in z:/bulk/grants) Example processing code (for selected stats for Vermont): CREATE TABLE nih_vt_stats AS SELECT extract(year from award_notice_date) as year, direct_cost_amt+indirect_cost_amt as sum_amounts, total_cost, full_project_num, CASE WHEN direct_cost_amt+indirect_cost_amt>total_cost THEN direct_cost_amt+indirect_cost_amt ELSE total_cost END AS amount, CASE WHEN org_name LIKE 'UNIVERSITY OF VERMONT%' THEN 1 ELSE 0 END AS uvm FROM nih WHERE org_state='VT'; --Updated Rows 9493 SELECT year, uvm, sum(amount), count(DISTINCT full_project_num) FROM nih_vt_stats GROUP BY year, uvm ORDER BY uvm,year;
==grants.gov==

Navigation menu