You do not have permission to edit this page, for the following reason:
The action you have requested is limited to users in one of the groups: Users, team.
<onlyinclude>The [[Federal Grant Data]] project collects and processes [[NIH Data]], [[NSF Data]], and other federal grant information from structured government sources and imports it into a relational database for use. See also: The [[Trial Data Project]] and the [[FDA Trials Data]] project.</onlyinclude>
In April 2019, Ed updated the NSF and NIH data.
The update process was:
*Retrieve 2017, 2018, and 2019 zips to E:\projects\grants\NSF
*Extract them to E:\projects\grants\NSF\XML
*Remove 3 bad XML files that cause the parser to crash
*Move the whole thing back to E:\mcnair\Projects\Federal Grant Data\NSF\NSF Extracted Data
*Fix up and run Jeemin_NSF_XML_Parser.py, so that it also takes ProgramElement (see below) and make:
*Run LoadNSF.sql to produce tables NSFGeneral, NSFInvestigator and NSFInstitution in dbase '''grants'''
Note that '''program_code'''s are available from
Combined SBIR/STTR codes
Unfortunately, we need the Program Element code, shown below, but our extractor doesn't currently pull it.
<xsd:element maxOccurs="unbounded" name="ProgramElement">
<xsd:element name="Code" type="xsd:int"/>
<xsd:element name="Text" type="xsd:string"/>
Correction: This appears fixed. There is a programelementcode in nsfgeneral. However, the code is XXXX in 42.6% of cases.
The 2018 update process was:
*Retrieve 2016, 2017, and 2018 csvs to E:\projects\grants\NIH\Source Files
*Combine them into RePORTER_PRJ_C_FY2016-17-18.csv, open in excel and save as tdt
*Add old data files to same directory
*Run LoadNIH.sql to product table nih in dbase '''grants'''
Supporting info for NIH grants:
*A data dictionary for the NIH grants is here: https://exporter.nih.gov/about.aspx
*'''Activity''' Codes are here: https://grants.nih.gov/grants/funding/ac_search_results.htm
*STTR codes are: R41,R42,UT1,UT2
*SBIR codes are: R43,R44,U43,U44
*SBIR/STTR Joint codes: RC3,SB1,UB1,UC3
*'''Total_Cost''' is only available in certain instances but is the best cost variable.
The 2022 update process was:
*Retrieve 2019, 2020, and 2021 csvs to E:\projects\grants\NIH\Source Files
*Open each in excel, copy and paste to a tab-delimited text file.
*Remove quotes from the file using regex (in textpad) and save as UTF8
*Run clean.py to fix the bad records
*Run LoadNIH.sql to load to the grants database (put input files in z:/bulk/grants)
Example processing code (for selected stats for Vermont):
CREATE TABLE nih_vt_stats AS
SELECT extract(year from award_notice_date) as year, direct_cost_amt+indirect_cost_amt as sum_amounts, total_cost, full_project_num,
CASE WHEN direct_cost_amt+indirect_cost_amt>total_cost THEN direct_cost_amt+indirect_cost_amt ELSE total_cost END AS amount,
CASE WHEN org_name LIKE 'UNIVERSITY OF VERMONT%' THEN 1 ELSE 0 END AS uvm
FROM nih WHERE org_state='VT';
--Updated Rows 9493
SELECT year, uvm, sum(amount), count(DISTINCT full_project_num)
FROM nih_vt_stats GROUP BY year, uvm ORDER BY uvm,year;
Lists agency, estimated funding, opportunity title. opportunity number, and the date. Can be used to look up who was awarded the grants. I downloaded all grants offered by the NIH since 2005 (when the data was available)
E:\McNair\Projects\Federal Grant Data\NIH
==NSF Grant Data==
The NSF grant data are in folders by year, each grant is an xml file. The format of a typical file is shown below. There is data available from 1959 onwards, and full records after 1976. In 2016 there were ~10,000 awards.
Each year can be downloaded from: [https://www.nsf.gov/awardsearch/download.jsp https://www.nsf.gov/awardsearch/download.jsp]
Data is located in:
E:\McNair\Projects\Federal Grant Data\NSF
Sample XML schema
Collaborative Research: Hybrid Organic-Inorganic Thermoelectric Materials
<LongName>Directorate For Engineering</LongName>
<LongName>Div Of Civil, Mechanical, & Manufact Inn</LongName>
<SignBlockName>Thomas F. Kuech</SignBlockName>
Thermoelectric materials are materials which can be used to convert thermal energy directly to electricity. The performance of a thermoelectric material is measured by the "figure of merit", termed ZT. There has been much research into increasing thermoelectric materials, figure of merit, however, progress in this area has been slow and most of the researched thermoelectric materials up to now are suffering from either high fabrication cost, usage of rare earth or toxic elements, or poor mechanical properties. Organic thermoelectric materials (OTEs) have recently attracted attention for low temperature applications (< 300K), especially cooling purposes, as they are flexible, low-cost and abundant, and low-cost fabrication methods for synthesizing them exist... A large class of semiconducting nanostructures (e.g. Si, CdTe, Bi, and PbTe nanowires and holely structures) combined with conjugated polymers (e.g., chemically-modified PEDOT and low bandgap polymers) and organic molecules (specifically charged chemical species attached to molecules such as CF3- substituted styrene molecules) will be simulated, synthesized and optimized to identify new hybrid materials with a potentially high ZT.
<Name>University of Virginia Main Campus</Name>
<StreetAddress>P.O. BOX 400195</StreetAddress>
<Text>Materials Eng. & Processing</Text>
==Survey of Federal Funds for Research and Development Fiscal Years 2014–16==
Excel spreadsheets with data regarding Federal Agency budgets and spending on R&D [https://ncsesdata.nsf.gov/fedfunds/2014/]
E:\McNair\Projects\Federal Grant Data\NSF\NSF Data Tables
NIH grants for a given year can be downloaded from [https://report.nih.gov/award/index.cfm https://report.nih.gov/award/index.cfm]
This does not specify what the grant was for, but it does list the amount and organization who received the grant. Additionally, the city and state of the organization are included. The number of awards received by each organization in a given year is listed as well. Excel Tables can be downloaded for each each year at this site. I downloaded the table for 2016 as a sample.
CK: It is possible to get data. https://exporter.nih.gov/ExPORTER_Catalog.aspx
E:\McNair\Projects\Federal Grant Data\NIH
DOD grant data folder (compiled for Augusta Ecosystem project) was copied from E:\McNair\Projects\Innovation Districts\Augusta Startup Ecosystem to E:\McNair\Projects\Federal Grant Data. It contains yearly data that can be filtered, and a few files on Augusta-related grants.
This is a minor edit
Watch this page