Edit Project: Collecting SBIR Data



==Manual Collection==
Files are in: E:\McNair\Projects\SBIR

Each file is a group of 1000 companies. Each group of 1000 is numbered sequentially.

==Rough notes==
*Get the data from https://www.sbir.gov/sbirsearch/award/all
*Built a Selenium Web Driver, which is stored in E:\McNair\Software\Scripts\Selenium Web Drivers
*It does not work, because there is a captcha that must be entered after selecting xls download

==Notes on building a Selenium Web Driver:==
In your Python script:
*Make sure that you properly set the chromedriver path if you don't have it under root. For example: webdriver.Chrome("/Users/adriansmart/PycharmProjects/SeleniumTest/chromedriver")
*Use driver.find_element_by_xpath to select the element on the HTML page. You will need to enter the xpath in this function, so first load the website in a browser. (This method was removed in Selenium 4.3; on newer versions the equivalent call is driver.find_element(By.XPATH, ...).)
*Next, right click on the page element whose xpath you want and select inspect. This will launch the HTML inspector and highlight the relevant lines of code.
*Right click on what looks like the right piece of code and select "Copy xpath data".
*Paste the copied xpath into your Python script where it asks for a path. For example: driver.find_element_by_xpath("//*[@id='solr-print-dropdown-button']")

= SBIR Concatenation =
==Objective==
The objective of this project was to concatenate 162 xlsx files into one large tab-delimited text file. <br>

==Script==
The Python script can be found here: E:\McNair\Projects\SBIR\concat_excel.py

The resulting file is located here: E:\McNair\Projects\SBIR\SBIR.txt
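
For readers without access to that drive, here is a rough sketch of how such a concatenation could be done. It is not the contents of concat_excel.py, which is not reproduced on this page. It assumes that every xlsx file holds its data on the first worksheet, that the first row of each file is the same header, and that the openpyxl package is installed. The function names, the glob pattern and the output name are all made up for illustration.

```python
import glob
from typing import Iterable, Iterator, List, Sequence, TextIO

# Tabs and line breaks inside a cell would corrupt a tab-delimited file,
# so they are replaced with plain spaces.
_CELL_CLEANUP = str.maketrans({"\t": " ", "\r": " ", "\n": " "})


def clean(value: object) -> str:
    """Turn one cell value into text that is safe for a tab-delimited line."""
    if value is None:
        return ""
    return str(value).translate(_CELL_CLEANUP)


def read_xlsx_rows(path: str) -> Iterator[List[object]]:
    """Yield the rows of the first worksheet of an xlsx file (assumed layout)."""
    # Assumption: openpyxl is available. It is imported here so that the
    # rest of the sketch can be used without it.
    from openpyxl import load_workbook

    workbook = load_workbook(path, read_only=True, data_only=True)
    try:
        for row in workbook.worksheets[0].iter_rows(values_only=True):
            yield list(row)
    finally:
        workbook.close()


def concat_tables(tables: Iterable[Iterable[Sequence[object]]], out: TextIO) -> int:
    """Write all tables to `out` as tab-delimited text, keeping one header.

    Each table is an iterable of rows whose first row is the header.
    Returns the number of data rows written.
    """
    header_written = False
    data_rows = 0
    for table in tables:
        rows = iter(table)
        header = next(rows, None)
        if header is None:
            continue  # empty file, nothing to copy
        if not header_written:
            out.write("\t".join(clean(cell) for cell in header) + "\n")
            header_written = True
        for row in rows:
            out.write("\t".join(clean(cell) for cell in row) + "\n")
            data_rows += 1
    return data_rows


def main(pattern: str, output_path: str) -> int:
    """Concatenate every file matching `pattern` (hypothetical arguments)."""
    # sorted() orders names as text, so "10.xlsx" comes before "2.xlsx".
    paths = sorted(glob.glob(pattern))
    with open(output_path, "w", encoding="utf-8", newline="") as out:
        return concat_tables((read_xlsx_rows(p) for p in paths), out)
```

Calling main with a pattern that matches the downloaded files and an output path would write one combined file and return the number of data rows. The header is taken from the first non-empty file only, so a file with different columns would be written misaligned. Check the column layout first if the downloads were made at different times.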
