Difference between revisions of "Collecting SBIR Data"
Revision as of 17:04, 6 June 2017
| Collecting SBIR Data | |
|---|---|
| Project Information | |
| Project Title | Collecting SBIR Data |
| Owner | Adrian Smart |
| Start Date | June 6, 2017 |
| Deadline | |
| Keywords | Data, Tool |
| Primary Billing | |
| Notes | |
| Has project status | Active |
Rough notes
- Get the data from https://www.sbir.gov/sbirsearch/award/all
- Built a Selenium WebDriver script, which is stored in
- The script does not work because a captcha must be entered after selecting the XLS download
Notes on building a Selenium WebDriver script:
- Set the chromedriver path explicitly if you don't have chromedriver under root. For example: webdriver.Chrome("/Users/adriansmart/PycharmProjects/SeleniumTest/chromedriver")
- Use driver.find_element_by_xpath to locate an element on the HTML page
- To get an element's XPath from the HTML, first load the website in the browser
- Right-click the page element whose XPath you want and select Inspect. This launches the HTML inspector and highlights the relevant lines of code
- Right-click the highlighted line that corresponds to the element and select Copy, then "Copy XPath"
- Paste the copied XPath into your Python script wherever a path is expected. For example: driver.find_element_by_xpath("//*[@id='solr-print-dropdown-button']")
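
The steps above can be put together in a short script. This is a minimal sketch, not the project's actual script: it assumes Selenium 4, where find_element_by_xpath was removed in favor of find_element(By.XPATH, ...) and the chromedriver path is passed through a Service object. The function names xpath_for_id and open_download_menu are hypothetical, and the element id is the one from the notes, which may no longer exist on the live page. The sketch stops before the XLS download, since that step is blocked by the captcha.

```python
def xpath_for_id(element_id):
    """Build an XPath of the form used in the notes: //*[@id='...']."""
    return "//*[@id='{}']".format(element_id)


def open_download_menu(chromedriver_path,
                       url="https://www.sbir.gov/sbirsearch/award/all"):
    """Open the award search page and click the download dropdown button.

    Returns the live driver; the caller is responsible for calling quit().
    """
    # Imported inside the function so xpath_for_id can be used
    # on machines where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By

    # Selenium 4 style: the chromedriver location goes through Service.
    driver = webdriver.Chrome(service=Service(chromedriver_path))
    driver.get(url)

    # Locate the element by the XPath copied from the inspector and click it.
    button = driver.find_element(
        By.XPATH, xpath_for_id("solr-print-dropdown-button"))
    button.click()
    return driver
```

A call would look like driver = open_download_menu("/Users/adriansmart/PycharmProjects/SeleniumTest/chromedriver"), followed by driver.quit() when finished. With Selenium 3, the original calls shown in the notes still apply.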