==Project Goal==
The goal of this project is to find good "Demo Day" candidate web pages and to submit them to Amazon Mechanical Turk (MTurk) for data collection. A good candidate is defined as a page containing a list of cohort companies associated with an accelerator. From observation, good candidates usually also contain time and location information about the demo day, and are therefore sufficient to be pushed to MTurk for data collection.
==Terms and Definitions==
===Demo Day Page===
==Code Location==
The source code and relevant files for the project can be found here:
==The Crawler Functionality==
The crawler functionality is stored in the file:
STEP1_crawl.py

The crawler was optimized for speed, overall performance, and result filtering, while remaining functional over the large data set.
BUG REPORT by Maxine Tao (FIXED): update the crawler with this line of code:
search_results = driver.find_elements_by_xpath("//div[@class='g']/div/div/div/h3/a") + driver.find_elements_by_xpath("//div[@class='g']/div/div/h3/a")
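The fix above concatenates the matches of two XPath expressions that differ only in nesting depth, so result links are collected whether the heading sits three or two levels below the result container. The following is a minimal sketch of that idea, not the crawler's actual code: it uses Python's standard-library ElementTree instead of Selenium so it runs without a browser, and the sample markup, URLs, and the helper name collect_links are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup (an assumption, not real search output):
# two results whose links sit at different nesting depths.
SAMPLE = """
<body>
  <div class="g"><div><div><div><h3><a href="https://example.com/deep">Deep</a></h3></div></div></div></div>
  <div class="g"><div><div><h3><a href="https://example.com/shallow">Shallow</a></h3></div></div></div>
</body>
"""

# The same two paths as in the bug fix, prefixed with "." because
# ElementTree evaluates paths relative to an element.
DEEP = ".//div[@class='g']/div/div/div/h3/a"
SHALLOW = ".//div[@class='g']/div/div/h3/a"


def collect_links(root):
    """Return href values matched by either path (deep matches first)."""
    matches = root.findall(DEEP) + root.findall(SHALLOW)
    return [a.get("href") for a in matches]


root = ET.fromstring(SAMPLE)
links = collect_links(root)
print(links)
```

Either path alone would miss one of the two sample results, which is why the fix combines them. Note that find_elements_by_xpath belongs to older Selenium releases; in Selenium 4 the equivalent call is driver.find_elements(By.XPATH, ...).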