==Current Work==
 
Note that TFTRTA-AcceleratorFinal.txt in E:\projects\accelerators was updated to include all creation dates and dead dates. This is not reflected below, except that the script's load SQL was updated too.
===Load the existing data===
We fixed up and ran E:\projects\accelerators\Google\DemoDayCrawler.py.
This script was based on E:\mcnair\Software\Accelerators\DemoDayCrawler.py, rather than the more recent E:\mcnair\Projects\Accelerator Demo Day\Test Run\STEP1_crawl.py.
 
The output is:
*E:\projects\accelerators\Google\Results.txt 2515
*E:\projects\accelerators\Google\Results folder containing html
Previously run Google search results are in:
*5 results per accelerator -- E:\mcnair\Software\Accelerators\demoday_crawl_full.txt 2777
*10 results per accelerator -- E:\mcnair\Projects\Accelerator Demo Day\Test Run\demoday_crawl_full_from_testrun.txt 4351
*10 results per select accelerator year -- E:\mcnair\Projects\Accelerator Demo Day\Test Run\demoday_crawl_full.txt 1230

These were all copied to Z:\accelerators and cleaned up, and loaded along with the new Results.txt into '''accelerators'''. The SQL is in E:\projects\accelerators\LoadAcceleratorTables.sql

It looks like 2340/2514 of our pages are new...

====Other info====
The old process of retrieving, classifying, and turking demo day pages is documented on the [[Demo Day Page Parser]], [[Demo Day Page Google Classifier]] and [[Amazon Mechanical Turk for Analyzing Demo Day Classifier's Results]] pages. See also the new [[Google Crawler]] page and the old [[Mechanical Turk (Tool)]] page.

Found the following list of accelerators by accident: https://www.s-b-z.com/FORMING%20THE%20BUSINESS/db/accelerators.aspx
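
A count like the 2340/2514 new-page figure in the Load the existing data section can be cross-checked outside the database by comparing URLs between the new and old result files. The sketch below is only an illustration, not the method actually used (that lives in LoadAcceleratorTables.sql). It assumes each results file is tab-delimited with the URL in a known column, which this page does not confirm. The function names, the column index, and the crude URL normalisation are all hypothetical.

```python
import csv
import io


def read_urls(handle, url_column=0):
    """Collect the distinct URLs from one tab-delimited results file.

    Assumption: the URL sits in column `url_column`. Normalisation here is
    deliberately crude (lowercase, strip a trailing slash) and may merge
    URLs whose paths differ only by case.
    """
    urls = set()
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) > url_column and row[url_column].strip():
            urls.add(row[url_column].strip().lower().rstrip("/"))
    return urls


def count_new(new_urls, old_url_sets):
    """Return (new_count, total) for URLs absent from every old set."""
    seen = set().union(*old_url_sets)
    return len(new_urls - seen), len(new_urls)


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the real files (made-up data).
    new = read_urls(io.StringIO("http://a.com/demo\nhttp://b.com/\n"))
    old = read_urls(io.StringIO("http://B.com\n"))
    fresh, total = count_new(new, [old])
    print(f"{fresh}/{total} pages are new")
```

To run it against real files, open each one with `open(path, newline="", encoding="utf-8")` and pass the handle to `read_urls`. The encoding is likewise an assumption.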
===To do===
