
==Current Work==
 
===Load the existing data===
The database is '''accelerators'''.
The timing files were processed and their data was assembled. The stack starts with Attended (12896 obs, 7044 with year and 6493 with year and quarter) and sequentially adds timing information until the last table, Attended5 (15460 obs, 10446 with year and 9871 with year and quarter), is produced. With the exception of timing2, each timing file added new cohort cos. Timing1 and timing5 had evidence URLs (total of just 248 distinct).
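
The coverage figures quoted above (total obs, obs with year, obs with year and quarter) can be recomputed for any table in the stack with a check like the one below. This is only a sketch: the field names <code>year</code> and <code>quarter</code> and the list-of-dicts layout are assumptions for illustration, not the actual schema of the Attended tables.

```python
def coverage(rows):
    """Return (total obs, obs with year, obs with year and quarter).

    `rows` is a list of dicts. The keys "year" and "quarter" are
    hypothetical names; a missing value is represented as None.
    """
    total = len(rows)
    with_year = sum(1 for r in rows if r.get("year") is not None)
    with_year_and_quarter = sum(
        1
        for r in rows
        if r.get("year") is not None and r.get("quarter") is not None
    )
    return total, with_year, with_year_and_quarter


# Made-up rows, purely to show the shape of the input.
example_rows = [
    {"company": "A", "year": 2014, "quarter": 2},
    {"company": "B", "year": 2015, "quarter": None},
    {"company": "C", "year": None, "quarter": None},
]
print(coverage(example_rows))
```

Running the same check after each timing file is added shows how much each file contributes to year and quarter coverage.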
===New Pull===
Made tables:
*TheMissing, 129 accs missing total of 4979 cohort cos
*ThePresent, 153 accs with total of 10446 cohort cos
*ThePresentByYear, 601 acc years
*TheReview, 475 acc years -> "TheReview.txt"

TheReview.txt was then processed into SearchTerms.txt in E:\projects\accelerators\Google:
 Accelerator SearchTerm Year

After some experimentation, we decided to add the following keywords to every search:
 demo day graduation pitch competition cohort

Previously run Google search results are in E:\mcnair\Software\Accelerators\demoday_crawl_full.txt

The old process of retrieving, classifying, and turking demo day pages is documented on the [[Demo Day Page Parser]], [[Demo Day Page Google Classifier]] and [[Amazon Mechanical Turk for Analyzing Demo Day Classifier's Results]] pages. See also the new [[Google Crawler]] page and the old [[Mechanical Turk (Tool)]] page.
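
The construction of search strings from SearchTerms.txt plus the extra keywords can be sketched as follows. This is a minimal illustration, not the crawler's actual code: it assumes SearchTerms.txt is tab-separated with a header row naming the three columns Accelerator, SearchTerm and Year, and that the query is simply the search term, the year and the keyword string joined by spaces. The function names are hypothetical.

```python
import csv
import io

# Keyword string appended to every search, as listed on this page.
EXTRA_KEYWORDS = "demo day graduation pitch competition cohort"


def read_search_terms(text):
    """Parse tab-separated text with a header row (assumed layout)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


def build_queries(rows, extra=EXTRA_KEYWORDS):
    """Build one query string per row: SearchTerm, Year, extra keywords."""
    queries = []
    for row in rows:
        term = row["SearchTerm"].strip()
        year = row["Year"].strip()
        queries.append(f"{term} {year} {extra}")
    return queries


# Made-up sample row; "Example Accel" is not a real entry in the data.
sample = "Accelerator\tSearchTerm\tYear\nExample Accel\tExample Accel\t2015\n"
for query in build_queries(read_search_terms(sample)):
    print(query)
```

Keeping the keyword string in one constant makes it easy to rerun the crawl with a different keyword set.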
===To do===
Still to do:
#Work out which accelerators/cohort cos we are missing timing info for
#Do a new Google Crawl
#Re-train the classifier
#Run the classifier on the Google results
