Changes

=To-do list=
# Filter out actual accelerators from the Crunchbase organizations data
#* Possibly by running accelerator_keywords.py
#* Possibly by using string searching in organizations.csv
#* Watch out for venture capital companies (the organizations file has many of these, and we'll probably pick up a lot of them in our "accelerator" filtered list)
# Match this list against the current list of accelerators
#* We have our own copy of the matcher in the accelerators E drive (try mode 1 and mode 2 for different results; mode 2 might be more helpful)
#* This will tell you whether each entry was part of the old list or not (and therefore whether we need to get data for it or not)
# Find cohort data for all of the new accelerators (the ones not previously on the list; if they're not accelerators, take them off the list)
#* We used regex for this
#* Once you find the cohort data, put it into the updated cohort data list Excel file
# Match the cohort data against the round data from SDC
#* Make sure to get both the accelerator name and the cohort company name in the first document
#* In the second document (to match against the first), put the list of all companies funded in rounds (from SDC)
#* In summary: File1 = Accelerator Cohorts and File2 = SDC data
# Upload the match file into the psql database, then follow the code in accelerators.sql
#* Write new code for your newly uploaded tables and documents; you should be able to follow what we've already done to get a similar percentVC table
#* The previous percent VC table that yours should resemble is PercentVc4
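
The first two steps above (keyword filtering, then checking candidates against the existing accelerator list) could be sketched roughly as follows. This is only an illustration under stated assumptions: it is not the contents of accelerator_keywords.py and it is not the matcher on the E drive. The column names <code>company_name</code> and <code>short_description</code>, the keyword patterns, and every function name are hypothetical, and the exact-match comparison is a simple stand-in for the matcher's modes.

```python
import csv
import re

# Hypothetical keyword patterns; tune them against the real organizations data.
ACCELERATOR_PATTERN = re.compile(r"\b(accelerator|incubator)s?\b", re.IGNORECASE)
VC_PATTERN = re.compile(r"\bventure capital\b|\bvc\b", re.IGNORECASE)


def looks_like_accelerator(text):
    """True if the text has an accelerator keyword and no venture capital keyword."""
    return bool(ACCELERATOR_PATTERN.search(text)) and not VC_PATTERN.search(text)


def filter_accelerators(rows, text_fields=("company_name", "short_description")):
    """Keep rows whose combined text fields look like an accelerator.

    The field names are assumptions; replace them with the real column headers.
    """
    kept = []
    for row in rows:
        text = " ".join(row.get(field) or "" for field in text_fields)
        if looks_like_accelerator(text):
            kept.append(row)
    return kept


def normalize(name):
    """Lowercase, strip punctuation, and drop common legal suffixes."""
    name = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    name = re.sub(r"\b(inc|llc|ltd|corp)\b", " ", name)
    return " ".join(name.split())


def split_new_and_known(candidates, known_names, name_field="company_name"):
    """Split candidates into (new, already_known) by normalized exact name match."""
    known = {normalize(name) for name in known_names}
    new, already_known = [], []
    for row in candidates:
        target = already_known if normalize(row[name_field]) in known else new
        target.append(row)
    return new, already_known


def load_rows(path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

With the real file, usage might look like <code>candidates = filter_accelerators(load_rows("organizations.csv"))</code> followed by <code>split_new_and_known(candidates, old_names)</code>, where <code>old_names</code> is the current accelerator list. Excluding every row that mentions venture capital is a blunt rule that can also drop genuine accelerators, so the excluded rows are probably worth a manual look.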
=Don't worry about this stuff=
