Changes

2,210 bytes added, 13:41, 21 September 2020
{{Project|Has project output=Data|Has sponsor=McNair Center
|Has title=Crunchbase Data
|Has owner=Adrian Smart, Grace Tan, Maxine Tao, Connor Rothschild,
|Has start date=June 2017
|Has keywords=Data, Tool, Crunchbase, VC, Angel
|Has project status=Complete
|Is dependent on=Crunchbase Accelerator Equity, Crunchbase Accelerator Founders,
}}
==New Work==
 
File Location:
E://McNair/Projects/Accelerators/Summer 2018/Accelerators and UUIDs.xlsx
 
This file is an update of Crunchbase data to correspond with our updated list of accelerators. That updated list of 166 accelerators can be found at:
E://McNair/Projects/Accelerators/Summer 2018/Connor Accelerator Work/Accelerator Master Variable List - Revised by Ed V2.xlsx
 
We used SQL to match a huge list of companies downloaded from Crunchbase to our list of accelerators.
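The matching step can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: it uses Python's built-in sqlite3 module with an in-memory database, and the table names, column names (name, uuid, company_name), and sample rows are all assumptions made for the example rather than the real crunchbase2 schema.

```python
import sqlite3


def match_accelerators(accelerator_names, organizations):
    """Match accelerator names to organization UUIDs on case-insensitive name.

    organizations is a list of (uuid, company_name) tuples (hypothetical layout).
    Accelerators without a match are returned with a uuid of None.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accelerators (name TEXT)")
    conn.execute("CREATE TABLE organizations (uuid TEXT, company_name TEXT)")
    conn.executemany(
        "INSERT INTO accelerators VALUES (?)",
        [(name,) for name in accelerator_names],
    )
    conn.executemany("INSERT INTO organizations VALUES (?, ?)", organizations)
    # LEFT JOIN keeps accelerators that have no Crunchbase match.
    rows = conn.execute(
        """
        SELECT a.name, o.uuid
        FROM accelerators AS a
        LEFT JOIN organizations AS o
          ON LOWER(TRIM(a.name)) = LOWER(TRIM(o.company_name))
        ORDER BY a.name, o.uuid
        """
    ).fetchall()
    conn.close()
    return rows


# Made-up sample data, for illustration only.
sample_orgs = [
    ("uuid-001", "Example Accelerator"),
    ("uuid-002", "Another Accelerator"),
    ("uuid-003", "Unrelated Company"),
]
matches = match_accelerators(
    ["example accelerator", "Missing Accelerator"], sample_orgs
)
print(matches)
```

The LEFT JOIN is used so that unmatched accelerators stay visible in the output and can be looked up by hand.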
===Downloading Data===
#The download script is written in Perl. It downloads from version 3.1 of the Crunchbase API. It is called downloadScriptv3.1 and is located in E:\McNair\Software\Database Scripts\Crunchbase2. You can execute it by typing "perl downloadScript.pl" in a terminal.
Use this key:
662e263576fe3e4ea5991edfbcfb9883
 
A full list of tables in the database called crunchbase2:
organizations
organization_descriptions
people
people_descriptions
degrees
funding_rounds
investments
investment_partners
investors
funds
acquisitions
ipos
events
event_appearances
jobs
category_groups
org_parents
===Commands===
E:\McNair\Software\DatabaseScripts\Crunchbase2
 
Database is called crunchbase2.
In making a table of accelerators and their UUIDs, some accelerators match multiple times if you match only on accelerator name. The file (Accelerator Multiple Matches) of those that match multiple times is in:
For the analysis script that obtains the results described above, see 'Analysis.sql' in:
E:\McNair\Software\Database Scripts\Crunchbase2
===Collecting Company Information===
Code to build tables is here:
E:\McNair\Software\Database Scripts\Crunchbase2\CompanyMatchScript
Text file versions of the tables are located in the Z drive. The database is called: crunchbase2
Crunchbase data may contain blank entries and stray quotation marks. I had to clean these up in TextPad.
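A scripted version of this kind of cleanup might look like the sketch below. It is only an illustration of the idea, not what was actually done in TextPad: the function names and sample rows are hypothetical, and it assumes that the stray quotation marks are double quotes at the start or end of a field (interior quotes are left alone).

```python
def clean_field(value):
    """Strip surrounding whitespace and stray double quotes; return None for blanks."""
    if value is None:
        return None
    cleaned = value.strip().strip('"').strip()
    return cleaned if cleaned else None


def clean_rows(rows):
    """Clean every field and drop rows in which all fields are blank."""
    cleaned_rows = []
    for row in rows:
        cleaned = [clean_field(field) for field in row]
        if any(field is not None for field in cleaned):
            cleaned_rows.append(cleaned)
    return cleaned_rows


# Made-up sample rows, for illustration only.
raw = [
    ['"Example Co"', " uuid-001 "],
    ["", '""'],
    ['Other "Co', "uuid-002"],
]
cleaned = clean_rows(raw)
print(cleaned)
```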
<br>All Crunchbase companies and their UUIDs are here:
E:\McNair\Projects\Accelerators\Summer 2018\CB Company UUIDs
A list of UUIDs for the companies we are interested in can be found in the same place as above. It is called 'Our Company UUID matches'. SQL code to get matches is in the CompanyMatchScript mentioned above. The Perl matches were done using Matcher.pl. The single match sheet for SQL contains companies that do not have multiple entries in Crunchbase. This is also the case for the Perl single match sheet, which has additionally been filtered to remove companies that Matcher.pl flagged as multiple matches.
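Separating single matches from multiple matches can be illustrated with a grouped count. The sketch below is an assumption-laden example, not the contents of CompanyMatchScript or Matcher.pl: it uses Python's built-in sqlite3 module, and the table layout (uuid, company_name) and sample data are hypothetical.

```python
import sqlite3


def split_matches(organizations):
    """Split company names into those with one entry and those with several.

    organizations is a list of (uuid, company_name) tuples (hypothetical layout).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE organizations (uuid TEXT, company_name TEXT)")
    conn.executemany("INSERT INTO organizations VALUES (?, ?)", organizations)
    # Count how many entries share each company name.
    counts = conn.execute(
        """
        SELECT company_name, COUNT(*) AS n
        FROM organizations
        GROUP BY company_name
        ORDER BY company_name
        """
    ).fetchall()
    conn.close()
    single = [name for name, n in counts if n == 1]
    multiple = [name for name, n in counts if n > 1]
    return single, multiple


# Made-up sample data, for illustration only.
single, multiple = split_matches(
    [("u1", "Alpha"), ("u2", "Beta"), ("u3", "Beta")]
)
print(single, multiple)
```

Names in the second list would need manual review, since a name alone does not identify which entry is the right one.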
A list of companies, UUIDs, and all other information is also in the Summer 2018 folder. It is called 'FINAL LIST'. THIS SHEET HAS DUPLICATES THAT ARE IN THE SHEET 'DUPLICATE COMPANIES'.
<br>The definitive, final list is:
E:\McNair\Projects\Accelerators\Summer 2018\The File to Rule Them All.xlsx
==Old Notes from previous work==
