The purpose of this project is to build a classifier that takes the description of an ecosystem organization (e.g., a startup, a venture capitalist, or an incubator) and either correctly classifies the organization's type or correctly distinguishes incubators from non-incubators.
 
==Related Projects==
 
Subsumed Projects:
{{#show: Ecosystem Organization Classifier|?Does subsume}}
 
This project is dependent on:
{{#show: Ecosystem Organization Classifier|?Is dependent on}}
===Text Processing===
We can use [[The Matcher (Tool)]] to match organization names to portfolio companies and VC funds and firms taken from '''vcdb3''' (see [[VentureXpert Database]]). We will also search this data by hand for incubators to get an initial set. Later, we will match this initial list of incubators to Crunchbase organization names to expand it.
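As a rough illustration of what name matching involves, the sketch below matches names exactly after a simple normalization. It is not how [[The Matcher (Tool)]] works internally; the function names, the suffix list, and the example names are all hypothetical.

```python
import re

# Hypothetical list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "llc", "corp", "ltd", "co"}


def normalize_name(name):
    """Lowercase, replace punctuation with spaces, and drop legal suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def match_names(seed_names, candidate_names):
    """Map each seed name to the candidates sharing its normalized form."""
    index = {}
    for candidate in candidate_names:
        key = normalize_name(candidate)
        if key:  # skip names that normalize to nothing
            index.setdefault(key, []).append(candidate)
    return {seed: index.get(normalize_name(seed), []) for seed in seed_names}


# Made-up names, for illustration only.
matches = match_names(["Acme Labs, Inc."], ["ACME LABS", "Other Co"])
```

Exact matching on a normalized key is only a baseline; it misses abbreviations and misspellings, which is where a dedicated matching tool is needed.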
==Incubator Scores of Crunchbase Results==
{| class="wikitable sortable" style="width:100%"
|-
! style="width: 2%" | #
! style="width: 9%" | Company
! style="width: 2%" | Self Described [Y/N]
! style="width: 9%" | State
! style="width: 9%" | City
! style="width: 7%" | Region
! style="width: 7%" | Lists Client Companies [Y/N]
! style="width: 9%" | Fixed Duration [Y - 0 /N - 1]
! style="width: 5%" | Incubator Investment [Y - 0 /N - 1]
! style="width: 9%" | Cohorts [Y - 0 /N - 1]
! style="width: 9%" | Formal Application Process [Y - 0 /N - 1]
! style="width: 9%" | Incubator Score out of 4
! style="width: 9%" | Notes (Foreign, Virtual, Social Impact, or other observations)
|-
| || || || || || || || || || || || ||
|}
==Process Notes for Calculating Incubator Scores==
Two new files were generated from the '''crunchbase3''' database as follows:
 \COPY (SELECT uuid, company_name, short_description FROM Organizations WHERE country_code='USA' AND short_description LIKE '%incubat%') TO 'CrunchbaseShortOrgDescsUSAIncubat.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV --466
 \COPY (SELECT A.uuid, A.company_name, B.description FROM Organizations AS A JOIN organization_descriptions AS B ON A.uuid=B.uuid WHERE country_code='USA' AND description LIKE '%incubat%') TO 'CrunchbaseLongOrgDescsUSAIncubat.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV --933
These files were put in E:\projects\crunchbase3.
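The exported files are tab-delimited CSV with a header row, so they can be read back with Python's standard <code>csv</code> module. The following is a minimal sketch: the function name is hypothetical, and the UTF-8 encoding in the usage comment is an assumption about how the files were written.

```python
import csv
import io


def read_org_descriptions(handle):
    """Read a tab-delimited CSV export with a header row into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


# In-memory stand-in for an exported file (made-up row, for illustration only).
sample = io.StringIO(
    "uuid\tcompany_name\tshort_description\n"
    'abc-123\tExample Org\t"An incubator\tfor startups"\n'
)
rows = read_org_descriptions(sample)

# Reading the real export would look like this (encoding is an assumption):
# with open(r"E:\projects\crunchbase3\CrunchbaseShortOrgDescsUSAIncubat.txt",
#           newline="", encoding="utf-8") as f:
#     rows = read_org_descriptions(f)
```

Because the export uses CSV quoting, a description that itself contains a tab stays in a single field when read this way.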
1. New file - Renamed E:\projects\crunchbase3\organizations as E:\projects\crunchbase3\organizations_OnlyIncubators_PlusIncubatorScores
2. Only US - Searched (CTRL+F) for "US", created a column filter for only USA companies, and deleted non-US-based organizations
3. Incubator - Searched (CTRL+F) for "incubator" and deleted organizations that didn't identify as an incubator
4. New Columns - 1. Number #; 2. Company, with URL to page linked; 3. Self-Identified Incubator [Y/N]; 4. State; 5. City; 6. Region; 7. Lists Client Companies [Y/N], with URL linked; 8. Fixed Duration [Y - 0 /N - 1]: (Startups at an incubator generally do not all stay for the same fixed duration; Incubator does not have a fixed graduation date for its startups or has a program that lasts longer than one year); 9. Incubator Investment [Y - 0 /N - 1]: (Incubator does not invest directly in the company or take equity in its startups); 10. Cohorts [Y - 0 /N - 1]: (Incubator does not have limited-duration programs that ventures enter and exit in groups, known as cohorts or batches); 11. Formal Application Process [Y - 0 /N - 1]: (Selective, competitive admissions process; Fixed, not rolling application process); 12. Incubator Score out of 4 (A score of 4 is most likely to be an incubator and a score of 0 is least likely, based on our baseline attributes for an incubator; see [[Defining Incubators]])
5. Deleted Columns - funding_rounds; roles; permalink; domain; funding rounds
6. Delete Closed Incubators - Filtered 'status' column to exclude results that are 'closed'
7. Made A Table - Converted entire worksheet into a table to filter more easily
8. Identified Self-Identified Incubators - Created a custom auto-filter that searched the 'short description' for 'contains: incubat'
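The filtering and scoring steps above were done by hand in a spreadsheet. The sketch below shows one way they could be expressed in code, assuming each row is a dict. The criterion keys and function names are hypothetical, the <code>status</code>, <code>country_code</code>, and <code>short_description</code> keys are assumed to match the spreadsheet columns, and the case-insensitive search is also an assumption.

```python
# Hypothetical keys for the four scored columns (Y scores 0, N scores 1).
CRITERIA = ("fixed_duration", "incubator_investment", "cohorts", "formal_application")


def incubator_score(answers):
    """Sum the four criteria into a score out of 4: each 'N' adds 1, each 'Y' adds 0."""
    score = 0
    for key in CRITERIA:
        value = answers[key].strip().upper()
        if value not in ("Y", "N"):
            raise ValueError(f"{key} must be 'Y' or 'N', got {answers[key]!r}")
        if value == "N":
            score += 1
    return score


def keep_row(row):
    """Keep US organizations that are not closed and whose short description mentions 'incubat'."""
    description = (row.get("short_description") or "").lower()
    return (
        row.get("country_code") == "USA"
        and row.get("status") != "closed"
        and "incubat" in description
    )


# Made-up answers, for illustration only.
example = {
    "fixed_duration": "N",
    "incubator_investment": "N",
    "cohorts": "Y",
    "formal_application": "N",
}
score = incubator_score(example)
```

Raising an error on anything other than Y or N keeps blank or mistyped cells from silently lowering a score.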
