Changes

Incubator Seed Data (view source)

Revision as of 17:57, 4 October 2019

558 bytes added , 17:57, 4 October 2019

The CIA data is then combined with [[US Incubators]] data, which is available separately in '''USIncubators.txt'''. Name-based matching is then used to remove duplicates (within states) and to keep the best available information for each incubator. The result can then be matched back to Crunchbase. There were 2155 distinct orgnames, 37 of which had internal name matches.

perl Matcher.pl -mode=2 -file1="DistinctIncubatorOrgNames.txt" -file2="DistinctIncubatorOrgNames.txt"
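The internals of Matcher.pl are not described here, so the following is only a minimal Python sketch of the general idea of finding internal name matches within a state. The <code>standardize</code> function, the suffix list, and the sample records are all hypothetical and are not taken from Matcher.pl or from the actual data.

```python
import re
from collections import defaultdict

# Hypothetical list of tokens to ignore; not taken from Matcher.pl.
SUFFIXES = {"inc", "llc", "corp", "co", "ltd", "the"}

def standardize(name):
    """Lowercase, replace punctuation with spaces, and drop common suffix tokens."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)

def internal_matches(records):
    """Group (orgname, statecode) pairs whose standardized names collide within a state."""
    groups = defaultdict(set)
    for orgname, statecode in records:
        groups[(standardize(orgname), statecode)].add(orgname)
    # Keep only groups with more than one distinct original name.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}

# Made-up records for illustration only.
records = [
    ("Example Labs", "TX"),
    ("Example Labs, Inc.", "TX"),
    ("The Hub", "CA"),
    ("The Hub", "NY"),
]
dupes = internal_matches(records)
print(dupes)
```

In this sketch, identical names in different states (the two "The Hub" rows) are deliberately not treated as duplicates, which mirrors the within-state restriction described above.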

The result is the table '''Incubators''' and text file '''Incubators.txt''' with 2137 records and the following coverage:

*orgnamestd -- 2137
*orgname -- 2137
*statecode -- 2137
*url -- 2031
*description -- 1447
*city -- 1955
*address -- 970
*zip -- 624
*source -- 2137
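Coverage counts like these can be recomputed by counting the non-empty values in each column. The sketch below assumes a tab-delimited file with a header row; the actual layout of '''Incubators.txt''' is not stated here, and the function name <code>field_coverage</code> and the sample rows are hypothetical.

```python
import csv
import io

def field_coverage(lines, delimiter="\t"):
    """Return (record count, {column: number of non-empty values})."""
    reader = csv.DictReader(lines, delimiter=delimiter)
    counts = {name: 0 for name in reader.fieldnames}
    total = 0
    for row in reader:
        total += 1
        for name in reader.fieldnames:
            # Short rows yield None for missing fields; treat those as empty.
            if (row[name] or "").strip():
                counts[name] += 1
    return total, counts

# Made-up sample standing in for the real file.
sample = (
    "orgname\tstatecode\turl\n"
    "Example Labs\tTX\thttp://example.org\n"
    "Sample Hub\tCA\t\n"
)
total, coverage = field_coverage(io.StringIO(sample))
print(total, coverage)
```

To run it against a real file, an open file handle can be passed in place of the <code>io.StringIO</code> object, provided the delimiter assumption holds.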

It was surprising that there was not more overlap within the Crunchbase-INBIA-AngelList (CIA) data, or between the CIA data and US Incubators. This suggests either that each source captures different types of incubators, or that the combined data is unlikely to have near-population coverage.
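One way to quantify this overlap is to count, for each pair of sources, how many (state, standardized name) keys appear in both. The sketch below assumes each input record carries a single source label; the function name <code>source_overlap</code> and the sample records are hypothetical and do not reflect the actual data.

```python
from itertools import combinations

def source_overlap(records):
    """records: iterable of (source, statecode, orgnamestd) tuples.

    Returns {(source_a, source_b): number of shared (statecode, orgnamestd) keys}.
    """
    keys_by_source = {}
    for source, statecode, orgnamestd in records:
        keys_by_source.setdefault(source, set()).add((statecode, orgnamestd))
    overlap = {}
    for a, b in combinations(sorted(keys_by_source), 2):
        overlap[(a, b)] = len(keys_by_source[a] & keys_by_source[b])
    return overlap

# Made-up records for illustration only.
sample_records = [
    ("Crunchbase", "TX", "example labs"),
    ("INBIA", "TX", "example labs"),
    ("AngelList", "CA", "sample hub"),
    ("Crunchbase", "CA", "sample hub"),
    ("INBIA", "NY", "demo works"),
]
overlaps = source_overlap(sample_records)
print(overlaps)
```

Low pairwise counts relative to each source's size would be consistent with either explanation above, so this measure describes the overlap without distinguishing between them.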

Ed

