Hubs Scorecard (Academic Paper)

Academic Paper
Title Hubs Scorecard (Academic Paper)
Author Ed Egan, Yael Hochberg
RAs Todd Rachowin, Ariel Sun
Status Tabled
©, 2016


As of Spring 2016, a list of potential Hubs with a set of characteristics has been created. Many of the entries do not meet what will be defined as a Hub. We will create a scorecard to help subjectively define Hubs based on certain characteristics.

Work in Progress

Our goal is to automate the collection of variables for potential hubs as much as possible. The key steps for the project are:

  1. Creating a comprehensive list of potential hubs (Complete)
  2. Determining the best variables for the scorecard (Complete)
  3. Building "filters" for automating the collection (Complete)
  4. Running and auditing the automation (In Progress)
  5. Collecting the remaining manual data (Next Step)
  • For the detailed current work in progress for building the Hubs datasheet for the scorecard go to: Hubs: Hubs Scorecard
  • For a tracker of work in progress for the dataset building for the scorecard go to Hubs: Hubs Data Building
  • For a high-level overview of the variables for the scorecard go to Hubs: Hubs Data
  • For more information on Mechanical Turks in general, see Mechanical Turk (Tool).
  • Comprehensive lists of potential hubs can be found at:
    1. E:\McNair\Projects\Hubs\Raw Program List - Contains 600 entities - the vast majority are firmly not hubs (file pedigree unknown).
    2. E:\McNair\Projects\Hubs\Hubs Data - Contains 125 entities - many are not hubs (overlap with the above file unknown; this file's pedigree is from the old Hubs project).
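The overlap between the two lists is recorded above as unknown. A minimal sketch of how it might be checked, assuming each file can be exported to a plain list of entity names. The sample names and the normalization rule below are hypothetical and are not taken from the project files:

```python
import re


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace
    so that near-identical spellings of a name compare equal."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())


def overlap(list_a, list_b):
    """Return the normalized names that appear in both lists."""
    return {normalize(n) for n in list_a} & {normalize(n) for n in list_b}


# Hypothetical sample names, for illustration only.
raw_program_list = ["Example Hub, Inc.", "Sample Incubator", "Demo Labs"]
hubs_data = ["example hub inc", "Another Space"]

shared = overlap(raw_program_list, hubs_data)
print(sorted(shared))
```

Exact matching on normalized names will miss renamed or abbreviated entities, so any result from a check like this would still need a manual audit.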

Hubs Data

(7/27 Onwards)

Collected variables for 30 entities that are clearly hubs. The results are here:

  • E:\McNair\Projects\Hubs\Hubs Variables -Ariel.xls

(Until 7/27)

See Hubs: Hubs Scorecard

(Week of 7/11)

1) We published the Twitter count task on Mechanical Turk and received results.

2) We have audited the results and updated the Amazon site.

3) We are creating additional potential Mechanical Turk tasks on the Amazon site (See Hubs: Hubs Scorecard).

4) We are finding more potential hubs from the members of the International Business Innovation Association.

(Week of 7/4)

1) We have created the list and added our comments after ---. To determine the variables, we separated the list into two parts: a list of desired variables and a list of previously collected variables, many of which are also desired.

2) We have also created an example of how to write Mechanical Turk tasks for collecting certain variables.

Variables to be Used

The old variable list (see Hubs Data.xls) contains 18+3 variables. Its overlap with the new variable list is ~50%.

Current Complete List

As of Week of 7/11

  1. Onsite Venture Capital
    • Assets Under Management
    • Number
  2. Onsite Angel Investors
  3. Onsite Mentors
  4. Founding Date
  5. Site URL
  6. Office hours investors
  7. Office hours mentor/advisors
  8. Onsite temporary workshops
  9. Networking Meetups
  10. Sponsors/Partners
    • University
    • Corporate
  11. Curriculum
  12. Onsite code school
  13. Alumni Network
  14. Nonprofit status
  15. Mission statement
  16. Specific Industry
  17. Price for a space
  18. Price for office
  19. Twitter activity
  20. Size (sqft)
  21. Size (# companies)
  22. Onsite accelerator
  23. Community membership??
  24. Franchise
  25. Multiple locations within city

Grouping of Variables

There are a few categories that the majority of the variables fall under:

Group 1: Low-Hanging Fruit. Variables in this group are very easy to find and automate.

  1. Twitter Activity
  2. URL
  3. Address
  4. Mission Statement
  5. Specific Industry
  6. Nonprofit
  7. Sponsors/Partners
  8. Price for a space + office
  9. Founding Date

Group 2: The Difficult to Find. Variables for which the information is not readily available online or is difficult to find.

  1. Size (can try to find press releases)

Group 3: In Between 1 and 2. Variables that are neither especially easy nor especially difficult to find and automate.

  1. Onsite accelerator
  2. Alumni mentor

Group 4: The Hard to Differentiate. The key property of this group is that it contains several similar variables, which would be difficult for a Mechanical Turk worker to tell apart. To fix this, we will need to create filters akin to the DSM-5 scorecard. See the section below.

  1. Onsite VC v. Angel Investors
  2. Onsite OH Investors v. mentors
  3. Onsite temporary workshops v. networking events
  4. Curriculum v. code school

General Approach for Group 4

The scorecard will be broken down into three main parts: a description, characteristics, and a third part that is still to be determined (TBD). The procedure for creating these will be as follows: determine the description; develop the characteristics after looking over examples; create candidate Mechanical Turk tasks that are completely accurate even if not comprehensive (e.g. a task that detects an onsite mentor for only 40% of firms, but never misreports the existence of mentors); and audit the results.
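The "accurate even if not comprehensive" criterion corresponds to what is commonly called high precision with low recall. A minimal sketch of how an audit might measure both, assuming the audit yields a set of hubs confirmed to have the feature; the hub names and the `precision_recall` helper are hypothetical and are not part of the project's tooling:

```python
def precision_recall(task_hits, audited_truth):
    """Compare a task's positive answers against a manual audit.

    task_hits: set of hubs the task flagged as having the feature.
    audited_truth: set of hubs the manual audit confirmed as having it.
    Returns (precision, recall); an empty denominator is treated as 1.0.
    """
    true_positives = task_hits & audited_truth
    precision = len(true_positives) / len(task_hits) if task_hits else 1.0
    recall = len(true_positives) / len(audited_truth) if audited_truth else 1.0
    return precision, recall


# Hypothetical audit: five hubs truly have onsite mentors;
# the task flags two of them and flags no hub incorrectly.
audited_truth = {"hub_a", "hub_b", "hub_c", "hub_d", "hub_e"}
task_hits = {"hub_a", "hub_b"}

precision, recall = precision_recall(task_hits, audited_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this illustration the task covers only 40% of the hubs that have mentors (recall) but never reports a mentor where none exists (precision of 1.0), which is the property the procedure above asks for.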

Group 5: Needs Further Discussion Before Collection. Variables that need to be developed further prior to collection.

  1. Franchise and multiple locations within a city
  2. Community Membership