Changes

1,501 bytes removed, 17:35, 2 September 2016
=Hubs Pages=
*The main page for the paper can be found: [[Hubs (Academic Paper)]].
*For the current work in progress for building the Hubs datasheet for the scorecard go to: [[Hubs: Hubs Scorecard]]
*For a tracker of work in progress for the dataset building for the scorecard go to [[Hubs: Hubs Data Building]]
*For a high-level overview of the variables for the scorecard go to [[Hubs: Hubs Data]]
=List of Variables=
For a more in-depth look at the variables and procedure, please see: [[Hubs: Hubs Scorecard]]. This page will reflect the variables being collected, separated into three categories. Each variable will include a breakdown of the levels being collected (if the definition is not trivial) and an approximate approach.

This page represents the work used for mechanical turks for Hubs. As of Spring 2016, a list of potential Hubs with a set of characteristics was created. Many of these are not what will be defined as Hubs. We will be creating a scorecard to help subjectively define Hubs based on certain characteristics. For more information on Mechanical Turks in general, please see [[Mechanical Turk (Tool)]].
The main goal of the mechanical turk is to automate the collection of variables for potential hubs as much as possible. The key steps for the project are:
#Creating a '''comprehensive''' list of potential hubs
#Determining the best variables for the scorecard
#Building '''"filters"''' for automating the collection
#'''Running''' and '''auditing''' of the automation
#Collecting the remaining manual data
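As a rough illustration of what a "filter" in step 3 might look like, the sketch below marks a binary variable from the text of a web page using key words. The function name <code>keyword_filter</code>, the sample key words, and the use of <code>DNE</code> for missing text are assumptions made for illustration; this is not the project's actual code.

```python
import re
from typing import Iterable, Optional, Union

DNE = "DNE"  # assumed marker for "no website or page text found"


def keyword_filter(page_text: Optional[str], key_words: Iterable[str]) -> Union[int, str]:
    """Return 1 if any key word starts a word in the text, 0 if none does,
    and DNE if there is no text to inspect."""
    if not page_text:
        return DNE
    text = page_text.lower()
    for word in key_words:
        # A leading word boundary lets "mentor" match "mentorship" or
        # "mentoring" while not matching inside an unrelated word.
        if re.search(r"\b" + re.escape(word.lower()), text):
            return 1
    return 0


if __name__ == "__main__":
    print(keyword_filter("We are a non-profit organization", ["nonprofit", "non-profit"]))
    print(keyword_filter("Join our mentorship program", ["mentor"]))
```

A filter of this kind only flags candidates; the auditing step in the list above would still be needed to check its output against manual collection.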
=Variables to be Used=
==Current Complete List==
'''As of Week of 7/11'''
#Onsite Venture Capital
#*Assets Under Management
#*Number
#Onsite Angel Investors
#Onsite Mentors
#Founding Date
#Site URL
#Office hours investors
#Office hours mentor/advisors
#Onsite temporary workshops
#Onsite mentors
#Networking Meetups
#Sponsors
#*University
#*Corporate
#Curriculum
#Onsite code school
#Alumni Network
#Nonprofit status
#Mission statement
#Specific Industry
#Price for a space
#Price for office
#Twitter activity
#Size (sqft)
#Size (# companies)
#Onsite accelerator
#Community membership??
#Franchise
#Multiple locations within city
'''07/29''' Ariel: code Hubs variable for Hubs
:<code>E:/McNair/Projects/Hubs/Hubs Variable-Ariel</code>
=Variables for Hubs=
We will be creating a "Hubs scorecard" to determine how hub-like potential spaces are. In order to do so, we will evaluate the places based on certain variables. Previous variables for potential hubs were collected. Below, we list those as well as other variables we think might be helpful to build out the scorecard.
Ideally, we would have the following variables (not collected previously):
#Onsite VC/Angel/Investors (Count or binary)
##Comments:
##Mechanical Turk Comments:
#Onsite Mentors (binary) --- ''Are these the same as advisers?''
##Comments:
##Mechanical Turk Comments:
#"Office hours" with investors or mentors (binary)
##Comments: Previously collected included number of events, but did not separate them into categories (e.g. networking events, workshops, etc.). We view this separation as important, BUT very difficult to collect
##Mechanical Turk Comments:
#Onsite temporary workshops (binary or count) *** '''see mechanical turk'''
##Comments:
##Mechanical Turk Comments:
#Networking Meetups (Binary or count) *** '''see mechanical turk'''
##Comments:
##Mechanical Turk Comments:
#Sponsors and Partners (binary and list) --- ''are these the same?''
##Comments:
##Mechanical Turk Comments:
#Alumni Network (binary) --- ''not all potential hubs list this and the fact that some do might indicate its importance''
##Comments:
##Mechanical Turk Comments:
#Num of Companies --- ''to help determine size as getting physical sqfootage is difficult''
##Comments:
##Mechanical Turk Comments:
#Nonprofit (binary) --- ''helpful in determining goals of potential hubs''
##Comments:
##Mechanical Turk Comments:
#Mission Includes Key Buzzwords (e.g. "ecosystem", "community") --- ''help separate simple coworking spaces from hubs''

Example of Prior Variables Collected
*Specific Industry --- ''defined as LinkedIN Self Identifier, no categories just plain text. We think what we really want is to see if there is a specialty (e.g. healthcare)''
*Num of Events --- ''relatively complete inputs, but from March 2016 (see above as well)''
*Price for Single Space --- ''defined as price for flexible desk, relatively complete inputs''
*Price for Office --- ''no inputs''
*Twitter Activity (Multinomial or Count) --- ''High=2/Moderate=1/No=0, no explanations on how to categorize the activity. Also no handles''
*Size (sqft) --- ''no records for majority of the companies''
*Num Conference Rooms --- ''no records for majority of the companies''
*Onsite accelerator (binary) --- ''relatively complete inputs''
*Onsite code school (binary) --- ''relatively complete inputs''
*Community Membership (binary) --- ''relatively complete inputs''

'''As of Week of 7/25'''
===Group 1===
'''Variables Difficult to Obtain'''
#'''Founding Date''' ''(date_founded)''
#*''' ''Difficulty:'' ''' Finding date based on our strategies
#*''' ''New Approach:'' '''
#*#Whois.net Date
#*#Factiva/other press release searches
#'''Multiple locations within city + Franchise''' (as of now just addresses) ''(multi_address)''
#*''' ''Difficulty:'' ''' Company or establishment level will impact measurements
#*''' ''New Approach:'' ''' Will record all addresses at company level
#'''Onsite Venture Capital v. Angel Investors''' (e.g. Assets Under Management) ''(onsite_Vc_bin)/(onsite_vc_list)/(onsite_angel_bin)/etc.''
#*''' ''Levels:'' ''' Binary, list of investors
#*''' ''Difficulty:'' ''' Hub website usually does not include investors
#*''' ''New Approach:'' '''
#*#Google key terms with address of Hub
#*#Start with partners and use google/crunchbase
===Group 2===
'''Variables Comfortable, Not Complete''' (rough order of most difficult to least difficult)
#'''Onsite accelerator''' ''(onsite_accel_bin)/(onsite_accel_cnt)/(onsite_accel_list)''
#*''' ''Levels:'' ''' Binary, count, list
#*''' ''Difficulty:'' ''' Usually not a list, which requires more scrubbing as many other variables just require us to find one page on a website.
#*''' ''Approach:'' '''
#*#Google searches and procedure to use on website yields decent results
#*#Similar procedure to onsite investors
#'''Size (# members)''' ''(num_members)''
#*''' ''Levels:'' ''' Count for companies (currently not planning to include list of companies given that some potential hubs have 200+ members)
#*''' ''Difficulty:'' ''' Some companies don’t list all members, only selected ones; others do not separate current members and alumni; and some just write "we have served more than 120 startups..."
#*''' ''Approach:'' ''' For companies that have a list, we will count. For those with select members, we will count those they listed and try to see if there is a comment about how many they have. For those that just have a statement "with over," we will write the number and + (e.g. 120+).
#'''Office hours investors''' and '''Office hours mentor/advisors''' ''(OH_bin)/(OH_inv_bin)/(OH_inv_list)/etc.''
#*''' ''Levels:'' ''' Binary for OH, binary for two separate OH, list of names/descriptions of OH
#*''' ''Difficulty:'' ''' Some companies do not list who OH are with; it is not always obvious whether it is an investor, mentor, or advisor, and sometimes not clear if a mentor is an investor/future investor
#*''' ''Approach:'' ''' Google approach to get to OH pages and then look up key words in description to separate out
#'''Onsite temporary workshops and Networking Meetups''' (Count) ''(onsite_temp_events_bin)/(onsite_temp_workshop_bin)/(onsite_temp_workshop_cnt)/etc.''
#*''' ''Levels:'' ''' Binary for do they exist, count for each
#*''' ''Difficulty:'' ''' Difficult for Turkers to differentiate between these two and also other potential events (e.g. symposiums)
#*''' ''Approach:'' ''' Uses key search terms (e.g. Java/etc.) to separate out workshops and key terms (e.g. lunch/happy hour) for networking meetings
#'''Onsite code school''' and '''Curriculum''' ''(onsite_long_term_courses)/(onsite_code_school_bin)''
#*''' ''Levels:'' ''' Binary for do they exist, binary for each
#*''' ''Difficulty:'' ''' Difficult for Turkers to differentiate between long-term coding programs for individuals and curriculum for startups
#*''' ''Approach:'' ''' Uses key search terms (e.g. specific code schools) to separate out known code schools and also to look into key terms (e.g. leadership) for curriculum
#'''Sponsors/Partners''' (University, Corporate) ''(sponsors_cnt)/(sponsors_list)/etc.''
#*''' ''Levels:'' ''' Count, list of sponsors/partners (if exist), separate columns for university and corporate
#*''' ''Difficulty:'' ''' Not all companies will list sponsors, partners, or either. Not always clear the difference among sponsors, partners, investors.
#*''' ''Approach:'' ''' Use two different levels and use of google search, then if list exists, separate by "college"/"university" and rest
#'''Alumni Network''' ''(alumni_bin)/(alumni_list)''
#*''' ''Levels:'' ''' Binary, list
#*''' ''Difficulty:'' ''' Not all companies list alumni, some only list "selected"
#*''' ''Approach:'' ''' Include all that have lists
#'''Size (sqft)''' ''(size_sqft)''
#*''' ''Levels:'' ''' Number in sqft
#*''' ''Difficulty:'' ''' Not all companies list square feet online
#*''' ''Approach:'' '''
#*#Google search with key words
#*#If results do not appear, use of press releases is possible
#'''Onsite Mentors''' ''(onsite_mentors_bin)/(onsite_mentors_cnt)/(onsite_mentors_list)''
#*''' ''Levels:'' ''' Count and list of mentors (if exist)
#*''' ''Difficulty:'' ''' Not all companies list mentors - bigger issue is onsite investors
#*''' ''Approach:'' ''' Use two different levels and use of google search
===Group 3===
'''Variables Easy to Obtain'''
#'''Twitter activity''' ''(twit_handle)/(twit_prev_mon_cnt_tweets)/(twit_cnt_followers)/(twit_cnt_retweets)''
#*''' ''Levels:'' ''' Twitter Handle, # Tweets in a Month, # Followers, # Retweets
#*''' ''Approach:'' ''' Easy to get twitter handle from Turk or Veeral's code that allows us to run a series of searches on google and then use Gunny's Twitter crawler to get other levels from handle
#'''Site URL''' ''(url)''
#*''' ''Levels:'' ''' URL
#*''' ''Approach:'' ''' Google using Veeral's code that allows us to search
#'''Whois Date''' ''(date_whois)''
#*''' ''Levels:'' ''' Date
#*''' ''Approach:'' ''' Date website was registered
#'''Address''' ''(address)''
#*''' ''Levels:'' ''' Will include addresses
#*''' ''Approach:'' ''' Google key terms (e.g. Contact Us) and URL using Veeral's code
#'''Nonprofit status''' ''(nonprofit_binary)''
#*''' ''Levels:'' ''' Binary variable indicating if the potential Hub is a nonprofit organization
#*''' ''Approach:'' ''' http://www.guidestar.org/ is a site that we can use to search if a company is nonprofit or not
#'''Mission statement''' ''(missions_stmt)''
#*''' ''Levels:'' ''' Official mission statement or description of the company (if mission does not exist)
#*''' ''Approach:'' ''' If there is no explicitly stated mission statement, will include "About" or statements on main page
#'''Specific Industry''' ''(spec_industry)''
#*''' ''Levels:'' ''' Industry included in the statement (no aggregation)
#*''' ''Approach:'' ''' Based on Mission Statement, not aggregated
#'''Price for a space/office''' ''(price_space)''
#*''' ''Levels:'' ''' Two prices: one for shared, the other for private
#*''' ''Approach:'' ''' Uses google methodology with key terms and URL
=Test=
*'''Twitter activity''':
:'''UPDATE (7/14)''': Updated turk to reflect our desired formats
:'''UPDATE (7/12)''': '''AUDIT RESULTS''': We noticed
:'''UPDATE (7/11)''': uploaded and published on amazon's mechanical turk site. Given the time cost to either record number of tweets in a month or look up more than 10 tweets, we decided to record the date of the 10th most recent tweet. Using a sample of ~10 companies, we noticed minimal differences in data observations among using 10, 20, and 30 tweets.
#Copy the text in the Search Text into a search engine.
#Click on result from twitter.com with the company name. If the link does not appear on the first 3 pages, record DNE for both outputs
#Record the company's Twitter Handle into Twitter Handle
#Record the date (MM/DD/YY) of that tweet for Twitter Activity. If there are fewer than 10 tweets, record DNE.
*'''NUMBER OF EVENTS''': ''UPDATE: written, not published, on amazon's mechanical turk site''
'''Considerations'''
*Difficulties Encountered:
*Expected Time to Complete:
*Expectation of Results (accuracy of turk, comprehensiveness):
*Other Comments:
'''Procedure'''
#Copy the text in the Search Text into a search engine.
#Click on the result that is the active website of the company. If there does not exist a listing on the first three pages, mark as DNE.
#Look for links related to events, such as 'Events' or 'Calendar' on the homepage.
#If not found on the homepage, check 'About' and check 'Community'
#Count the number of events in July 2016 and record it. If there is no information of events on the website, record DNE.
Note***: ''Events include meetups, workshops, info sessions etc. We do not want to count them separately since it is difficult to do so. Most companies put all the events in the same section and do not put event types in the titles of the events. We have to look into the details of the events to find out the type, and even when we do so, some event descriptions do not allow us to determine the type easily. Differentiating the types of events demands more time and effort and therefore is not suitable to be a mechanical turk project.''
*'''Onsite Mentors''': ''UPDATE: written, not published, on amazon's mechanical turk site''
#Copy the text in the Search Text into a search engine.
#Click on the result that is the website of the company. If there does not exist a listing on the first three pages, mark as DNE.
#Look for links related to mentorship such as 'mentors', 'mentorship' or 'mentoring programs'
#If the key words can be identified, mark as 1
#If there is no explicit 'mentoring' section, look for links related to a description of the company, such as 'About,' 'Our Team,' 'Our Mission,' etc., and look for a subsection or mention of mentor/mentorship/mentoring
#If these exist, mark as 1.
#If not, go to links related to membership 'benefits,' 'perks,' or related.
#Do same process as end of 4 and 5
#If there is no mention of mentorship in these sections, type the company, city, and 'mentoring' into a search engine. If a link to a reliable website (such as Desktime) appears and mentorship can be found in the description, mark as 1.
#If none of these steps result in a mark of 1, mark as 0
*'''Nonprofit''': ''UPDATE: written, not published, on amazon's mechanical turk site''
#Copy the text in the Search Text into a search engine.
#Click on the result that is the website of the company. If there does not exist a listing on the first three pages, mark as DNE.
#Go to links that describe the company, usually they are labelled 'About' or 'Our Story,' 'Mission'
#Look for the key word 'nonprofit'/'non-profit'
#If 'nonprofit' is identified, mark as 1, otherwise 0.
*'''Number of Members''': ''UPDATE: written, not published, on amazon's mechanical turk site''
#Copy the text in the Search Text into a search engine.
#Click on the result that is the website of the company. If there does not exist a listing on the first three pages, mark as DNE.
#Look for the link 'Members' or 'Residents', usually they are under the links 'Community', 'Membership', 'Our Space' or 'The Space'.
#Count the number of members
#If the link or section of 'Members' is not found, go to 'Community' and 'Coworking' and look for the description on number of startups/founders/members in the community. Record the number.
#If number of members cannot be identified using above steps, record DNE.
*'''Sponsors and Partners''': ''UPDATE: written, not published, on amazon's mechanical turk site''
#Copy the text in the Search Text into a search engine.
#Click on the result that is the website of the company. If there does not exist a listing on the first three pages, mark as DNE.
#Look for the link or mention of 'Sponsors' or 'Partners', which is often under the section of 'About', 'Community', or related sections
#If sponsors or partners can be found mark as 1 and list them, otherwise mark as 0.
[[Category: Internal]]
[[Internal Classification: Legacy| ]]
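Two of the recording rules described on this page are mechanical enough to sketch in code: writing a member count as a number plus "+" when a site only gives a "with over" style statement (e.g. 120+), and separating a sponsor list into university and corporate by the key words "college"/"university". The following is a minimal sketch under stated assumptions: the function names <code>member_count</code> and <code>split_sponsors</code>, the trigger phrases in the regular expression, and the sample sponsor names are all hypothetical and are not the project's code.

```python
import re
from typing import Dict, List, Optional

DNE = "DNE"  # recorded when the value cannot be identified


def member_count(listed: Optional[List[str]] = None,
                 statement: Optional[str] = None) -> str:
    """Count a member list if one exists; otherwise pull the number out of a
    statement such as "more than 120 startups" and record it as "120+"."""
    if listed:
        return str(len(listed))
    if statement:
        # Assumed trigger phrases; real pages may word this differently.
        match = re.search(r"(?:over|more than)\s+(\d[\d,]*)", statement, re.IGNORECASE)
        if match:
            return match.group(1).replace(",", "") + "+"
    return DNE


def split_sponsors(sponsors: List[str]) -> Dict[str, List[str]]:
    """Put names containing "college" or "university" in one column and
    everything else in the corporate column."""
    university: List[str] = []
    corporate: List[str] = []
    for name in sponsors:
        lowered = name.lower()
        if "college" in lowered or "university" in lowered:
            university.append(name)
        else:
            corporate.append(name)
    return {"university": university, "corporate": corporate}


if __name__ == "__main__":
    print(member_count(statement="We have served more than 120 startups"))
    print(split_sponsors(["Example University", "Acme Corp", "City College"]))
```

The sketch does not cover sites that list only selected members alongside a total; that case would still need the manual check described in the approach.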
