Start-Ups of Houston (Map)


Project Information
Has title: Start-Ups of Houston (Map)
Has owner: Ben Baldazo
Has start date: Summer 2016
Has deadline date:
Has keywords: Hubs, Houston, Venture, Capital, Startups, Crowding-out, Data
Has project status: Subsume
Subsumed by: Houston Entrepreneurship
Has sponsor: McNair Center
Has project output: Content
Copyright © 2019 All Rights Reserved.

This project falls under the Houston Entrepreneurship umbrella.

It is linked from Houston Accelerators and Incubators (Report).


Using lists mined from websites, weblists, and databases, this map will locate and diagram the startups of Houston, TX. Later additions will include corresponding wiki pages for individual companies, as well as maps of startup resources (including accelerators, incubators, angels, and VC firms).


Source file: File:HStartupMaster7.xlsx

1454 rows (one per company name) and 11 data columns (including the name). Column names: normname, industry, website, descr, street, city, zip, totraised, founddate, accelerator, accelerator2

The file covers 11 accelerators; 6 companies are listed with 2 accelerators.


Steps taken

  1. Mined websites like AngelList, Crunchbase, StartupBlink, Houston Startups List, etc.
  2. Cleaned data
    1. Columns align with headers all the way down
    2. Websites actually belong to the company (not YouTube or AngelList)
    3. There are no line breaks inside individual cells
    4. There are no unclosed quotes (ideally no quotes at all)
  3. Uploaded into the Houston psql database
    1. Saved as UTF-8 encoding
  4. Used the Matcher to match the compiled names against themselves
    1. Used this matched file to standardize/normalize names for future data consolidation
  5. Made a distinct list of Houston startups using the file above
  6. Made a priority list for importing data into the Masterfile
  7. Using the priority list, populated empty columns in the Masterfile from each of the mined tables
    1. Had to go back and separate out some fields, such as addresses or multiple accelerators
  8. Exported the Masterfile to Excel
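
The cleaning checks in step 2 and the priority-based fill in step 7 can be sketched roughly as below. This is only an illustration, not the project's actual scripts: the function names, the in-memory dict layout keyed by normalized name, and the list of aggregator hosts are all assumptions made for the example.

```python
from urllib.parse import urlparse

# Assumption: an illustrative list of hosts that are not company websites.
AGGREGATOR_HOSTS = ("youtube.com", "angel.co")


def clean_cell(value):
    """Remove double quotes and collapse line breaks/extra spaces in one cell."""
    # str.split() with no arguments splits on any whitespace, including newlines.
    return " ".join(value.replace('"', "").split())


def clean_rows(rows, n_columns):
    """Keep only rows that align with the header, cleaning every cell."""
    cleaned = []
    for row in rows:
        if len(row) != n_columns:
            continue  # misaligned row: set aside for manual review
        cleaned.append([clean_cell(cell) for cell in row])
    return cleaned


def is_company_site(url):
    """Return False when the URL points at a known aggregator host."""
    # Add a scheme-less prefix so urlparse fills in netloc for bare domains.
    host = urlparse(url if "://" in url else "//" + url).netloc.lower()
    return not any(host == h or host.endswith("." + h) for h in AGGREGATOR_HOSTS)


def fill_by_priority(master, sources, columns):
    """Fill empty master columns from sources, highest priority first.

    master:  {normname: {column: value}}
    sources: list of dicts with the same shape, ordered by priority
    """
    for source in sources:
        for name, record in source.items():
            target = master.get(name)
            if target is None:
                continue  # company not in the distinct list
            for col in columns:
                # Only fill gaps; never overwrite a value already present.
                if not target.get(col) and record.get(col):
                    target[col] = record[col]
    return master
```

Because sources are visited in priority order and only empty columns are filled, a value from a higher-priority table is never replaced by one from a lower-priority table.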

Future Steps

  1. Use the WHOIS parser to fill gaps (addresses and founding dates)
  2. Upload individual startups into their own wiki pages
  3. Repeat these steps with venture firms, angels (and angel groups), accelerators, incubators, service firms, flex and co-working spaces, event spaces, etc.


SDC Platinum

Ed Egan