==Processes==
 
'''Steps taken'''
#Mined websites such as AngelList, Crunchbase, StartupBlink, and Houston Startups List
#Cleaned the data, checking that:
#*Columns align with their headers all the way down
#*Websites actually belong to the company (not YouTube or AngelList)
#*No individual cell contains a line break
#*There are no unclosed quotes (ideally, no quotes at all)
#Uploaded the data into the Houston psql database
#*Saved the files in UTF-8 encoding
#Used the Matcher to match the compiled list of names against itself
#*Used the matched file to standardize/normalize names for future data consolidation
#Made a distinct list of Houston startups using the file above
#Made a priority list for importing data into the Masterfile
#Using the priority list, populated empty columns in the Masterfile from each of the mined tables
#*Had to go back and separate some things out, such as addresses or multiple accelerators
#Exported the Masterfile to Excel
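
'''Example sketch'''

The cleaning checks and the priority-based fill described in the steps above could be scripted roughly as follows. This is only an illustrative sketch, not the code that was actually used: the function names, the <code>website</code> column name, the list of non-company hosts, and the dictionary layout of the Masterfile are all assumptions made for the example.

```python
import csv
import io
from urllib.parse import urlparse

# Assumed list of hosts that indicate a profile page, not a company site.
NON_COMPANY_HOSTS = ("youtube.com", "angel.co", "angellist.com")


def is_profile_site(url):
    """True if the URL's host is one of NON_COMPANY_HOSTS or a subdomain of one."""
    host = urlparse(url if "//" in url else "//" + url).netloc.lower()
    return any(host == h or host.endswith("." + h) for h in NON_COMPANY_HOSTS)


def find_row_problems(csv_text, website_column="website"):
    """Return (row_number, problem) pairs; the header counts as row 1."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    site_idx = header.index(website_column) if website_column in header else None
    problems = []
    for row_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append((row_no, "column count differs from header"))
            continue
        if any("\n" in cell or "\r" in cell for cell in row):
            problems.append((row_no, "line break inside a cell"))
        if any('"' in cell for cell in row):
            problems.append((row_no, "quote character inside a cell"))
        if site_idx is not None and is_profile_site(row[site_idx]):
            problems.append((row_no, "website is not the company's own"))
    return problems


def fill_by_priority(master, sources):
    """Fill empty Masterfile fields from mined tables, highest priority first.

    master:  {startup_name: {column: value}}
    sources: list of tables with the same shape, ordered by priority.
    Only empty fields are filled, so a lower-priority table never
    overwrites a value taken from a higher-priority one.
    """
    for source in sources:
        for name, record in source.items():
            if name not in master:
                continue
            target = master[name]
            for column, value in record.items():
                if value and not target.get(column):
                    target[column] = value
    return master
```

Rows flagged by <code>find_row_problems</code> would be fixed by hand before upload. <code>fill_by_priority</code> assumes names were already normalized, so that the same startup has the same key in every table.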
 
'''Future Steps'''
#Use a WHOIS parser to find missing addresses
#Upload each startup to its own wiki page
#Repeat these steps with venture firms, angels (and angel groups), accelerators, incubators, service firms, flex and co-working spaces, event spaces, etc.