Grace Tan (Work Log)

From edegan.com
Revision as of 17:22, 22 June 2018 by GraceTan (talk | contribs)


McNair Center Staff
Staff Information
Status Active
McNairCenterⓂ




Grace Tan Work Logs (log page)

2018-06-22: Matched Connor's master list of accelerators against organizations.csv on both homepage_url and company_name. Found 90 matches along with 76 blanks. Then tried matching on homepage_url or company_name and manually found about 30 more with slight variations in URL or name that we should keep. Using ILIKE, we found ~25 more company UUIDs that match accelerators on the list.
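
The URL-or-name matching described in this entry can be sketched in Python. This is only an illustration, not the SQL we actually ran: the `uuid` key, the helper names, and the normalization rules (dropping the scheme, a leading `www.`, and trailing slashes; lowercasing names) are assumptions chosen to absorb the kind of slight variations mentioned above.

```python
import re


def normalize_url(url):
    """Lowercase a URL and strip the scheme, a leading 'www.', and trailing slashes."""
    url = (url or "").strip().lower()
    url = re.sub(r"^https?://", "", url)
    url = re.sub(r"^www\.", "", url)
    return url.rstrip("/")


def normalize_name(name):
    """Lowercase a company name and collapse runs of whitespace."""
    return " ".join((name or "").lower().split())


def match_accelerators(accelerators, organizations):
    """Map each accelerator's company_name to an organization uuid.

    A match is made on normalized homepage_url first, then on normalized
    company_name. Blank values are never used as match keys.
    """
    by_url, by_name = {}, {}
    for org in organizations:
        url = normalize_url(org["homepage_url"])
        name = normalize_name(org["company_name"])
        if url:
            by_url.setdefault(url, org["uuid"])  # "uuid" is an assumed column name
        if name:
            by_name.setdefault(name, org["uuid"])

    matches = {}
    for acc in accelerators:
        url = normalize_url(acc["homepage_url"])
        name = normalize_name(acc["company_name"])
        uuid = by_url.get(url) if url else None
        if uuid is None and name:
            uuid = by_name.get(name)
        if uuid is not None:
            matches[acc["company_name"]] = uuid
    return matches
```

Matching on name alone can pair different companies that share a name, so name-only matches from a sketch like this would still need a manual check.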

2018-06-21: Downloaded all 17 v3.1 csv tables and updated LoadTables.sql to match our data by manually updating the names and sizes of the fields. To solve yesterday's problem with "", we used regular expressions to change the empty string to nothing (see project page). We then worked with Connor to start extracting the accelerators from the organizations in the Crunchbase data. Matching on company_name gave a lot of null matches and a few that have the same name but are actually different companies. Maybe try matching on homepage_url tomorrow.
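
A minimal Python sketch of the empty-string cleanup follows. The exact expression we used is on the project page, so the pattern and the function name `blank_empty_strings` here are assumptions. The idea, assuming a PostgreSQL-style CSV load where an unquoted empty field is read as NULL, is to turn any field that is exactly `""` into an empty field so that date columns load cleanly.

```python
import re

# A field that is exactly "" : preceded by a line start or a comma,
# and followed by a comma or the end of the line (optionally CRLF).
EMPTY_QUOTED_FIELD = re.compile(r'(^|,)""(?=,|\r?$)', re.MULTILINE)


def blank_empty_strings(csv_text):
    """Replace fields that are exactly "" with nothing, keeping the delimiter."""
    return EMPTY_QUOTED_FIELD.sub(r"\1", csv_text)
```

This is a regular expression, not a CSV parser: escaped quotes inside a normal quoted field are usually left alone, but an unusual field that itself contains the text `,"",` could be altered, so the output is worth spot-checking.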

2018-06-20: Learned more SQL. Started working on the Crunchbase Data project with Maxine. The old code used 22 csv tables, but the new Crunchbase data has only 17. We will be using the new Crunchbase API v3.1 (not v3), with those 17 csv tables as data. We then started updating the old SQL tables to align with the 17 tables we have. We ran into a problem where a date-type field contains "" in the data, which SQL would not accept. Ed was helping us with this, but we have not found a solution yet.

2018-06-19: Set up monitors and continued learning SQL. We were also introduced to our projects. I will continue Christy's work on the Google Scholar Crawler and work with Maxine to update the Crunchbase data, then use that data to crawl LinkedIn for information on startup founders who go through accelerators.

2018-06-18: Introduced to the wiki, connected to RDP, and learned SQL.