Difference between revisions of "Grace Tan (Work Log)"
Revision as of 15:56, 25 June 2018
{{{name}}} | |
Staff Information | |
---|---|
Status | Active |
McNairCenterⓂ |
Grace Tan Work Logs (log page)
2018-06-25: We took the 157 accelerator UUIDs we found and created a new table, AccAllInfo, that includes all the accelerator attributes we want from organizations.csv. Maxine and I then split off into our respective projects. I tried joining people to the companies they are linked to in order to find the founders of each accelerator. I found about 90 matches, but there are still a lot of gaps since some accelerators have no founders and others have multiple founders. Still unsure how to fix this.
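A minimal sketch of the people-to-company join, written in Python rather than the SQL we actually used. The field names `org_uuid`, `title`, and `name` and the function `founders_by_accelerator` are assumptions for illustration, not the real column names. Keeping a list per accelerator makes the "no founders" and "multiple founders" cases explicit instead of dropping or duplicating rows.

```python
def founders_by_accelerator(accelerator_uuids, people):
    """Map each accelerator UUID to the names of linked people whose title mentions 'founder'.

    Field names (org_uuid, title, name) are hypothetical placeholders.
    """
    # Start every accelerator with an empty list so unmatched ones stay visible.
    founders = {uuid: [] for uuid in accelerator_uuids}
    for person in people:
        org = person.get("org_uuid")
        title = (person.get("title") or "").lower()
        # Case-insensitive check also catches titles such as "Co-Founder".
        if org in founders and "founder" in title:
            founders[org].append(person["name"])
    return founders
```

Accelerators left with an empty list are the ones that still need another source for founder data.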
2018-06-22: Matched Connor's master list of accelerators with organizations.csv based on homepage_url and company_name. Found 90 that matched along with 76 blanks. Then tried matching with homepage_url or company_name and manually found about 30 more that had slight variations in url or name that we should keep. Using ILIKE we found ~25 more company UUIDs that match with accelerators on the list.
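A rough sketch of the url-or-name matching idea, in Python instead of SQL. The normalization rules (dropping the scheme, a leading `www.`, and trailing slashes, and lowercasing names) are assumptions about what the "slight variations" looked like, and `match_accelerators` is a hypothetical helper, not code from the project.

```python
import re


def normalize_url(url):
    """Lowercase a URL and drop the scheme, a leading 'www.', and trailing slashes."""
    url = (url or "").strip().lower()
    url = re.sub(r"^https?://", "", url)
    url = re.sub(r"^www\.", "", url)
    return url.rstrip("/")


def normalize_name(name):
    """Lowercase a company name and collapse runs of whitespace."""
    return " ".join((name or "").lower().split())


def match_accelerators(accelerators, organizations):
    """Return {accelerator company_name: organization uuid}, matching on url first, then name."""
    by_url, by_name = {}, {}
    for org in organizations:
        url = normalize_url(org["homepage_url"])
        if url:
            by_url.setdefault(url, org["uuid"])
        name = normalize_name(org["company_name"])
        if name:
            by_name.setdefault(name, org["uuid"])

    matches = {}
    for acc in accelerators:
        uuid = by_url.get(normalize_url(acc["homepage_url"]))
        if uuid is None:
            uuid = by_name.get(normalize_name(acc["company_name"]))
        if uuid is not None:
            matches[acc["company_name"]] = uuid
    return matches
```

Name-only matches still need a manual check, since some organizations share a name but are different companies.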
2018-06-21: Downloaded all 17 v3.1 csv tables and updated LoadTables.sql to match our data by manually updating the names and sizes of the fields. To solve yesterday's "" problem, we used regular expressions to replace the empty string ("") with nothing (see project page). We then worked with Connor to start extracting the accelerators from the organizations in the Crunchbase data. We found a lot of null matches based on company_name and a few that have the same name but are actually different companies. Maybe try matching with homepage_url tomorrow.
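A small sketch of the empty-string cleanup, assuming the loader reads an unquoted empty field as NULL (as PostgreSQL's COPY does in CSV mode). The exact expression we used is on the project page; the pattern and the function name `blank_empty_quoted_fields` below are only illustrative. It does not handle the rare case of a quoted field whose text itself contains `,"",`.

```python
import re

# Matches a field that is exactly "" at the start of a line or after a comma,
# and that is followed by a comma or the end of the line.
EMPTY_QUOTED_FIELD = re.compile(r'(^|,)""(?=,|$)')


def blank_empty_quoted_fields(line):
    """Replace each field that is exactly "" with a truly empty field."""
    return EMPTY_QUOTED_FIELD.sub(r"\1", line)
```

Running this over each line of a csv file before loading leaves date columns with empty fields instead of "" values.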
2018-06-20: Learned more SQL. Started working on the Crunchbase Data project with Maxine. The old code used 22 csv tables, but the new Crunchbase data has only 17. We will be using the new Crunchbase API v3.1 (not v3) with these 17 csv tables as data. We then started updating the old SQL tables to align with the 17 tables we have. We ran into a problem where a date field contained "" in the data, and SQL rejected it. Ed was helping us with this, but we have not found a solution yet.
2018-06-19: Set up monitors and continued learning SQL. We were also introduced to our projects. I will be continuing Christy's work on the Google Scholar Crawler, as well as working with Maxine to update the Crunchbase data and then use that data to crawl LinkedIn for data on startup founders who go through accelerators.
2018-06-18: Introduced to the wiki, connected to RDP, and learned SQL.