Since the data will change a lot compared to previous years, running \i load_crunchbase.sql might not be very useful, and one may need to copy one table at a time by pasting the SQL script into the terminal.
 
'''03/29/2019 update'''
All 17 datasets from the API have been copied to the PostgreSQL server on drive Z under /bulk/crunchbase3. For the date-time formats in PostgreSQL to work properly, every quoted empty string ("") in the CSV files has been replaced by NULL from the command line.
The script that I used for this is clean_data.sh in E:/projects/crunchbase3. A shorter script that does this for all the files in the directory is possible but might not be necessary, since not all files require the edit.
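
The replacement described above can be sketched roughly as follows. This is not the content of clean_data.sh, only a minimal illustration; the function name nullify_empty_strings and the sample rows are hypothetical. It assumes comma-separated files in which a quoted empty string occupies a whole field, and it does not handle the corner case of a quoted field whose text itself contains the sequence ,"", .

```shell
#!/bin/sh
# Hypothetical helper (not taken from clean_data.sh): read CSV on stdin and
# replace every field that is exactly "" with the bare word NULL.
# The :a ... ta loop repeats the substitution until nothing changes, because
# two adjacent empty fields share a comma and cannot both match in one pass.
nullify_empty_strings() {
    sed -E -e ':a' -e 's/(^|,)""(,|$)/\1NULL\2/' -e 'ta'
}

# Illustrative input only: a header row plus two made-up data rows.
printf '%s\n' 'id,name,founded_on' '1,"Acme",""' '2,"",""' | nullify_empty_strings
```

To clean a file, redirect it through the function and write the result to a new file, then check the result before loading. For PostgreSQL to read the bare word NULL as a null value, the \COPY command would need to declare it, for example with the NULL 'NULL' option; whether load_crunchbase.sql does so is an assumption to verify.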
==Working with the database==
All the scripts in load_crunchbase.sql have been updated. They now work with the current data (as of 03/29/2019) crawled from the Crunchbase API, and the file includes the correct number of rows copied from the CSV files at the end of each \COPY command.
To see and use the data in the postgres server: