Changes

621 bytes added ,  19:37, 29 December 2023
[[VCDB23]] is the 2023 iteration of my venture capital database. The previous build was [[VCDB20]], and this one follows the same basic design. This build was partial: not all location data was included. It was superseded by [[VCDB24]].
== Processing Steps ==
# Run the ssh files against SDC Platinum. Note that SDC Platinum's service will be withdrawn on 31 December 2023.
# Run the [[SDC Normalizer]] script (one of the pl files) on each output
## Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long).
## Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, and USFirmBranchOffices1980.txt.
## The private and public M&A file sets have to be combined separately into 2 files after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.
## For RoundOnOneLine (which needs multistep processing), remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.
## PortCoLongDescription must be pre-processed from the command line and then post-processed in Excel (see VCDB20H1 and Vcdb4#Long_Description). However, I didn't load it for this run.
# Create a new database on mother (createdb vcdb23) and set up a directory for the input files: E:\projects\vcdb23
# Copy over and edit Load.sql. Run it section-by-section.
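
The two text clean-ups in step 2 (removing double quotes, and blanking the np and nm placeholders in the combined M&A files) can be sketched in Python as follows. This is a minimal illustration and not the script actually used for this build: the function names, the <code>clean_file</code> helper, and the UTF-8 encoding are all assumptions.

```python
import re
from pathlib import Path


def strip_double_quotes(text: str) -> str:
    """Remove every double-quote character from the text."""
    return text.replace('"', "")


def blank_ma_placeholders(text: str) -> str:
    """Replace tab-delimited 'np' or 'nm' fields with empty fields.

    The trailing tab is matched with a lookahead rather than consumed,
    so two placeholders in adjacent columns are both blanked.
    """
    return re.sub(r"\t(?:np|nm)(?=\t)", "\t", text)


def clean_file(src: str, dst: str, transform) -> None:
    """Hypothetical helper: read src, apply transform, write dst.

    The encoding is an assumption; adjust it to match the SDC output.
    """
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(transform(text), encoding="utf-8")
```

For example, <code>clean_file("USFund1980-normal.txt", "USFund1980-clean.txt", strip_double_quotes)</code> would write a quote-free copy; the output filename here is made up. The lookahead is a design choice: a plain search-and-replace of \tnp\t would miss the second of two adjacent placeholders, because the tab between them is consumed by the first match.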
