Retrieving US VC Data From SDC

From edegan.com
Revision as of 18:23, 19 June 2017 by Ed (talk | contribs) (Created page with "This page descripts the specification for retrieving US VC Data From SDC ==Scripts== SDC and Perl Scripts are in: E:\McNair\Projects\VC Database This includes: *NormalizeF...")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

This page descripts the specification for retrieving US VC Data From SDC

Scripts

SDC and Perl Scripts are in:

E:\McNair\Projects\VC Database

This includes:

  • NormalizeFixedWidth.pl
  • RoundOnOneLine.pl - Needed for round on one line

SQL Scripts and finished data are in:

Z:\VentureCapitalData\SDCVCData

Notes

Notes:

  • Portfolio companies attributes to be extracted from roundlevel information
  • Round-on-one-line processed using RoundOnOneLine.pl and NormalizeFixedWidth.pl
  • Firms includes branch office so attributes must be extracted
  • Portfolio company descriptions - just the portco name, state, date of first inv, and the long description - has to be custom processed.

Some files required some minor post-processing to load into PostgreSQL. Issues included:

  • Firm level data didn't normalize correctly - had to adjust headers
  • Stray quotation in address line
  • Area code had a 1- in it
  • Some line counts were off by one or two

Build Specs

IPOs

Created: 6/19/2017 2:59:33 PM

Core files:

IPO1980-present.rpt                                                       
IPO1980-present.ssh                                                       
IPO1980-present.txt                                                       
IPO1980-present-normal.txt                                                

Session Contents:

Global New Issues Databases
Issue Date: 1/1/1980 to 06/19/2017 (Custom) (Calendar)
Issuer/Borrower Nation : US
Select All IPOs
Primary Exchange of Issuer's/Borrower's Stock : A, NM, N

Report Contents:

D
I
STC
AMT
RANK1_TOTDOLAMTPRO
NAICS_PRIMARY_CODE
ZIPL
TRANS_STATUS
FOUNDED

LoadIPOs.sql

DROP TABLE ipos;
CREATE TABLE ipos (
  issuedate date, --mm-dd-yy
  issuer varchar(255),
  statecode varchar(10), 
  principalamt money, --million
  proceedsamt money, --sum of all markets in million
  naiccode varchar(255), --primary NAIC code
  zipcode varchar(10),
  status varchar (20),
  foundeddate date
);

\COPY ipos FROM 'IPO1980-present-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS  CSV
--10440

M&As

Created: 6/19/2017 4:16:59 PM

Core files:

MA1980-present.rpt                                                        
MA1980-present.ssh                                                        
MA1980-present.txt                                                        
MA1980-present-normal.txt                                                 

Session Contents:

DATABASES: Domestic Mergers, 1979-Present (MA, OMA)
Date Announced: 1/1/1980 to 06/19/2017 (Custom) (Calendar)
Target Public Status : V
Percent of Shares Owned after Transaction: 100 to 100
Target Nation : US

Custom Report Contents:

TN
DE
TST
AN
ENTVAL
ASTC
TSTC
ATIC
AEXCH
TNAICP
DA
VAL
YEAREFF
TSTR
TCITY
TZIP

LoadMAs.sql

DROP TABLE mas;
CREATE TABLE mas (
  targetname varchar(255),
  effectivedate date,
  targetstate varchar(255),
  acquirorname varchar(255),
  enterpriseval varchar(255), --millions
  acquirorstatecode varchar(10),
  targetstatecode varchar(10),
  acquirorticker varchar(10),
  acquirorexchange varchar(20),
  targetnaiccode varchar(10),
  announceddate date, --mm-dd-yy
  transactionamt money, --$millions
  completedyear varchar(4), --yyyy
  targetaddress varchar(255),
  targetcity varchar(100),
  targetzipcode varchar(10)
);

\COPY mas FROM 'MA1980-present-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS  CSV
--139013

Portfolio Companies & Rounds

Core files:

USVC1980-present-normal.txt                                               
USVC1980-present.rpt                                                      
USVC1980-present.ssh                                                      
USVC1980-present.txt                                                      

Created: 6/15/2017 5:38:16 PM

Session Contents:

DATABASE: Portfolio Companies (VIPC)
Venture Related Deals: Select All Venture Related Deals
Company Nation : US
Round Date: 1/1/1980 to 06/15/2017 (Custom) (Calendar)

Custom Report Contents:

N
RD
VI_UPDATE_STAMP
VI_YEARFOUND
LAST_INV_DATE
FIRST_INV_DATE
TOTIN
VI_CITY
VI_BUSDESC_SHORT
COMPANY_MSA_DESC
COMPANY_MSA_CODE
VI_NATION
VI_STATE
VI_STREET1
VI_STREET2
COMPANY_VEIC1_DESC
COMPANY_VEIC6_DESC
COMPANY_VEIC3_DESC
WEBSITE
VI_ZIP
STAGE1_DESC
STAGE3_DESC
VI_XC_ROUND_AMOUNT
VI_XC_TOT_KNOWN_AMT
VI_FINNO
VI_INVNOS

LoadRoundBase.sql

DROP TABLE roundbase;
CREATE TABLE roundbase (
  coname varchar(255),
  rounddate date,
  updateddate date,
  foundingdate date,
  datelastinv date,
  datefirstinv date,
  investedk real,
  city varchar(100),
  description varchar(5000),
  msa varchar(100),
  msacode varchar(10),
  nationcode varchar(10),
  statecode varchar(10),
  addr1 varchar(100),
  addr2 varchar(100),
  indclass varchar(100),
  indsubgroup3 varchar(100),
  indminor varchar(100),
  url varchar(5000),
  zip varchar(10),
  stage1 varchar(100),
  stage3 varchar(100),
  rndamtdisck real,
  rndamtestk real,
  roundnum integer,
  numinvestors integer
);

\COPY roundbase FROM 'USVC1980-present-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS  CSV
--143348

Round On One Line

Core files:

USVCRound1980-present-NoHead-normal.txt                                   
USVCRound1980-present-NoHead.txt                                          
USVCRound1980-present-normal.txt                                          
USVCRound1980-present-sessiondetails.txt                                  
USVCRound1980-present.rpt                                                 
USVCRound1980-present.ssh                                                 
USVCRound1980-present.txt   

Created: 6/16/2017 4:39:47 PM

Session Contents:

DATABASE: Portfolio Companies (VIPC)
Venture Related Deals: Select All Venture Related Deals
Company Nation : US
Round Date: 1/1/1980 to 06/16/2017 (Custom) (Calendar)

Custom Report Contents:

N
VI_STATE
LAST_INV_DATE
FIRST_INV_DATE
COMBINED_ROUND_INFO

LoadRound.sql

DROP TABLE roundline;
CREATE TABLE roundline (
  coname varchar(255),
  rounddate date,
  amountk real,
  fundname varchar(255)
);

\COPY roundline FROM 'USVCRound1980-present-NoHead-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS  CSV
--385753

VC Funds

Core files:

USVCFunds1980-present.ssh                                                 
VCFund1980-present-normal.txt                                             
VCFund1980-present.txt                                                    
VCFunds1980-present.rpt                                                   

Created: 6/16/2017 3:28:23 PM

Session Contents:

DATABASE: Private Equity Funds (VIFD)
Venture Related Deals: Select All Venture Related Deals
Fund Portfolio Cos: 1/1/1980 to 06/16/2017 (Custom) (Calendar)

Custom Report Contents:

VI_DISTINCT_FUND_NAME
VI_FIRST_CLOSE_DATE
FUND_LAST_INV_DATE
FUND_FIRST_INV_DATE
FUND_NUM_COMPANIES
FUND_TOTIN
VI_DISTINCT_FUND_CITY
FUND_YEAR
VI_DISTINCT_FUND_ZIP
VI_DISTINCT_FUND_STC
VI_FUND_SIZE
VCPI_STAGE_DESC
FIRMS
INVEST_TYPE_DESC
FUND_MSA_CODE
VI_DISTINCT_FUND_NATC
FUND_RAISE_STATUS_DESC
VCPI_SEQ_NUM
VCPI_SEQ_DESC
VI_TOTAL_TARGET
FUND_TYPE_DESC_SHORT
VI_DISTINCT_FUND_ZIP

LoadFunds.sql

DROP TABLE fundbase;
CREATE TABLE fundbase (
  fundname varchar(255),
  closedate date, --mm-dd-yyyy
  lastinvdate date, --mm-dd-yyyy
  firstinvdate date, --mm-dd-yyyy
  numportcos integer,
  investedk real,
  city varchar(100),
  fundyear varchar(4), --yyyy
  zip varchar(10),
  statecode varchar(2),
  fundsizem real,
  fundstage varchar(100),
  firmname varchar(255),
  lastupdated varchar(255),
  fundnationcode varchar(10),
  invtype varchar(100),
  msacode varchar(100),
  nationcode varchar(10),
  raisestatus varchar(100),
  seqnum integer,
  targetsizem real,
  seqtype varchar(100)
);

\COPY fundbase FROM 'VCFund1980-present-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS  CSV
--27588

VC Firms

Core Files:

VCFirms1980-present.rpt                                                   
VCFirms1980-present.ssh                                                   
VCFirms1980-present.txt                                                   
VCFirms1980-present.txt.org                                               
VCFirms1980-present-normal.txt                                            

Created: 6/16/2017 2:44:42 PM

Session Contents:

DATABASE: Private Equity Firms (VIFM)
Venture Related Deals: Select All Venture Related Deals
Firm Portfolio Cos: 1/1/1980 to 06/16/2017 (Custom) (Calendar)

Custom Report Contents:

FIRM_NAME
VI_YEARFOUND
VI_STREET3
VI_NATION_LONG
VI_NATION
FIRM_LAST_INV_DATE
FIRM_FIRST_INV_DATE
FIRM_NUM_COMPANIES
FIRM_NUM_ROUNDS
FIRM_TOTIN
FIRM_AREA_CODE
VI_CAPITAL_CITY
VI_CAPITAL_NATION
VI_CAPITAL_COUNTRY
VI_CAPITAL_STATE
VI_ADR_LINE1
VI_ADR_LINE2
VI_POSTAL_CODE
FIRM_CAP_MGT
VI_CITY
FIRM_COUNTY
FIRM_STATUS_DESC
FIRM_MSA_CODE
FIRM_GEOGRAPHY_PREF_DESC
FIRM_INDUSTRY_PREF_DESC
FIRM_ROLE_DESC
FIRM_STAGE_PREF_DESC
VI_STATE
VI_STREET1
VI_STREET2
FIRM_TYPE_DESC
FIRM_WREG1_DESC
VI_ZIP

LoadFirms.sql

DROP TABLE firmbase;
CREATE TABLE firmbase (
  firmname varchar(255),
  foundingdate date, --mm-dd-yyyy
  location varchar(100),
  nation varchar(100),
  nationcode varchar(10),
  datelastinv date, --mm-dd-yyyy
  datefirstinv date, --mm-dd-yyyy
  numportcos integer,
  numrounds integer,
  investedk money,
  areacode integer,
  bocity varchar(100),
  bonation varchar(100),
  bonationcode varchar(100),
  bostatecode varchar(100),
  boaddr1 varchar(100),
  boaddr2 varchar(100),
  bozip varchar(10),
  capitalundermgmt money,
  city varchar(100),
  county varchar(100),
  invstatus varchar(100),
  msacode varchar(10),
  geogpref varchar(100),
  indpref varchar(100),
  rolepref varchar(100),
  stagepref varchar(100),
  statecode varchar(2),
  addr1 varchar(100),
  addr2 varchar(100),
  type varchar(100),
  worldregion varchar(100),
  zip varchar(10)
);

\COPY firmbase FROM 'VCFirms1980-present-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS  CSV

LoadFirms.sql

DROP TABLE firmbase;
CREATE TABLE firmbase (
  firmname varchar(255),
  foundingdate date, --mm-dd-yyyy
  location varchar(100),
  nation varchar(100),
  nationcode varchar(10),
  datelastinv date, --mm-dd-yyyy
  datefirstinv date, --mm-dd-yyyy
  numportcos integer,
  numrounds integer,
  investedk money,
  areacode integer,
  bocity varchar(100),
  bonation varchar(100),
  bonationcode varchar(100),
  bostatecode varchar(100),
  boaddr1 varchar(100),
  boaddr2 varchar(100),
  bozip varchar(10),
  capitalundermgmt money,
  city varchar(100),
  county varchar(100),
  invstatus varchar(100),
  msacode varchar(10),
  geogpref varchar(100),
  indpref varchar(100),
  rolepref varchar(100),
  stagepref varchar(100),
  statecode varchar(2),
  addr1 varchar(100),
  addr2 varchar(100),
  type varchar(100),
  worldregion varchar(100),
  zip varchar(10)
);

\COPY firmbase FROM 'VCFirms1980-present-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS  CSV