==Rebuilding the Paper==
The paper required a complete rebuild of all the results, with the data updated to the end of 2011. We can also consider several extensions to the paper, detailed in a later section.
===Main Acquisitions Data===
====Search Criteria====
Events from 1980-2011 that meet the following SDC search criteria:
*Announced: 1/1/1980 to 12/31/2011
*Target Nation: US
*Acquiror Nation: US
*Target Status: Private (V)
*Acquirer Status: Public (P)
*Percentage of Shares Owned after Transaction: 100 to 100 (will exclude those with missing data)
====SDC Variables====
The following variables were pulled:
YEARANN, YEAREFF, DA, DE, DATEANNORIG_DAYS, PCTACQ, PCTOWN, DAE, DATEEFFEXP, DUNCON, DAO, VALAMEND, VEST, STATC, VAL, ENTVAL, EQVAL, BIDCOUNT, CONSID_STRUCTURE, CONSID_STRUCT_DESC, CURRC, COUNT_CONSIDO, COUNT_CONSIDS, A_POSTMERGE_OWN_PCT, PCTOWN, PCT_STK, PCT_CASH, PCT_OTHER, PCT_UNKNOWN, AN, ANL, ANATC, ANAICP, AIN, ACU, ASTC, ASTIC, AIP, AUP, AEXCH, ACITY, AZIP, ALBOFIRM, TN, TIN, TCU, TLBOFIRM, TNL, TNATC, TNAICP, TSTC, STIC, TCITY, TZIP, IASS, COMEQ, BV, TASS, SALES, TASS, TLIA, RND, BNKRUPT, TWOSTEPSPIN, CHA, DBT_RESTRUCT, DUTCH, PRIVATIZATION, FBNK, RECAP, GOV_OWN_INVOLV_YN, JV, RESTR, LBO, LIQ, MOE, OMKT, IPO, REVERSE, RUM, SBO, SPIN, SPLIT
This provided (of particular note):
*Target Name
*Acquirer Name
*Payment Method
*Acquisition announcement date
*Acquisition announcement year
*Total assets of acquirer (if available)
*Payment method (cash/stock/mix)
*Pct of stock in the deal
*No. of bidders
*Acquirer CUSIP (for join to COMPUSTAT)
*Target NAIC
*Acquirer NAIC (if available)
*Age (of target)
*Sales (of target)
*Intangible Assets (of target)
New Flags in SDC (downloaded for exclusions):
*Bankruptcy Flag
*Failed bank Flag
*Leveraged Buyout Flag
*Reverse LBO Flag
*Spinoff Flag
*Splitoff Flag
*Target is a Leveraged Buyout Firm
*And many others. These will be reviewed and excluded.
====Processing Notes====
#The data was imported into Postgres. There were 41,572 records.
#The flag variables were reviewed for variation - some had no bite (e.g. Spinoff, TwoStepSpinOff, and Splitoff) and were ignored. Others led to data being discarded as flag exclusions.
#All variables were checked for coding, range, dispersion, etc. Those of particular note are recorded in the variable check notes below.
#Restrictions were placed on the data (Completion, flags, exclude LBOs, etc.). This reduced the data to 40,035 observations.
#Certain variables were reprocessed (see below).
#Acquiror and Target names were keyed to account for repetitions etc.
#Duplicate acquisition data (same event) was eliminated.
#Multiple acquisitions of the same target (i.e. a target is acquired, spun-off and acquired again, etc.) were eliminated.
#CUSIPs were processed into 6, 8, and 9 digit variables, by searching COMPUSTAT annual data (Jan 1978 - Jan 2012) using the 6 digit CUSIP and then finding the correct 9 digit CUSIP for a particular issue-year. Note that a 9 digit CUSIP is a 6 digit Issuer Number, a 2 digit Issue Number, and a check digit (see the sketch below). There were 27,401 acquisitions with 7,348 valid CUSIPs.
#CRSP data was retrieved and processed (see below). After processing we had data for 23,802 observations.
#COMPUSTAT data was retrieved and processed (see below).
#VC PortCo data was retrieved and processed. PortCos were flagged and portco data added for appropriate observations.
#LBO data was retrieved and processed. 164 observations were discarded.
#Acquisition Histories were calculated as the number of past acquisitions for each acquirer: Total, VC only, Non-VC only.
#Accounting vars were converted to 2011 real values using the official BEA implicit GDP price deflator index: http://www.bea.gov/national/nipaweb/TableView.asp?SelectedTable=13&ViewSeries=NO&Java=no&Request3Place=N&3Place=N&FromView=YES&Freq=Year&FirstYear=1978&LastYear=2010&3Place=N&Update=Update&JavaBox=no
#Percentage variables were multiplied by 100 to get nice coefficients.
#Every observation was assigned a unique observation number (obsno).
#Compound variables such as ''horiz'', ''vert'', and ''cong'' were calculated.
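For reference, the check-digit construction behind a 9 digit CUSIP can be sketched in a few lines of Python. This is illustrative only - the 9 digit CUSIPs themselves were taken from COMPUSTAT as described above, not computed:
<pre>
def cusip9(cusip8: str) -> str:
    """Append the standard CUSIP check digit to an 8-character issuer+issue code."""
    total = 0
    for i, ch in enumerate(cusip8.upper()):
        if ch.isdigit():
            v = int(ch)
        elif ch.isalpha():
            v = ord(ch) - ord('A') + 10
        elif ch == '*':
            v = 36
        elif ch == '@':
            v = 37
        elif ch == '#':
            v = 38
        else:
            raise ValueError("Invalid CUSIP character: %s" % ch)
        if i % 2 == 1:          # double every second character
            v *= 2
        total += v // 10 + v % 10
    check = (10 - total % 10) % 10
    return cusip8.upper() + str(check)

print(cusip9("03783310"))   # -> '037833100'
</pre>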
Variable check notes:
*1,514 had estimated announce dates. These were flagged.
*443 had their transaction value amended. These were flagged.
*41,473 had a deal code of 'C' for completed. These were kept.
*The number of bidders was always disclosed; it was 2 in 54 cases and 3 in 2 cases.
*The number of considerations offered and sought varied from 1 to 8.
*State codes were USPS official standards: https://www.usps.com/send/official-abbreviations.htm
*There was data from 35 stock exchanges. 32,177 observations recorded Amex, Nasdaq or NYSE.
*25 acquirors were LBO firms and 3 targets were LBOs. These were excluded.
*All acquirors and targets had 6 digit NAIC codes, though some were truncated (e.g. 517000) and others invalid. COMPUSTAT NAIC codes were used when SDC NAIC codes failed, where these were recorded in WRDS.
Flag Exclusions:
*Cases where the target was bankrupt or distressed as indicated by: TargetBankrupt, TargetBankInsolvent, Liquidation, Restructuring.
*Cases where the firm wasn't genuinely privately-held as indicated by: OpenMarketPurchases, GovOwnedInvolvement, JointVenture, Privatization (which capture government sales).
*Cases where there was a share recap going on concurrently with the acquisition: Recap.
*Targets that had LBO involvement (more will likely be removed in the next phase of matching to LBO targets): LBO, SecondaryBuyoutFlag, ReverseTakeOver (used for LBO'd firms doing a reverse take over), IPOFlag (likewise).
*Firms where the deal began as a rumor (so the information leakage is problematic): DealBeganAsRumor.
Processing of variables:
*The original announce date was determined as min{announcedate, announcedateorg}. Those where announcedate != announcedateorg were flagged (see the sketch below).
*Percentage stock, cash, other and unknown were reprocessed to include data from the ConsidStruct field, which tags Stock Only, Cash Only, etc.
*State codes were reprocessed to numerics using the lookup table below.
*IT, BT (Biotech), HT (Hightech) and NAIC1, NAIC2, NAIC3, Indu1, Indu2, Indu3 variables were created using the lookup tables below (see the variable descriptions for more info).
*Note that the IT, BT and HT variables were coded using aggregate codes wherever possible (i.e. 517110, etc., all appear in IT and cover the 517 code entirely, so the 517 block would be coded as IT even if SDC recorded the code as 517000).
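A minimal pandas sketch of the first two reprocessing steps above (the column names announcedate, announcedateorg, consid_struct, pct_stk and pct_cash are stand-ins for the actual field names in the database):
<pre>
import pandas as pd

def reprocess(df: pd.DataFrame) -> pd.DataFrame:
    # Original announce date = earliest of the announced and originally-announced dates
    df["announce"] = df[["announcedate", "announcedateorg"]].min(axis=1)
    df["flag_datediff"] = (df["announcedate"] != df["announcedateorg"]).astype(int)

    # Backfill payment percentages from the consideration-structure tag when missing
    df.loc[df["consid_struct"].eq("Stock Only") & df["pct_stk"].isna(), "pct_stk"] = 100.0
    df.loc[df["consid_struct"].eq("Cash Only") & df["pct_cash"].isna(), "pct_cash"] = 100.0
    return df
</pre>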
Other Notes:
*Founding year/Age of the Target was not available in the data. It is in VE for VC-backed firms only.
*The street address is multiline and problematic if included. This can be drawn separately if needed. We have the City, Zip and State, which is sufficient to get a Google Maps lookup. Likewise 'Competing Offer Flag (Y/N)', also known as COMPETE and Competing Bidder, is a multiline - with each line presumably corresponding to a different bidder identity. It was excluded.
*The NormalizeFixedWidth.pl script uses the spacing in the header to determine the column breaks. The EquityValue column has two spaces in front of its name that breaks this. Both EquityValue and EnterpriseValue needed to be imported as varchar(10), as they have the code 'np' in some observations.
*The NormalizeFixedWidth.pl script was modified so that it only drops commas in numbers and not those in names etc.
For robustness we need ARs for 5, 9, and 11 days.
===CRSP Data===
Daily return data was downloaded using 8 digit CUSIPs from CRSP. The following variables were retrieved from 1/1/1978-1/1/2012 (the latest month available):
*Cusip
*Date
*prc
*ret
*vwretd
The data was processed as follows:
#Announcedays were coded to the current or next following trading day.
#Trading days were indexed from the announcement day (day 0) for all announcement-cusip pairs.
#A refined estimation set beginning 280 and ending 30 days before the acquisition was extracted for each announceday-cusip pair.
#Cusips with multiple announcements on the same day had these announcements flagged, and a unique announceday-cusip pair index (acqno) was created.
#An announceday-cusip pair observation was included in an estimation regression provided that there were 50 continuous trading days ending at day -30.
*The parameters, errors and statistics, particularly <math>\hat{\alpha_i},\hat{\beta_i}</math>, were estimated for each announceday-cusip pair from the following regression (see the sketch below): <math>R_i = \hat{\alpha_i} + \hat{\beta_i}R_m + \epsilon</math>
*Days from -5 to +5, to allow for an 11 day window, were extracted into an event window and processed to produce:
**<math> AR_i = R_i - (\hat{\alpha_i} + \hat{\beta_i}R_m) </math>
**<math> AR^S_i = R_i - R_m </math>
**Let <math>\epsilon</math> be the residual from the mkt model regression. Then calc: <math>\sigma_{\epsilon}={\left( \mathbb{E}\left[(\epsilon - \mathbb{E} \epsilon)^2\right]\right)}^{\frac{1}{2}}</math>
**RMSE of the Mkt Model: <math>RMSE={\left( \mathbb{E}\left[(X- \mathbb{E} X)^2\right]\right)}^{\frac{1}{2}}</math> - this is in the ereturn list in STATA and will be used for the Patell Standard Errors.
*Then other variables were calculated or included:
**The cumulative return <math>CAR_i = \sum_t AR_{i,t}</math>
**The price 30 days before the acquisition was recorded for the market value calculation.
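For reference, a numpy sketch of the market-model estimation and the event-window calculations above (the production runs used STATA over the Postgres data; the column names here are illustrative):
<pre>
import numpy as np
import pandas as pd

def event_study(est: pd.DataFrame, evt: pd.DataFrame):
    """est: estimation-window rows (50+ trading days ending at day -30) with columns ret and vwretd;
    evt: event-window rows (days -5..+5) with the same columns."""
    # Market model: R_i = alpha + beta * R_m + eps, estimated by OLS
    X = np.column_stack([np.ones(len(est)), est["vwretd"].to_numpy()])
    y = est["ret"].to_numpy()
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    alpha, beta = coef
    resid = y - X @ coef
    rmse = np.sqrt(np.mean(resid ** 2))

    # Abnormal returns over the event window
    ar = evt["ret"] - (alpha + beta * evt["vwretd"])   # market-model AR
    ar_s = evt["ret"] - evt["vwretd"]                  # market-adjusted AR
    car = ar.sum()                                     # CAR over the window
    return alpha, beta, rmse, car
</pre>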
Coming back to it, the paper looks a little thin (though clearly the data is a monster already). I think it would benefit from a couple of extensions, particularly the inclusion of something that resembles an instrument. I have the following ideas, which might be feasible in the time we have:===COMPUSTAT Data===
From COMPUSTAT we drew accounting variables for all of our Cusips, then extracted data for the announcement years and the lagged announcement years. (Note that Cusip, NAIC, datayear, fiscal year and fiscal year end were included in the download. NAIC was used to supplement SDC NAICs.)
Data included:
*Total Assets
*Market Value
*Sales
*Total Liabilities
*Intangible Assets
*Shares Outstanding
Note that leverage was calculated as: <math>Leverage=\frac{Total\;Liabilities}{Total\;Assets}</math>
Variables were translated to 2011 dollars and marked ''varname11''; lagged (minus one year) variables were recorded as ''varname_m1''. In STATA, log variables were created as ''varnamel''.
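A sketch of these naming conventions in pandas (the deflator series and the cusip/year column names are assumptions; the actual deflation used the BEA index linked above and the log variables were built in STATA):
<pre>
import numpy as np
import pandas as pd

def build_vars(df: pd.DataFrame, deflator: pd.Series, cols: list) -> pd.DataFrame:
    """deflator: implicit GDP price deflator indexed by year.
    df is assumed to be sorted by cusip and year, one row per cusip-year."""
    factor = deflator.loc[2011] / df["year"].map(deflator)
    for c in cols:
        df[c + "11"] = df[c] * factor                              # 2011 real values
        df[c + "_m1"] = df.groupby("cusip")[c + "11"].shift(1)     # lagged (minus one year)
        df[c + "l"] = np.log(1 + df[c + "11"])                     # log(1+var), as in the analysis notes
    return df
</pre>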
===The VC PortCos===
The following criteria were applied to the SDC search:
*Moneytree deals (i.e. VC only)
*Company Nation: US
*Round date: 1/1/1975 to 1/1/2012
A basic variable set was downloaded including:
*PortCo Name
*Nation
*State
*Location
*Address
*Total VC Invested
*Date of First Inv
*Date of Last Inv
*Date of Founding
*No Rounds
Check flags:
*Moneytree
*Venture Related
The data was reprocessed, specifically:
*Unique PortCos were determined using Name, State and Location data.
*Duplicate records were eliminated.
*Discontinuous (multiple) records pertaining to the financing history of a single firm were assembled into single records.
*PortCos were matched to Acquisition Targets using name based matching, checking state and location information (see the sketch below).
*In a small number of cases VC appears to continue after the acquisition. This is almost surely an error in VE, but these observations are flagged.
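The name-based matching can be illustrated with a crude pandas sketch (the normalization rules and column names here are illustrative; the actual matches were checked against state and location information and validated by hand):
<pre>
import re
import pandas as pd

def normalize_name(name: str) -> str:
    """Upper-case, strip punctuation and common corporate suffixes."""
    name = re.sub(r"[^A-Z0-9 ]", " ", name.upper())
    name = re.sub(r"\b(INC|INCORPORATED|CORP|CO|LLC|LTD)\b", "", name)
    return re.sub(r"\s+", " ", name).strip()

def match_portcos(targets: pd.DataFrame, portcos: pd.DataFrame) -> pd.DataFrame:
    targets = targets.assign(key=targets["target_name"].map(normalize_name))
    portcos = portcos.assign(key=portcos["portco_name"].map(normalize_name))
    # Require the state to agree as well; 'both' in vc_match indicates a VC-backed target
    return targets.merge(portcos, on=["key", "state"], how="left", indicator="vc_match")
</pre>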
Note: The coverage of VE before 1980 is problematic, so we will discard acquisition records before 1985 in STATA before the analysis.
===Removing additional LBOs===
A set of LBO portcos was downloaded from SDC using the flags LBO=yes, PWCMoneytree=No, StdUSVentureDisbursement=No. The LBOs were matched against the acquisition targets and removed (LBO initial investment dates were checked).
===Processing NAIC Codes===
While SDC provides 6 digit NAIC codes for all acquirers, some of these NAIC codes are invalid (proprietary to SDC). These were replaced with COMPUSTAT NAIC codes whenever available. The invalid SDC NAIC codes found were:
  SDCnaic | SDCindustry
 ---------+------------------------------------------------
  BBBBBA  | Miscellaneous Retail Trade
  BBBBBA  | Business Services
  BBBBBA  | Advertising Services
  BBBBBA  | Prepackaged Software
  BBBBBB  | Business Services
  BCCCCA  | Investment & Commodity Firms,Dealers,Exchanges
  BCCCCD  | Investment & Commodity Firms,Dealers,Exchanges
  BCCCCD  | Business Services
  BCCCCD  | Social Services
  BCCCCE  | Investment & Commodity Firms,Dealers,Exchanges
Details of the IT, BT, and HT codes are below.
An acquisition was classified as:
*''Horiz'' if ''acquirornaic6''=''targetnaic6''
*''Vert'' if ''acquirornaic5''=''targetnaic5'' AND ''acquirornaic6''!=''targetnaic6''
*''Cong'' if !''Vert'' AND !''Horiz''
*''Related'' if ''Vert'' OR ''Horiz''
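A sketch of this coding in Python, under the assumption that ''Horiz'' means identical 6 digit codes and ''Vert'' means the same 5 digit prefix with a different 6 digit code (the production coding also used the IT/BT/HT aggregate blocks described below):
<pre>
def classify(acquirornaic6: str, targetnaic6: str) -> str:
    """Return 'horiz', 'vert' or 'cong' for an acquirer/target NAIC pair."""
    if acquirornaic6 == targetnaic6:
        return "horiz"
    if acquirornaic6[:5] == targetnaic6[:5]:
        return "vert"
    return "cong"

def related(code: str) -> bool:
    """'Related' is defined as Horiz or Vert."""
    return code in ("horiz", "vert")
</pre>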
===Patent Data===
NBER patent data with assignee names from 1975-2006 was used to add patent counts to the data. Only patents filed before the announcement date were included. Assignee names were matched to target names by name matching software, with matches validated by hand. A patent count and a 'has patents' variable (''patents'') were generated, and a flag was added to record targets that have their acquisition announcement before 2006. Targets acquired after 2006 have their patent applications up to and including 2006 recorded, though their true counts are right-truncated. Likewise, a target may have existed and made patent applications prior to 1975, resulting in left-truncation. Therefore year fixed-effects are warranted.
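A pandas sketch of the patent count construction (the target_key, appdate and announce column names are illustrative; the assignee-to-target matching itself was done with name matching software and checked by hand):
<pre>
import pandas as pd

def patent_counts(targets: pd.DataFrame, patents: pd.DataFrame) -> pd.DataFrame:
    """Count patents applied for before each target's announcement date."""
    merged = targets.merge(patents, on="target_key", how="left")
    before = merged[merged["appdate"] < merged["announce"]]
    counts = before.groupby("target_key").size().rename("patents")

    out = targets.join(counts, on="target_key")
    out["patents"] = out["patents"].fillna(0).astype(int)
    out["haspatents"] = (out["patents"] > 0).astype(int)
    # Right-truncation flag: the NBER data ends in 2006
    out["patentdata"] = (out["announce"].dt.year <= 2006).astype(int)
    return out
</pre>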
====Patents and Information Asymmetries====
Patents might act to certify their patent-holders in the face of information asymmetries (see, for example, Hsu and Ziedonis, 2007). Thus acquirers of targets with patents might value the certification of a venture capitalist less than when they consider targets without patents. Likewise, on average about 2/3rds of all patent citations are added by examiners (Alcacer and Gittelman, 2006; Cotropia et al., 2010). Thus citation counts might represent the search costs associated with finding information about patents. That is, patents with more citations are the ones that are easiest to find, and so mitigate information asymmetries the most successfully.
Note: I am working on the 2011 update to the NBER patent data (see: http://www.nber.com/~edegan/w/index.php) but this will NOT be done before the March 7th deadline.
===Analysis Calculations and Notes===
The following is performed on the dataset before analysis:
*Observations were dropped if yearann<1985, to give 5 years of VC data before the announcement.
*asize = market value + lagged liabilities
*rsize = tv/asize
*Log variables were calculated as log(1+var).
*The following aliases were created:
**tit -> it
**tbt -> bt
**tht -> ht (?)
**yearann -> year
*Interaction effect variables were created.
*Year x anaic2 (2 digit acquiror naic) fixed effect indicators were created.
*CARM variables (Market Model CAR) were created for the 7 day window for the figures.
*Vscore variables were created for the significance tests on CARs using the RMSE from the estimation window (see the sketch below): <math>vscore =abs\left(\frac{carm}{\left(\frac{rmse}{\sqrt{n}}\right)}\right)</math>. This was done on a per group basis, using the variable names xgroupvar, where group=it or itvc or null.
*Year range variables from 1 to 6 were created for years 1985-1989, ... , 2005-2009, 2010-2011.
*In the regression analysis we clustered standard errors on acqno (the cusip-announceday pair that could have multiple acquisitions, marked with ''sameday''=1), using STATA's vce(cluster ''clustvar'') documented here: http://www.stata.com/support/faqs/stat/robust_ref.html
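For reference, the vscore calculation as a short Python function (n here is the relevant observation count for the group, per the per-group construction above):
<pre>
import numpy as np

def vscore(carm: float, rmse: float, n: int) -> float:
    """vscore = |carm / (rmse / sqrt(n))|, as defined in the analysis notes."""
    return abs(carm / (rmse / np.sqrt(n)))
</pre>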
Notes:
#The experience variables (# Previous Acqs) are generated using the primary data, and will be truncated by the start of the dataset. We should probably consider year fixed effects to mitigate any induced bias.
#In the previous version of the paper we threw out cases when the mkt value of the acquirer was 'very small' relative to TV.
#Boom is defined as: <math>1990\le year \le 1999</math>
#The Boehmer standard errors are the cross-sectional ones generated by OLS. Clustering them isn't part of the specification, but clearly should be done.
==Supplementary Data==
To determine the information asymmetry ranking of sectors again we will need (either for 1 year or across the entire year range 1985-2011):
CRSP:
*Idiosyncratic volatility of stock returns: requires returns and mkt returns.
*Relative trading volume (this appears to be called TURNOVER, as opposed to absolute volume which is VOLUME; the measure should be relative to the exchange's trading volume).
*NAIC
COMPUSTAT:
*Intangible assets
*Total assets
*Tobin's Q: Market value/book value of assets
*NAIC
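A numpy sketch of the idiosyncratic volatility measure above, taken as the standard deviation of market-model residuals over whatever window is chosen (illustrative only):
<pre>
import numpy as np

def idiosyncratic_vol(ret: np.ndarray, mkt: np.ndarray) -> float:
    """Standard deviation of the residuals from a market-model OLS regression."""
    X = np.column_stack([np.ones(len(mkt)), mkt])
    coef, *_ = np.linalg.lstsq(X, ret, rcond=None)
    resid = ret - X @ coef
    return float(np.std(resid, ddof=2))   # ddof=2 for intercept + slope
</pre>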
==Variables==
The following is a quick description of the variables in the Version 3 dataset, in order.
===Acquisition Specific Variables===
*patentdata: Takes the value 1 if the announcement year is equal to or less than 2006, so the firm can have all of its patents recorded (from 1975 forward), and 0 if the patent data will be inherently truncated.
==State Codes==

We use the US Postal Service (USPS) Official State Codes, found at: https://www.usps.com/send/official-abbreviations.htm

 OfficialCode NumericCode State
 AK            1  ALASKA
 AL            2  ALABAMA
 AR            3  ARKANSAS
 AS            4  AMERICAN SAMOA
 AZ            5  ARIZONA
 CA            6  CALIFORNIA
 CO            7  COLORADO
 CT            8  CONNECTICUT
 DC            9  DISTRICT OF COLUMBIA
 DE           10  DELAWARE
 FL           11  FLORIDA
 FM           12  FEDERATED STATES OF MICRONESIA
 GA           13  GEORGIA
 GU           14  GUAM
 HI           15  HAWAII
 IA           16  IOWA
 ID           17  IDAHO
 IL           18  ILLINOIS
 IN           19  INDIANA
 KS           20  KANSAS
 KY           21  KENTUCKY
 LA           22  LOUISIANA
 MA           23  MASSACHUSETTS
 MD           24  MARYLAND
 ME           25  MAINE
 MH           26  MARSHALL ISLANDS
 MI           27  MICHIGAN
 MN           28  MINNESOTA
 MO           29  MISSOURI
 MP           30  NORTHERN MARIANA ISLANDS
 MS           31  MISSISSIPPI
 MT           32  MONTANA
 NC           33  NORTH CAROLINA
 ND           34  NORTH DAKOTA
 NE           35  NEBRASKA
 NH           36  NEW HAMPSHIRE
 NJ           37  NEW JERSEY
 NM           38  NEW MEXICO
 NV           39  NEVADA
 NY           40  NEW YORK
 OH           41  OHIO
 OK           42  OKLAHOMA
 OR           43  OREGON
 PA           44  PENNSYLVANIA
 PR           45  PUERTO RICO
 PW           46  PALAU
 RI           47  RHODE ISLAND
 SC           48  SOUTH CAROLINA
 SD           49  SOUTH DAKOTA
 TN           50  TENNESSEE
 TX           51  TEXAS
 UT           52  UTAH
 VA           53  VIRGINIA
 VI           54  VIRGIN ISLANDS
 VT           55  VERMONT
 WA           56  WASHINGTON
 WI           57  WISCONSIN
 WV           58  WEST VIRGINIA
 WY           59  WYOMING
              99  UNKNOWN
==Classification of IT, BT and HT==
===Information and Communications Technology (IT)===
6215 both 621512 Diagnostic Imaging Centers
===Other High Tech (HT)===
The following is our definition of other (i.e. Not IT/BT) High-tech:
211 both 211111 211111 Crude Petroleum and Natural Gas Extraction
811211 both 811211 811211 Consumer Electronics Repair and Maintenance
811219 both 811219 811219 Other Electronic and Precision Equipment Repair and Maintenance
 
===Other HT Definitions===
 
The references for the other High-Tech (HT) definitions are:
*Hecker, Daniel E.(2005), "High-technology employment: a NAICS-based update", Monthly Labor Review (July): 57-72. http://www.bls.gov/opub/mlr/2005/07/art6full.pdf
*Paytas, Jerry and Berglund, Dan (2004), "Technology Industries and Occupations for NAICS Industry Data", Carnegie Mellon University, Center for Economic Development and State Science & Technology Institute.
 
==Extending the paper==
 
Once this 'draft' is complete we can consider some extensions. I am currently working on the VC Reputations data.
 
===VC Reputations===
 
VCs might use their reputations to certify their firms, or these variables might reflect VC experience (and potentially bargaining skill). We can calculate:
*Avg or max number of previous acquisitions and/or IPOs conducted by VCs present in the last round of investment into the firm prior to the acquisition announcement.
*The number of previous acquisitions and/or IPOs conducted by the lead VC (Note: The defacto standard method of determining the lead investor is to see which (if any) investor was present from the first round in every round until the last.)
*Likewise for the average or dollar-invested weighted average of all investors in the port co.
*Last round, lead investor, or average number of previous funds raised by investors, or their fund size or total cumulative firm size (i.e. summed across all funds) at the announcement.
*Whether the VCs will raise a next fund (though this could actually be endogenous with the CAR)
 
===Outside Options===
 
Outside options affect bargaining. A VC that is near to the end of its fund when it made its last investment into the portfolio company (either in terms of dates or dollars), and particularly one that won't raise a next fund, will be unable to continue financing the portco without the acquisition and therefore has no good outside option with which to bargain. We could calculate how near last round investors are to the ends of their funds (and whether they are going to raise another) and take averages etc, to proxy for the outside option.
 
===Bargaining Superstars===
 
It might be the case that some VCs specialize in providing bargaining skills. We could test this hypothesis by:
*Creating fixed-effect variables for the presence of each repeat VC in a portfolio company
*Regressing these fixed-effects on the CARs and sorting the coefficient into quartiles/deciles etc.
*Testing the hypothesis that firms in the top decile are more likely than expected to appear in a last round of financing.
 
===VC Information Asymmetries===
 
Implicit in our argument is that VCs mitigate the information asymmetries between themselves and their portfolio firms effectively. We can refine this argument to consider the degree to which a VC is likely to be informed about their portfolio firm.
 
====Distances====
 
We can use the road or great-circle distance from the lead investor to the portfolio company as a measure of the information acquisition cost. We could also create a cruder but likely more meaningful version of this by creating a binary variable to see whether the lead investor was within a 20-minute drive of the portfolio company (this is the so called '20 minute rule' - discussed as important for monitoring in Tian, 2006). Alternatively we could consider the nearest investor, or the average of the nearest investors across all rounds, etc.
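For the great-circle version, a short Python sketch (coordinates would come from geocoding the City/State/Zip fields; driving distances and times would come from the Google Maps API mentioned below):
<pre>
from math import radians, sin, cos, asin, sqrt

def great_circle_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine great-circle distance in miles between a VC office and a portco."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 3958.8 * asin(sqrt(a))   # mean Earth radius ~ 3958.8 miles
</pre>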
 
I can get 2,500 requests per IP address (I can run 3+ concurrently from Berkeley) from the Google Maps api, with responses including driving distances and estimated driving times.
 
====Active Monitoring====
 
I can also determine whether the lead VC has a board seat at the portfolio company at the time of the acquisition, as well as the fraction of invested firms with board seats, and the total number of board seats held by VCs (or the fraction), using the identities of the executives. Though this will be particularly difficult in terms of data, I plan on doing it for another project with Toby Stuart anyway.