VC Acquisitions Paper

From edegan.com
Revision as of 17:45, 27 February 2012 by imported>Ed

This page details the work rebuilding Brander and Egan (2007) - The Role of VCs in Acquisitions for our submission to the RCFS special issue and associated conference.

Connect to the database with:

psql -h 128.32.252.201 -U ed_egan Acqs

Submission Details

The Third Entrepreneurial Finance and Innovation Conference on June 10th-11th in Boston, MA, is supported by the Kauffman Foundation and the Society for Financial Studies. Conference papers will be considered for inclusion in a special issue of the Review of Corporate Finance Studies.

The conference details are here: http://sites.kauffman.org/efic/overview.cfm

The deadline for submission is March 7th, 2012, though earlier submission is encouraged. Authors will be notified if their paper has been selected by the end of April.

The program committee includes: Thomas Hellmann, Adam Jaffe, Bill Kerr, Josh Lerner, David Robinson, Morten Sorenson, Bob Strom, and others.

Errors in the existing version

The Dierkens (1991) reference is missing:

@article{dierkens1991information,
  title={Information asymmetry and equity issues},
  author={Dierkens, N.},
  journal={Journal of Financial and Quantitative Analysis},
  volume={26},
  number={2},
  pages={181--199},
  year={1991},
  publisher={Cambridge Univ Press}
}

The Boehmer reference has a typo - the second author is Musumeci. Also, in para 2, p.19, I think it was MacKinlay that "suggest[ed] a method that combines both cross-sectional and time-series information..."

Other points:

  • There were a few other typos.
  • Also the GX paper gets very little mention - I thought we had a whole subsection devoted to them...
  • I was surprised that we didn't have a year fixed-effect variable in the main analysis (though we have Boom, which is more interesting)

Rebuilding the Paper

The paper requires a complete rebuild of all the results, with the data updated to the end of 2011. We should also consider several extensions to the paper, detailed in a later section.

Main Data

Acquisitions (from SDC):

  • Events from 1980-2011 that meet the following criteria:
    • Acquirer is publicly traded on the AMEX, Nasdaq or NYSE
    • Target is privately-held prior to acquisition (note: new restriction - target was not an LBO)
    • Acquisition is for 100% of the firm
    • Acquisition is complete before the end of January 2012

Subsequent restriction: Drop acquisitions where the market value of assets is negative or very small compared with the transaction value (TV).

Venture Capital (from VentureXpert):

  • Portfolio companies that received VC from 1975-2011. Must not be LBOs.
  • LBOs from 1975 to 2011 (to ensure that they are not in the control group of privately-held, non-VC-backed firms)

Returns (from CRSP):

  • Stock returns for 1 year (250 trading days) for the acquirer, ending 30 days before the announcement. This will be the estimation window.
  • Market returns for the same period
  • Stock returns for 7 days beginning 3 days before the announcement and ending 3 days after

Note: an observation must have 50 days of continuous trading in the estimation window, and be traded in the event window, to be included.
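The inclusion rule above can be sketched as a simple filter. This is an illustrative reading only, not the actual build code: it interprets "continuous trading" as a run of at least 50 consecutive non-missing daily returns, with `None` marking a non-trading day.

```python
def eligible(estimation_rets, event_rets, min_run=50):
    """Sample-inclusion check: the stock must have at least `min_run`
    consecutive non-missing daily returns in the estimation window,
    and no missing returns in the event window.
    `None` marks a day with no trading (an assumed encoding)."""
    run = best = 0
    for r in estimation_rets:
        run = run + 1 if r is not None else 0
        best = max(best, run)
    traded_in_event = all(r is not None for r in event_rets)
    return best >= min_run and traded_in_event
```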

Accounting Data (From COMPUSTAT):

  • Various accounting variables for our acquirers, drawn for the year of the acquisition, and the lagged year for total assets.

Supplementary Data

We need to rebuild the industry classification to update it to include NAICS2007 - this has largely been done in another of my papers, but that work was for firms with patents, and it is possible that some codes are still missing.

To determine the information asymmetry ranking of sectors we will need (either for 1 year or across the entire year range):

CRSP:

  • idiosyncratic volatility of stock returns: requires returns and mkt returns
  • relative trading volume (this appears to be called TURNOVER, as opposed to absolute volume which is VOLUME. The measure is relative to the exchange's trading volume I think...)
  • NAIC

COMPUSTAT:

  • intangible assets
  • total assets
  • Tobin's Q: Market value/book value of assets
  • NAIC

Raw Variables

From SDC (for all acquisitions in the sample):

  • Acquisition is completed indicator (as a check)
  • Acquisition percentage (as a check)
  • Target Name
  • Acquirer Name
  • Transaction Value
  • Payment Method
  • Acquisition announcement date
  • Acquisition announcement year
  • Total assets of acquirer (if available)
  • Payment method (cash/stock/mix)
  • PC of stock in the deal
  • No. of bidders
  • Acquirer CUSIP (for join to COMPUSTAT)
  • Target NAIC
  • Acquirer NAIC (if available)
  • Age (of target)
  • Sales (of target)
  • Leverage (of target)
  • Intangible Assets (of target)

Notes: convert all TVs to 2011 dollars.
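A minimal sketch of the TV deflation, assuming an annual CPI-U style index keyed by year (the index values below are approximate and illustrative; the real series would come from the BLS):

```python
# Approximate annual CPI-U averages (1982-84 = 100); illustrative only.
CPI = {1980: 82.4, 1990: 130.7, 2000: 172.2, 2011: 224.9}

def to_2011_dollars(tv, year, cpi=CPI):
    """Deflate a transaction value from its announcement year
    into 2011 dollars using an annual price index."""
    return tv * cpi[2011] / cpi[year]
```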

From COMPUSTAT (for both all acquirers and for the universe of firms):

  • Total Assets (in year and 1 year lagged)
  • Market Value (SHROUT*Price at start of event window)
  • Sales
  • Leverage variables (Revenue, Variable Cost, Op Income, Net Income, Total Liabilities, Stockholder's equity)
  • Intangible Assets
  • NAIC

From VentureXpert (all VC-backed firms, and all LBOs separately):

  • VC (binary variable)
  • VE industry classification (to use as a reference set to update the industry classification)
  • LBO (binary variable) to exclude these from the control group
  • If we add one or more extensions (see below), then we'll need a fully fleshed-out VE database build including portcos, rounds, deals, funds, firms, and possibly executives.

Calculated variables

Returns:

  • [math] AR_i = R_i - (\hat{\alpha_i} + \hat{\beta_i}R_m) [/math]
  • [math] AR^S_i = R_i - R_m [/math]
  • Let [math]\epsilon[/math] be the residual from the mkt model regression. Then calc: [math]\sigma_{\epsilon}={( \mathbb{E}(\epsilon - \mathbb{E} \epsilon)^2)}^{\frac{1}{2}}[/math]
  • RMSE of the Mkt Model: [math]RMSE={( \mathbb{E}(X- \mathbb{E} X)^2)}^{\frac{1}{2}}[/math] - this is in the ereturn list in Stata and will be used for the Patell Standard Errors.
  • The cumulative return [math]CAR_i = \sum_t AR_{it}[/math]
  • Check that the Boehmer standard errors are the cross-sectional ones generated by OLS.
  • Check the specification of the MacKinlay standard errors.
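The market-model estimation and CAR calculation above can be sketched in plain Python. This is an illustrative version, not the actual build code; note it divides the residual sum of squares by n - 2, matching the OLS root-mean-squared error that Stata stores as e(rmse).

```python
def market_model(stock, market):
    """OLS alpha and beta over the estimation window, plus the
    residual standard error (the RMSE used for Patell SEs)."""
    n = len(stock)
    mr = sum(market) / n
    sr = sum(stock) / n
    beta = sum((m - mr) * (s - sr) for m, s in zip(market, stock)) \
         / sum((m - mr) ** 2 for m in market)
    alpha = sr - beta * mr
    resid = [s - (alpha + beta * m) for s, m in zip(stock, market)]
    rmse = (sum(e ** 2 for e in resid) / (n - 2)) ** 0.5
    return alpha, beta, rmse

def car(stock_event, market_event, alpha, beta):
    """Cumulative abnormal return over the event window:
    CAR_i = sum_t [ R_it - (alpha_i + beta_i * R_mt) ]."""
    return sum(s - (alpha + beta * m)
               for s, m in zip(stock_event, market_event))
```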

SDC:

  • No of past acquisitions for each acquirer: Total, VC only, Non-VC only
  • Target is VC/Non-VC
  • Acq is Horizontal (same 6 digit), Vertical (same 2 digit/ITBT), Conglomerate (other), and Related (not cong.)
  • 3dg NAIC for controls
  • IT/BT/HT and 1dg-NAIC, 2dg-NAIC, other classification. Applied to targets and acquirers.

Dataset level calculations:

  • Boom: [math]1990\le year \le 1999[/math]
  • Leverage:
    • Financial leverage is [math]\frac{Op.\;Income}{Net\;Income}[/math]
    • Operating leverage is [math]\frac{Revenue - Variable\;Cost}{Op.\;Income}[/math]
    • I think we used: [math]\frac{Total\;Liabilities}{Equity}[/math]
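The dataset-level calculations above translate directly into simple helpers; an illustrative sketch of the Boom indicator and the three leverage definitions:

```python
def boom(year):
    """Boom indicator: 1 if 1990 <= year <= 1999, else 0."""
    return int(1990 <= year <= 1999)

def financial_leverage(op_income, net_income):
    """Financial leverage: Op. Income / Net Income."""
    return op_income / net_income

def operating_leverage(revenue, variable_cost, op_income):
    """Operating leverage: (Revenue - Variable Cost) / Op. Income."""
    return (revenue - variable_cost) / op_income

def book_leverage(total_liabilities, equity):
    """Book leverage: Total Liabilities / Stockholders' Equity."""
    return total_liabilities / equity
```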

Extending the paper

Coming back to it, the paper looks a little thin (though clearly the data is a monster already). I think it would benefit from a couple of extensions, particularly the inclusion of something that resembles an instrument. I have the following ideas, which might be feasible in the time we have:

(Note: The de facto standard method of determining the lead investor is to see which (if any) investor was present from the first round.)
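One reading of that method, sketched in Python. Here a "lead" candidate is an investor who participated in the first round and in every subsequent round; that interpretation is an assumption, and the input shape (a list of sets of investor names, ordered by round date) is illustrative.

```python
def lead_investor_candidates(rounds):
    """Lead-investor candidates: investors present in the first
    round and in every round thereafter.
    `rounds` is a date-ordered list of sets of investor names."""
    if not rounds:
        return set()
    leads = set(rounds[0])
    for rnd in rounds[1:]:
        leads &= set(rnd)
    return leads
```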

Using Patents

Patents might act to certify their patent-holders in the face of information asymmetries (see, for example, Hsu and Ziedonis, 2007). Thus acquirers of targets with patents might value the certification of a venture capitalist less than they would when considering targets without patents. Likewise, on average about two-thirds of all patent citations are added by examiners (Alcacer and Gittelman, 2006 and Cotropia et al., 2010). Thus citation counts might represent the search costs associated with finding information about patents. That is, patents with more citations are the ones that are easiest to find, and so mitigate information asymmetries the most successfully.

At present I have the 2006 NBER patent data loaded up in a database. I could add in patents and citations up to 2006 with a day or two of work. I am working on the 2011 update to the NBER patent data (see: http://www.nber.com/~edegan/w/index.php) but this will NOT be done before the March 7th deadline.

VC Reputations

We argue, explicitly, that VCs use their reputations to certify their firms. We can calculate the de facto standard measures of reputation - the number of IPOs and the total number of successful exits - and use these to instrument our effects. This could be done for either the lead investor, or the most successful investor, or a weighted average of all investors (weighting by the number of rounds they participated in, or the proportional dollar value they may have provided). Likewise we can calculate the number of funds the lead investor had successfully raised at the time of the exit, or the average number of funds raised across all investors (again perhaps with a weighting).
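A sketch of how these measures could be computed. The input shapes are hypothetical: a list of (vc_name, exit_type) pairs with an assumed 'ipo'/'acquisition' encoding, and a per-investor count of rounds participated in for the weighting.

```python
from collections import Counter

def reputation_scores(exits):
    """Per-VC reputation counts from (vc_name, exit_type) pairs.
    Returns (IPO counts, total successful-exit counts)."""
    ipos = Counter(vc for vc, t in exits if t == 'ipo')
    successes = Counter(vc for vc, _ in exits)
    return ipos, successes

def weighted_reputation(investors, rounds_participated, score):
    """Syndicate-level reputation: each investor's score, weighted
    by the number of rounds they participated in."""
    total = sum(rounds_participated[v] for v in investors)
    return sum(score.get(v, 0) * rounds_participated[v]
               for v in investors) / total
```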

VC Information Asymmetries

Implicit in our argument is that VCs mitigate the information asymmetries between themselves and their portfolio firms effectively. We can refine this argument to consider the degree to which a VC is likely to be informed about their portfolio firm.

Distances

We can use the road or great-circle distance from the lead investor to the portfolio company as a measure of the information acquisition cost. We could also create a cruder but likely more meaningful version of this by creating a binary variable to see whether the lead investor was within a 20-minute drive of the portfolio company (this is the so-called '20-minute rule' - discussed as important for monitoring in Tian, 2006). Alternatively we could consider the nearest investor, or the average of the nearest investors across all rounds, etc.
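The great-circle distance is a standard haversine calculation on the two latitude/longitude pairs; a minimal sketch (using a 6371 km mean Earth radius):

```python
from math import radians, sin, cos, asin, sqrt

def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in km between two points,
    e.g. a lead investor's office and a portfolio company's HQ."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))
```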

I can get 2,500 requests per IP address (I can run 3+ concurrently from Berkeley) from the Google Maps API, with responses including driving distances and estimated driving times.

Active Monitoring

I can also determine whether the lead VC has a board seat at the portfolio company at the time of the acquisition, as well as the fraction of invested firms with board seats, and the total number of board seats held by VCs (or the fraction), using the identities of the executives. Though this will be particularly difficult in terms of data, I plan on doing it for another project with Toby Stuart anyway.

Rebuild Plan

I suggest that I leave the rebuild of the supplementary information asymmetry dataset (to show that IT has greater information asymmetries than other sectors) until the end. It is a lot of work, both in terms of assembly time and run-time to do the regressions, and we can use the existing table for the next version if need be. I suspect that this component will take me 3 days on its own.

The regressions for the estimation window will have a run-time that might be considerable; even given the hardware that I have put together at Berkeley, I suspect that this will take at least 24hrs of compute time. I therefore plan on doing this very early and setting it running.

Proposed Rebuild Order

My order is therefore:

  1. Download, clean, and process the SDC data so that it can be joined to CRSP (and the other data sources)
  2. Download the CRSP data for the estimation and event windows. Set the estimation windows running. Build the event window code while they run, and otherwise move forward.
  3. Download the VentureXpert data. Pull the portco data first, so we can construct the binary indicator.
  4. Update the industry classification, using the old one, my new one, and VentureXpert as a reference set.
  5. Download an LBO dataset, so we can remove these firms.
  6. Download the COMPUSTAT data, and join it to the SDC data. At this point we should have everything we need to get the basic analysis up and running again.
  7. Build out a full database of VC investments into these portcos so we can calculate distances, monitoring through board positions, and reputations. Stop short of actually building these variables.
  8. Download the GNI IPO data to calculate the standard reputation measures and join it up, then calculate these measures.
  9. Calculate the distances for all VCs to all acquired targets. Determine lead VCs if feasible and calculate the distance measures.
  10. Add in the NBER patent data to 2006; include the number of patents and patents weighted by citations received (not corrected for truncation).
  11. Rebuild the "Information Asymmetry (IA) by Industry" data.

Time Estimates

My time estimates are going to be wild for three reasons: It is just really hard to estimate some of these things (the time goes into the things that you don't anticipate being a problem but are); some of my skills are rusty, and on the flip-side I now have some serious hardware to throw at this; and I'm currently recovering from some health problems. However, my best ballpark is:

  1. 1 day
  2. 2 days
  3. 1/2 day
  4. 1 day
  5. 1/2 day
  6. 1 day + 2 days to get everything together into a dataset for analysis
  7. 2 days
  8. 1 day
  9. 3 days
  10. 2 days
  11. 3 days

By the end of step 6, which I think will take 8 days, I should have the original data rebuilt and analyzed again. To get steps 7-9, which would give us two good extensions to the data, would add another 6 days. The patent data extension (if wanted) would add another 2, and then the rebuild of IA data is guesstimated at another 3.

There are 16 calendar days between now and March 7th (excluding the 7th). I am going to lose 2 to a course that I'm taking, and 1 to health-care. That leaves 13, which is one short of the 8+6 for the 2 extensions. I will probably also need one or two days off (I just can't keep working 7 day weeks), but nevertheless, it looks like I should be able to complete the basic rebuild in time, and perhaps (if things go well) add an extension or two.

Rebuild Notes

Thoughts for discussion

  1. The experience variables (# Previous Acqs) are generated using the primary data, and will be truncated by the start of the dataset. We should probably consider year fixed effects to mitigate any induced bias.

Downloading the Acquisitions

Basic Criteria:

  • US Targets
  • Announced: 1/1/1980 to 12/31/2011
  • Target Nation: US
  • Acquiror Nation: US
  • Target Status: Private (V)
  • Acquirer Status: Public (P)
  • Percentage of Shares Owned after Transaction: 100 to 100 (will exclude those with missing data)

The completed deal flag is in Deal Status - this will be restricted to 'C' in the processing.

New Flags in SDC (downloaded for exclusions):

  • Bankruptcy Flag
  • Failed bank Flag
  • Leveraged Buyout Flag
  • Reverse LBO Flag
  • Spinoff Flag
  • Splitoff Flag
  • Target is a Leveraged Buyout Firm
  • And many others. These will be reviewed and excluded.

Founding year/Age of the Target was not available in the data. It is in VE for VC-backed only.

The street address is multiline and problematic if included. This can be drawn separately if needed. We have the City, Zip and State, which is sufficient to get a Google Maps lookup. Likewise 'Competing Offer Flag (Y/N)', also known as COMPETE and Competing Bidder, is a multiline - with each line presumably corresponding to a different bidder identity. It was excluded.

The NormalizeFixedWidth.pl script uses the spacing in the header to determine the column breaks. The EquityValue column has two spaces in front of its name, which breaks this. Both EquityValue and EnterpriseValue needed to be imported as varchar(10), as they have the code 'np' in some observations.

The NormalizeFixedWidth.pl script was modified so that it only drops commas in numbers and not those in names etc.
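The header-spacing idea can be sketched in Python. This is a sketch of the approach, not the Perl script itself; it assumes columns are separated by runs of two or more spaces in the header, which also shows why leading spaces in front of a column name (as with EquityValue) break the inference.

```python
import re

def column_breaks(header):
    """Infer fixed-width column start positions from runs of 2+
    spaces in the header line (mimicking the idea behind
    NormalizeFixedWidth.pl, not its actual implementation)."""
    return [0] + [m.end() for m in re.finditer(r'\s{2,}', header.rstrip())]

def split_fixed_width(line, starts):
    """Split a data line at the inferred column start positions."""
    ends = starts[1:] + [None]
    return [line[s:e].strip() for s, e in zip(starts, ends)]
```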

Processing the Acquisitions

A large number of 'new' flags are now available in SDC. Most of them have no bite on our data. But I have excluded the following:

  • Cases where the target was bankrupt or distressed as indicated by: TargetBankrupt, TargetBankInsolvent, Liquidation, Restructuring.
  • Cases where the firm wasn't genuinely privately-held as indicated by: OpenMarketPurchases, GovOwnedInvolvement, JointVenture, Privatization (which capture government sales).
  • Cases where there was a share recap going on concurrently with the acquisition: Recap
  • Targets that had LBO involvement (more will likely be removed in the next phase of matching to LBO targets): LBO, SecondaryBuyoutFlag, ReverseTakeOver (used for LBO'd firms doing a reverse take over), IPOFlag (likewise).
  • Firms where the deal began as a rumor (so the information leakage is problematic): DealBeganAsRumor

Altogether, these constraints reduce us from 41,572 to 40,306 acquisitions. Constraints that had no bite included: Spinoff, TwoStepSpinOff, and Splitoff. Further restricting the data to completed transactions, and those with valid codes for when the transaction value was amended, reduces the data to 40,035.

Acquiror and Target names were keyed into unique names. The first acquisition of several was taken. 33 records had multiple entries, the correct one of which could not be determined. These were discarded.

Retrieving CUSIPs

The acquisitions data lists 6-digit CUSIPs; we need the 'correct' (the right issue for the right period) 8- or 9-digit CUSIP with which to search CRSP and COMPUSTAT. A full list of all CUSIPs was retrieved from COMPUSTAT for the period Jan 1978 - Jan 2012 using the annual data.