This page is referenced in:
*[[VC Acquisitions Paper ]]
This page provides summary details from the [[Information Asymmetry in Acquisitions Lit Review]] and the beginning of the build notes for these variables.
==New Measures==

The prior usage of these measures (except for distance and patents) is in the table in the next section ([[#Table of Measure Usage|Table of Measure Usage]]). The build notes for each of these measures are included in the [[#Build Notes|Build Notes]] section below.

'''Deal-level IA Measures:'''
*iagoogledist - The distance between the target and acquirer HQ as calculated by Google Maps
*iagoogledur - The driving time between the target and acquirer HQ as calculated by Google Maps
*iagoogledur20m - Binary variable: Whether the target is within a 20-minute drive of the acquirer (the 20-minute rule)
*iagoogledur60m - Binary variable: Whether the target is within a 1-hour drive of the acquirer
*iagoogledur120m - Binary variable: Whether the target is within a 2-hour drive of the acquirer
*samestate - Whether the target and the acquirer are headquartered in the same state

'''Target-level IA Measures:'''
*patents - Binary variable: Whether the target has patents
*patentcount - The count of patents held by the target

'''Industry-Year IA Measures (industry defined at the 3-digit NAICS level; year is the transaction year):'''
*From Compustat (Accounting Measures):
**iacstatmktba - The Market-to-Book Value of Assets ratio
**iacstatmktbe - The Market-to-Book Value of Equity ratio
**iacstatepratio - Earnings-to-Price Ratio
**iacstattotalassets - Total Assets
**iacstatmkval - Market value (Shares Outstanding x Price)
**iacstatdevstage - Binary variable for Development Stage: Sales < 0.5b
**iacstatrandd - R&D expenditure
**iacstattotalrandd - Total R&D expenditure (includes R&D in process)
**iacstatranddsaleratio - Ratio of R&D to Sales
**iacstatintan - Intangible assets on balance sheet
**iacstatintanratio - Ratio of intangible to tangible assets
*From CRSP (Stock Market Measures):
**iacrspvolratio - Ratio of shares traded to shares outstanding
**iacrsprmse - Idiosyncratic volatility (Root mean squared error) of stock returns (see below)
*From IBES (Analyst Measures):
**iaibesrange - Range of forecasts
**iaibesnumest - Number of estimates made by analysts
**iaibesforecasterror - Forecast error (see below)
**iaibesforecastsd - Std. Deviation of forecast error (see below)

The following variables are also provided in '''log form''' (note the 'l' on the end):
*iagoogledistl
*iagoogledurl
*iacstattotalassetsl
*iacstatmkvall
*iacstatranddl
*iacstattotalranddl
*iacstatintanl
*patentcountl
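As a rough illustration of how the deal-level flags and log forms relate to the raw Google Maps values, here is a minimal sketch. It assumes the distance is held in meters and the driving time in seconds, that the thresholds are inclusive, and that the log form is ln(1 + x); none of these choices is confirmed by the notes, and the function name is hypothetical.

```python
import math


def deal_level_measures(distance_m, duration_s):
    """Derive the duration flags and log forms from one distance/duration pair.

    Assumptions (not stated in the notes): meters and seconds as inputs,
    inclusive thresholds, and log(1 + x) for the log forms.
    """
    minutes = duration_s / 60.0
    return {
        "iagoogledist": distance_m,
        "iagoogledur": duration_s,
        "iagoogledur20m": int(minutes <= 20),
        "iagoogledur60m": int(minutes <= 60),
        "iagoogledur120m": int(minutes <= 120),
        # log1p keeps a zero distance or duration defined (maps to 0.0).
        "iagoogledistl": math.log1p(distance_m),
        "iagoogledurl": math.log1p(duration_s),
    }
```

For example, a 25-minute drive fails the 20-minute rule but passes the 1-hour and 2-hour cut-offs.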
==Table of Measure Usage==
*Target Age (not in SDC, so we would have to get another source...)
*R-Squared from Earnings, Book Value - this was used in a single paper and isn't worth it (it requires joining CRSP to COMPUSTAT before running the regressions)
 
===Rejecting for now...===
 
All of the capital structure measures:
*Sources include 'Thomson 13' (Tetlock 2010), 'Value Line Investment Survey' (Emery and Switzer 1999), 'Compact Disclosure' database of 13f filings (Utama and Cready 1997), 'CDA/Investnet' (Aboody and Lev 2000), etc.
*It appears that Thomson 13, which was formerly CDA/Spectrum, is now available through WRDS.
*Likewise, also under Thomson Reuters is the S12 data, which details mutual fund holdings.
*There is also BlockHolders data in WRDS that covers 1,913 firms for the period 1996-2001. This is clean data from the paper by Gompers et al.
 
===Microstructure Measures===
 
We are '''not''' using Microstructure measures and don't plan to. But NYSE TAQ is available through WRDS to some subscribers (not me).
==Build Notes==
*Value-Weighted Return inc. distributions
Run the regressions on raw data (i.e., don't join to COMPUSTAT first).

Specs:
*Date Range: 1/1/78 -> 12/31/11
*Company Codes: PERMNO
*SEARCH ENTIRE DATABASE
*ID Info: CUSIP, NAICS
*Time Series: VOL (Share Volume, -99 is missing), RET (Holding Period Return, error codes -66.0, -77.0, -88.0, -99.0)
*Share info: SHROUT (No. Shares Outstanding in K)
*Mkt Info: VWRETD (Value-Weighted Return inc. Dists)
*Output: tab-delimited txt
*Date Format: YYMMDDn8 (corresponds to ISO 8601 and is Postgres compatible)

RET errors:
*E -44.0 No valid comparison for an excess return
*D -55.0 No listing information
*C -66.0 more than 10 periods between time t and the time of the preceding price t?
*B -77.0 not trading on the current exchange at time t
*A -88.0 no return, array index t not within range of BEGRET and ENDRET
*'' -99.0 missing return due to missing price at time t

Get file:
*e5da94cf0a760426.txt (3195.8 MB, 60127561 observations 8 variables)

Plan: Pull into Postgres, index, cut into chunks (yearly? Want < 250 MB?), and run regressions in STATA using a batch script with dispatch to Bear.
===Ratio of Shares Traded===
Data:
*Share Volume (VOL)
*No. of Shares Outstanding (SHROUT, in K)
*The ratio is averaged over the year for each firm.
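A minimal sketch of the volume ratio described above, for illustration only. It assumes VOL is measured in single shares (so SHROUT, which is in thousands, is multiplied by 1,000), that observations with the -99 missing code or non-positive shares outstanding are dropped, and that the yearly figure is a simple mean of the per-observation ratios. The function name and row layout are hypothetical.

```python
from collections import defaultdict

VOL_MISSING = -99  # missing-volume code from the CRSP specs above


def annual_volume_ratio(rows):
    """Average VOL / shares outstanding per (permno, year).

    rows: iterable of (permno, year, vol, shrout_k) tuples.
    Returns a dict mapping (permno, year) to the mean ratio.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for permno, year, vol, shrout_k in rows:
        if vol == VOL_MISSING or shrout_k is None or shrout_k <= 0:
            continue  # skip unusable observations
        # Assumption: VOL is in single shares; SHROUT is in thousands.
        sums[(permno, year)] += vol / (shrout_k * 1000.0)
        counts[(permno, year)] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

If VOL turns out to be reported in a different unit for the chosen CRSP file, only the scaling line needs to change.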
From KS99: "the normalized forecast error, which is defined as the ratio of the forecast error in earnings to the earnings volatility of the firm. Earnings volatility is the standard deviation of the firm's detrended quarterly earnings in the five-year period before the announcement of the spin-off."
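The KS99 definition quoted above can be sketched as below. The quote does not say how the forecast error is signed or how earnings are detrended, so this sketch assumes the error is actual EPS minus the mean estimate, and that detrending means removing a fitted linear time trend from the quarterly series (20 quarters for a five-year window). Both are assumptions, and the function names are hypothetical.

```python
import math


def detrended_sd(earnings):
    """Sample standard deviation of quarterly earnings around a linear trend.

    Assumption: 'detrended' means residuals from an OLS fit on a time index.
    """
    n = len(earnings)
    if n < 3:
        raise ValueError("need at least three quarters")
    t_mean = (n - 1) / 2.0
    e_mean = sum(earnings) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (e - e_mean) for t, e in enumerate(earnings))
    slope = sxy / sxx
    resid = [e - (e_mean + slope * (t - t_mean)) for t, e in enumerate(earnings)]
    # OLS residuals have mean zero, so no further centering is needed.
    return math.sqrt(sum(r * r for r in resid) / (n - 1))


def normalized_forecast_error(actual, mean_estimate, quarterly_earnings):
    """Forecast error scaled by earnings volatility; None if volatility is zero."""
    vol = detrended_sd(quarterly_earnings)
    if vol == 0:
        return None
    # Assumption: error is signed as actual minus the mean analyst estimate.
    return (actual - mean_estimate) / vol
```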
Data:
*Drawn from the I/B/E/S Detail Summary file pull:
*Jan 1978-Jan 2012
*CUSIP (8Dg)
*Entire DB (US and International - both)
*EPS
*Fiscal Yr1
*Analyst Code
*ID: CUSIP, Company Name, OFTIC, TICKER
*Other: Forecast Period End Date, Number of Estimates, Mean Estimate, Median Estimate, Standard Deviation, High Est., Low Est., Actual Value, US Firm
*Sort: Cusip, Forecast period End
*Actual Output: Tab delimited, None, YYMMDDn8

Get file:
*b5191ac65df7b6c4.txt (197.3 MB, 2569173 observations 15 variables)

Match back to COMPUSTAT to get NAICS. See: http://wrds-web.wharton.upenn.edu/wrds/support/Data/_010Linking%20Databases/_000Linking%20IBES%20and%20CRSP%20Data.cfm

===Accounting Variables===

Data source: COMPUSTAT - North America Fundamentals Annual

Ref vars:
*Company Name
*CUSIP
*NAICS

Variables:
*Market-to-Book-Assets (or Q): MKVALT (Sup: Market Value Total) / AT (Bal: Assets Total)
*Market-to-Book-Equity: MKVALT (Sup: Market Value Total) / CEQ (Bal: Common/Ordinary Equity Total), TEQ (Bal: Shareholder's Equity Total)
*Earnings to Price Ratio: RE (Bal: Retained Earnings), EBIT (Inc: Earnings Before Interest and Taxes), EPSPI (Inc: Earnings Per Share (Basic) Including Extraordinary Items), PRCC_F (Sup: Price Close - Annual - Fiscal)
*Firm Size: AT, MKVALT (as above)
*Development Stage (Sales<0.5b): SALE (Inc: Sales/Turnover (Net))
*R&D Expenditure: XRD (Inc: Research and Development Expense), RDIP (Inc: In Process R&D Expense)
*Ratio of R&D to Sales: XRD/SALE (as above)
*Intangible Assets: INTAN (Intangible Assets - Total)
*Ratio of Intangible Assets: INTAN/AT
*Sales Growth: <math>\frac{SALE_t - SALE_{t-1}}{SALE_{t-1}}</math>
*Common shares: CSHPRI - Common Shares Used to Calculate Earnings Per Share Basic

Draw Notes:
*Jan 1978 to Jul 2012
*GVKey (note: No Permno anymore?)
*No output on screening, otherwise all defaults
*ID: Company Name, Ticker Symbol, CUSIP, Stock Exchange Code, Fiscal Year-End
*ID Cont: NAICS
*Desc: FYear
*Bal: AT, CEQ, INTAN, RE
*Inc: EBIT, EBITDA, EPSPI, RDIP, SALE, XRD
*Misc: CSHPRI, TEQ
*Sup: PRCC_F, MKVALT
*tab delimited, no compression, YYMMDDn8

Get file:
*0d1f67dfbaded51a.txt (50.0 MB, 347184 observations 24 variables)
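The firm-year ratios in the accounting variable list can be sketched as below, before any aggregation to industry-year. This is an illustration under assumptions: the row is a dict keyed by Compustat mnemonics with None for missing items, dollar items are in millions (so the 0.5b development-stage cut-off becomes 500), and the sales growth key name is hypothetical because the notes do not name that variable.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None if either is missing or the denominator is zero."""
    if numerator is None or denominator is None or denominator == 0:
        return None
    return numerator / denominator


def accounting_measures(row, prev_sale=None):
    """Firm-year ratios from one Compustat row; prev_sale is SALE in year t-1."""
    sale = row.get("SALE")
    growth = None
    if sale is not None and prev_sale is not None:
        growth = safe_ratio(sale - prev_sale, prev_sale)
    return {
        "iacstatmktba": safe_ratio(row.get("MKVALT"), row.get("AT")),
        "iacstatmktbe": safe_ratio(row.get("MKVALT"), row.get("CEQ")),
        "iacstatranddsaleratio": safe_ratio(row.get("XRD"), sale),
        # Follows the build notes (INTAN / AT).
        "iacstatintanratio": safe_ratio(row.get("INTAN"), row.get("AT")),
        # Assumption: SALE is in millions, so 0.5b corresponds to 500.
        "iacstatdevstage": None if sale is None else int(sale < 500.0),
        # Hypothetical key name for (SALE_t - SALE_{t-1}) / SALE_{t-1}.
        "sales_growth": growth,
    }
```

Returning None rather than raising on missing or zero denominators keeps the output aligned one-to-one with the input rows, which makes the later industry-year averaging easier to audit.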
===Target Characteristics===

We already have:
*Method of Payment
*Diversification/Related (i.e., horizontal, vertical, conglomerate)
*Acquirer Experience
*Patent Counts

We need:
*Distance btw Acquirer and Target
*Citations Recd to patents

====Distance btw Acquirer and Target====

Addresses for both the Acquirer and the Target are available from SDC in the vast majority of cases. We will build a quick XML API to access: https://developers.google.com/maps/documentation/distancematrix/

We can pass 2,500 'elements' per IP address per day (24hrs) to this service, with a max of 100 elements per query and a max of 100 elements per 10 seconds. Note that URLs are restricted to 2048 characters before URL encoding (particularly relevant if using multiple elements), and that an element is an origin-destination pair.

Returned data includes:
*Road Distances (in meters in the value field, and km/miles in text)
*Driving Times (in seconds in the value field, and hours in text)

Geocoding is internal. We can explicitly use Google's geocoding to get longitude/latitude if we want to calculate Great Circle Distances (etc): https://developers.google.com/maps/documentation/geocoding/#XML

Example query:

http://maps.googleapis.com/maps/api/distancematrix/xml?origins=Bobcaygeon+ON|41.43206,-81.38992&destinations=Darling+Harbour+NSW+Australia&mode=driving&units=metric&sensor=false

The sensible thing is probably to fire one element per query, as there is no gain to doing multiple elements.
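The one-element-per-query approach can be sketched as below. The notes point to Perl (LWP), so this Python version is only an illustration of the request and response shapes. It reuses the endpoint and parameters from the example query, and the sample response uses made-up values that follow the XML layout the service returns. The present-day service generally requires HTTPS and an API key, which this sketch does not cover, and no network call or rate limiting is included. The function names are hypothetical.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Endpoint taken from the example query in the notes.
BASE_URL = "http://maps.googleapis.com/maps/api/distancematrix/xml"

# Illustrative response with invented values, used only to show parsing.
SAMPLE_RESPONSE = """<?xml version="1.0" encoding="UTF-8"?>
<DistanceMatrixResponse>
 <status>OK</status>
 <origin_address>Origin</origin_address>
 <destination_address>Destination</destination_address>
 <row>
  <element>
   <status>OK</status>
   <duration><value>1500</value><text>25 mins</text></duration>
   <distance><value>15000</value><text>15.0 km</text></distance>
  </element>
 </row>
</DistanceMatrixResponse>"""


def build_query(origin, destination):
    """Build a single origin-destination (one element) query URL."""
    params = {
        "origins": origin,
        "destinations": destination,
        "mode": "driving",
        "units": "metric",
        "sensor": "false",
    }
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def parse_element(xml_text):
    """Return distance (meters) and duration (seconds) for the first element.

    Returns None if the response or the element reports a non-OK status.
    """
    root = ET.fromstring(xml_text)
    element = root.find("./row/element")
    if root.findtext("status") != "OK" or element is None:
        return None
    if element.findtext("status") != "OK":
        return None
    return {
        "distance_m": int(element.findtext("distance/value")),
        "duration_s": int(element.findtext("duration/value")),
    }
```

A driver loop would call build_query once per deal, fetch the URL, pass the body to parse_element, and pause between requests to stay within the per-10-second and per-day element limits noted above.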
Useful code references include:
*[http://search.cpan.org/~gaas/libwww-perl-6.04/lib/LWP.pm#The_User_Agent LWP]
*[http://search.cpan.org/~makamaka/JSON-2.53/lib/JSON.pm#decode_json JSON]
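If Great Circle Distances are wanted as an alternative to road distance, one common way to compute them from geocoded latitude/longitude pairs is the haversine formula. The sketch below assumes a spherical Earth with a mean radius of 6371 km and coordinates in decimal degrees; it is an illustration, not part of the planned build, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2.0) ** 2)
    # Clamp to guard against rounding slightly above 1 for antipodal points.
    return 2.0 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))
```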