The Effect of Leverage Buyouts on Innovation

Jump to navigation Jump to search
Academic Paper
Title The Effect of Leverage Buyouts on Innovation
Author Ed Egan, Brian Ayash
RAs James Chen, Yimeng Tang
Status Working paper
©, 2016

See also: LBO Lit Review

Revision following Rejection from JF


Big Issues:

  • Reviewers didn't understand what we were doing. We need to rewrite and maybe clean up the methodology.
  • Go through Lerner paper again and recheck our understanding.
  • The version of the paper on overleaf has version 1 results and text. Did we ever write a version 2 to go with the updated data?

Rejection feedback is in:


The latest version is in:

Empirical approach:

  • Diff-in-diff vs. pair fixed effect


  • Brian tries to rewrite what we did empirically
  • Both read referee reports!
  • Ed can fix tone

Standard diff-in-diff references:

Review the version 2 results

This section describes how to review the version 2 results, with the expanded sample.

The files are in E:\projects\lbo

Begin by copying the following to July2020:

  • Masterv2-2.txt
  • LBOmatchscript.jl
  • LBOmatching.jl
  • MatchedMeasuresPortfolioV2-2.txt
  • CPI.dta

Change all produced versions to 2-3:

  • Log2-3.log
  • MatchInput2V3.txt
  • PreMergeAllV2-3.dta
  • PreMergeRelevantV2-3.dta
  • LBOAnalysisV2-3.log
  • matchresultsV2-3.dta
  • PatentStatisticsV2-3.dta
  • OldData.dta (used to load a marker to enable comparison with version 1)

Run to produce MatchInput2V3.txt (and other files). Then change LBOmatchscript to use MatchInput2V3.txt and run:

julia -i .\LBOmatchscript.jl

It throws a bunch of warnings about deprecated code, but seems fine. It produces matchresults.txt, which contains 347 lines. This is identical to matchresults.txt in the parent folder, so we don't need to update MatchedMeasuresPortfolioV2-2.txt. Fix the produced version numbers in and then run it!

Then review the main set of figures. To do this, reprocess FullTTests.xls (and maybe PairGoodTTtests.xls).

The results are in:

TablesAndFiguresV2-3.xlsx, which is in the Current Work subfolder on dropbox, as well as in E:\projects\lbo\July2020

(Note that these figures haven't been styled. Blue is control and Red is LBO throughout). The figures don't tell the same visual story as before, presumably because the updated data has many more recent LBOs, which are markedly different from the earlier LBOs. Time and industry effects likely mask the effect.

The main results are in DIDs.xml. These are briefly reprocessed in TablesAndFiguresV2-3.xlsx and just the Diff-in-Diff coefficient is reported below:

patentstockl patentflowl nograntsl nobuyl noselll noterml nonotnenewedl xrd17l
No Controls -0.110 -0.219*** -0.105** -0.0538*** 0.0114 0.0514* -0.0186 0.0971
(0.0851) (0.0586) (0.0462) (0.0198) (0.0231) (0.0269) (0.0240) (0.151)
Indu and Year FE -0.0973 -0.229*** -0.102** -0.0561*** 0.0141 0.0610** -0.0183 0.304**
(0.0819) (0.0576) (0.0454) (0.0198) (0.0234) (0.0258) (0.0236) (0.132)
Indu and Year FE (With T<10) -0.0319 -0.260*** -0.141*** -0.0490*** 0.00367 0.0543** -0.0121 0.118
(0.0751) (0.0544) (0.0425) (0.0179) (0.0211) (0.0252) (0.0229) (0.112)
Pair FE -0.110** -0.219*** -0.105*** -0.0538*** 0.0114 0.0514** -0.0186 0.276***
(0.0439) (0.0536) (0.0316) (0.0190) (0.0221) (0.0229) (0.0204) (0.0842)

So, despite the graphs, the main results are here! The diff-in-diff coeffs show that an LBO reduces patent flows, grants, and purchases, and increases term expirations. Only the results on R&D are unexpected, though the hadlbo coefficient is significant, negative and about the same magnitude as the diff-in-diff coeff for R&D, so this would suggest that PE firms target firms with low R&D and then increase it back to conventional levels. The pair fixed effects leave the coefficients unchanged (versus no controls) but reduce the standard error. We should check/prove that this is correct.

NOTE: Let's also put the logit and the treatment vs. control sample T-tests here!

Get Version 1 Running Again

The following files for version 1 were copied from the main dropbox folder (./LBO_Innovation_w_Egan) to a Version1 in E:\orojects\lbo:

  • masterv1-0.txt
  • PreMergeRelevant.dta
  • CPI.dta
  • matchresultsV7.dta
  • PatentStatistics.dta

I also copied over the 4 main compiled results files and the notes files:

  • All Results.xlsx (contains raw copies from source files plus partially processed graphs)
  • Graphs - Editted.xlsx (graphs with imperfect formatting but appears most recent)
  • Tables - Most Recent.xlsx (seems to be production tables)
  • Graphs - Most Recent.xlsx (graphs with imperfect formatting)
  • Ed's Notes.txt
  • Notes by Ed.xlsx (counts of LBOs in other papers, inc. Lerner)

I checked that Datasetup wouldn't overwrite any data files and renamed the log to lbov1-r2-3.log. Then I ran it. It didn't produce any excel sheets. Then I ran, without uncommenting the data build section. This built:

  • BeforeVsAfter.xls and .log
  • MainTTests.xls
  • MatchComparisons.xls
  • PropensityScore.xml and .log
  • Distribution.log
  • Checks.log
  • DIDs.xml
  • ProductionRegressions.log
  • Ttestsv6.log

Brian asked for a range on the Tobin's Q given in table 2. Table 2 should have been produced in, but the descriptive statistics section appeared missing. Possibly it was cut and paste out during the creation of Version 2. The LaTeX says that the Table was generated from the sheet Descriptives, which is in Tables-Most Recent. It appears to come from some tabstats but on data that had been adjusted for inflation with the CPI, which wasn't loaded until

Addressing Concerns

Beyond JF

Steve Kaplan

Steve said:

  • Seems consistent with Davis et al. for public-to-privates.
  • Would spend more time explaining how it differs from the recent Cumming et al paper. Results seem effectively the


So, Steve didn't read the paper at all. But I don't see the Davis cite in our reference, though I know Brian has written a comment on at least one of his LBO papers. The relevant cite is probably:

Davis, S. J., Haltiwanger, J., Handley, K., Jarmin, R., Lerner, J., & Miranda, J. (2014). Private equity, jobs, and productivity. American Economic Review, 104(12), 3956-90.

Of course, our paper is not consistent with Davis, at least concerning their findings on TFP gains. And we do differentiate our paper from Cummings. We could perhaps differentiate it further though.

Josh Lerner

Josh said:

  1. Certainly, our work on PE and jobs, productivity, etc. suggests that public-to-private deals are real outliers relative to other transactions along many dimensions. Would it make sense to look at a broader spectrum of deals? Another advantage is that it would substantially increase your sample size.
  2. Seagate is a real outlier. As you saw in our paper, we excluded it. How do you handle it? Or do you not include it in your sample?
  3. One issue we really struggled with was spin-off of entities around the time of IPOs, and differences in policies regarding the assignment of patents. How do you think about this issue?
  4. The industry categories you use seem quite broad. Thus, I would like to see more exploration of the matching procedure and the robustness of the results to different approaches.
  5. It would be helpful to talk more about the sources for the sample of transactions, to compare against other databases and the data-sets used in other papers (e.g., the Kaplan work, Kaplan-Stromberg JEP paper, and so forth).
  6. I could not quite square the numbers in Table 2 with the reported number of buyouts in the overall sample in the text. In general, being more careful on sample sizes, etc., and explaining the details would be helpful.
  7. Can you look at R&D expenditures after the buyouts for at least some firms (e.g., if they still make security filings)?


  • Josh's first and fourth comments appear somewhat disingenuous. We made a great play of how it was important to isolate public-to-private transactions and not mix them as he has done. Mixing provides the opposite results, which is good press for the industry... So let's ignore those comments.
  • We address an outlier firm without naming it on page 23 ("One firm held 426 patents...") and page 28 ("Our mean sell-off statistic includes, but is not driven by, a single leveraged buyout selling 798 patents – this is the same firm that went on to have 426 patents by the end of the year of its buyout, shown in Table 4."). Going through the data, this firm is OWEN-ILLINOIS INC, and the sale was of 767 patents, not 798. Seagate is in the original dataset as SEAGATE TECHNOLOGY PLC (gvkey=150937) and SEAGATE TECHNOLOGY-OLD (gvkey=9545). Neither of these records are marked as having had an LBO. We should look at the source data to see if there is some reason why Seagate isn't in our dataset.
  • If spinoffs were being assigned patents, and these assignments were recorded by the USPTO, we would see them as sales in our data. We may want to add a comment to this effect.
  • Yes, the industry categories are quite broad. We could spend eternity exploring the matching procedure, so let's ignore that one.
  • The comment on sample sizes and explaining details is fair. We need to be clearer about what the unit of observation is when we report results.
  • Yes, we can look at R&D filings and we did!

Heitor Almeida

"Very nice paper. If I remember correctly Lerner et al. did not have a control group, so the results were a bit questionable. it is nice to show that relative to a control group LBOs reduce innovation. Two quick reactions:"

  1. The obvious endogeneity concern is that for some reason private equity will target firms that are expected to reduce innovation (that is, innovation would have decreased irrespective of the LBO). it is impossible to completely address it, but it may be useful to discuss explicitly and perhaps try to look at whether innovation "predicts" LBOs. For example, perhaps firms with low R&D are more likely to be targeted, and this reduces innovation post-LBO.
  2. There is an alternative measure of innovation that looks at the market value of granted patents (Kogan et al., QJE). If it is the case that LBOs reduce the total value of innovation, that would add to your points.

He also said "I think RFS is not out of reach. The issue here as you know is that the authors of the existing paper may challenge your results if it is sent to them. Hopefully, the editor will send it to someone neutral." Which makes him practically clairvoyant, given what happened when we sent it to JF.

Addressing his concerns:

  • A paragraph on the many and varied endogeneity issues is a very good idea. We should have done this more properly.
  • The paper (in the dropbox folder) seems to use an event study approach to isolate the stock market reaction (in terms of market cap) to a patent issue. We can't do that - it's too much work. Also, almost all (85%+) of patent applications result in at least one issue, and the 'news' occurs when the patent application is made public (usually 18 months after filing), so that whole paper is pretty questionable. But then the team is outstanding and includes Amit Seru. They also solicited comments from the who's who of innovation economics.
Kogan, L., Papanikolaou, D., Seru, A., & Stoffman, N. (2017). Technological innovation, resource allocation, and growth. The Quarterly Journal of Economics, 132(2), 665-712.

JF Rejection

The paper was rejected by Amit Seru as an editor of the Journal of Finance (though he was nicer than the AE or the reviewers).

Amit Seru

Amit's comments were as follows (emphasis added):

Thank you for submitting your paper, MS 2019-0353, "The Effect of U.S. Public-to-Private Leveraged Buyouts on Innovation," for publication in the Journal of Finance. I have now received excellent feedback on your paper from two referees. The referee reports are attached to this email. Both referees recommend that I reject the paper. The AE also recommends a rejection. The main concerns of reviewing team center around the marginal contribution of the study and its empirical execution.

The AE summarized the sentiment of the reviewing team in the note to me as "I largely agree with the referees. I have concerns about the methodology (there is a lack of transparency in the way the paper is written) and even taking the results at face value I don't find the contribution large enough to publish the paper in JF":

  • The topic of innovation and LBOs is already well occupied by existing papers. To make a contribution vis a vis Lerner-Sorensen-Stromberg, we would have to think a bit deeper about the economic mechanism that links research and capital structure. Why would PE target destroy value ? In LSS, LBOs seem to be creating value through some form of disciplining effect and better innovation. But LSS have a different sample, mostly made of private to private deals.
  • Even if I take the results at face value, the decrease in patent stock mostly comes from the target selling patents right in the year when takeovers take place. This is the main result of the present paper. But is this value destroying ? We do not know. The million $ question in this literature is whether LBO targets really destroy value. But it's a hard one.
  • The methodology is opaque. Is it Diff in Diff or matching ? There doesn't seem to be pair fixed effects (so it's closer to DD, than close to the LSS methodology than the authors claim). Even though the authors never actually write down the econometric equation. In any case, using matching is not a big innovation in the literature, many papers already do it, and in a more transparent way. The true innovation in this literature would be to find an IV.
  • To make a paper on LBO publishable in JF, I think one of two things need to be done: (1) having a real source of variation or (2) providing clear and unambiguous proof that LBO target destroy value by being too short-termist. The paper unfortunately does none of the two (but these are both very hard problems, so I really don't blame the authors)
  • Like the referees, I would recommend that the authors rewrite the paper to make much more transparent. Settle on a sample (not removing data points every time you run a different regressions). Use it to replicate existing finding (LSS seems like a good template, perhaps others). Confirm existing results, and then slowly build from there.
  • Something I was confused about: How do the authors get information on post-LBO R&D spending? After the LBO the firm is not in compustat anymore. perhaps there is an explanation somewhere but I missed it.

I have also read your paper and found a lot to like about it. However, I was not able to convince myself to overrule the concerns raised by the reviewing team. I have therefore decided to reject your paper.

My notes:

  • The last comment from the AE is rather strange (and makes it unlikely that the AE was Josh). We discussed the R&D data fairly extensively.
  • The main weakness was the "opaqueness of the methodology". We need to fix this.
  • One instrument would be withdrawn LBOs, providing that they were withdrawn due to a change in market conditions, and not their innovation profile (C.f. Bernstein? -- his job market paper was on withdrawn IPOs and landed him a job at Stanford in around 2012).

Reviewer 1

Select comments from Reviewer 1 (uses terminology like 'forward citation' and doesn't understand 'synthetic controls'):

  1. The paper currently reads as if the main sales point is the methodology
  2. The exact method they used however is not clear. In the introduction they describe it as “difference-in-difference”, but in the text and particularly in the tables is hard to understand the empirical strategy (it doesn’t help that the equations are not written in the text).
  3. Instead, I would focus on what I think is the strength here, which is understanding more deeply how corporate innovation changes after LBOs, an in particular the distinction between in-house innovation and patent purchases, as well as the more strategic considerations of managing patent portfolios (expiration, renewals etc.)
  4. the authors argue that “…the entrepreneurial theory of LBOS stands alone in its positive connotations for innovation…” (pg. 12). Instead, I would argue that at least two other theories could predict a positive effect on innovation from LBOs. There is a long-standing hypothesis that LBOs (specially public to private—the focus here) can relieve managers from short‐term pressures from public shareholders. Similarly, LBOs can also alleviate agency issues between shareholders and managers, which can include the avoidance of innovation investments by CEOs with career concerns.
  5. the authors’ description of the “myopic based theory” is contradictory. I don’t see how they conclude that the treatment effect of an LBO would be to curtail innovation activities. One of their main points leading to the argument is that Bernstein (2015) shows lower innovation in public firms.
  6. The writing is at times confusing and contradictory. For example, in pg 13 it is not clear what the authors mean by “We also require each LBO to be matched to a control firm in the same industry, operating at approximately the same tie, and holding within 20% of the LBO target firm’s patent stock…” Do you mean to say that matched firms co-own 20% of the patents of the LBO target?

My Notes:

  • Agree with point 3. Let's de-emphasize the methodology and instead emphasize the corporate innovation decisions.
  • Clean up the description of the sample, etc.
  • Let's strip down and/or clean up the theories. They aren't the main selling point. We just need a value creation story and a value consumption story.

Reviewer 2

Select comments from Reviewer 2 (likely one of Lerner, Sorenson, or Stromberg):

  • This is a well-written paper that investigates an important question. While other papers have tackled this same question... there is room in the literature for a paper that revisits this question using rich new data and a careful identification strategy. The submitted paper does analyze richer patenting data than prior studies, and some of its descriptive findings are certainly interesting. However, I am afraid that the paper’s identification strategy and discussion of identification challenges falls well below the standards now expected by readers of the JF.
  • the authors do not appear to fully appreciate the endogeneity challenges they face in trying to identify a causal relationship between LBOs and innovation... private equity (PE) firms do not choose LBO candidates randomly. PE firms do extensive due diligence on their target firms... the concern remains that the matched firms likely differ from LBO firms on dimensions that the authors do not observe (and even some that they do observe) and these differences, and not the LBO itself, might drive the post- LBO differences in their investment activities.
  • To be sure, addressing this endogeneity challenge is a tall order, and there is no straightforward solution to it. But at the very least I would advise the authors to acknowledge the challenge and avoid overstatements such as “[w]e use a difference-in-differences approach to address issues of causality” (p. 3). Davis et al. (AER 2014) is a helpful example of how the endogeneity challenge can be discussed.
  • It would also be helpful if the authors could explore the sensitivity of their results to their matching choices
  • it would be helpful if the authors could provide a more thorough discussion of ... why are LBOs associated with a decrease in patenting?
  • the “myopia-based theory claim[ing] that the market is inefficient and does not correctly value long term investment projects” (p. 9) has often been cited as a reason why PE firms can generate value by taking firms private and thereby allowing them to focus on long-term investments
  • The authors appear to hold a more negative view of these cuts, seeing them as “worrisome” (p.

35) and thus implying that they harm the long-term value of the firms. However, it is not clear that cutting productive innovative investments will help PE firms increase the value at which they can sell their portfolio companies five years down the road. In particular, the so-called “non-productive balance sheet restructuring” that the authors discuss in page 11 would only be optimal if PE firms expect to sell their companies at a multiple of the book value of the companies’ assets

  • I thought the introduction is written in a very casual tone, with several non-obvious statements that are not backed by any references. Some examples are:
    • “[P]ublic-to-private leveraged buyouts remain particularly important and controversial.” (p. 1)
    • “The empirical literature also leads one to strongly suspect that public-to-private leveraged buyouts cause reduced innovation. An empirical study in the late 1980s and early 1990s established the stylized fact that increased leverage is associated with reduced R&D expenditures.” (p. 2)
    • “Our sample consists solely of material, innovative U.S. public-to-private buyouts that were leverage financed, and it is much bigger than previous samples of its kind.” (p. 3). I am not sure what the authors mean by “material” or what other previous samples they are comparing their sample to.
  • Also, I thought that the theoretical discussion in Section 2 of how innovative activities might be affected by LBOs is biased toward theories that view LBOs as negatively impacting innovation. In particular, I am not sure the statement in page 12 that “[t]his entrepreneurial theory of LBOs stands alone in its potential positive connotations for innovation” is correct. For instance, as discussed above, short-termism and other agency theories can predict that private firms will invest more in innovation than public ones
  • a) The authors write in p. 4: “Like Lerner et al. (2011), we find that patents held by innovative LBO firms receive more citations after their buyouts. However, we are able to attribute this solely to citation inflation: we find no statistically significant difference in citations received between LBO firm patents and control firm patents before or due to an LBO.” Lerner et al. (2011) are aware of such citation inflation, and they address it by scaling citations by the citations received by matching patents (pp. 459-460).
  • b) The sentence in p. 19, “In our sample, firms that experience a leveraged buyout are large, and those that have patents are about half as large again as those that do not” is confusing
  • c) The sentence in p. 23, “However, including either firms without patents in the year of the buyout, or firms with a large number of patents does drive our results” appears to be missing “not” (as in “does not drive…”).

My notes:

  • We aren't going to change the identification strategy but lets up the discussion of identification challenges and "acknowledge the [endogeneity] challenge[s] and avoid overstatements".
  • Let's remove the "non-obvious statements that are not backed by any references" and instead make it a list of references. Just substitute boring for concisely informative and we are good to go!
  • Lerner et al. did adjust for citation inflation. Either I misread it, or was looking at a different version of the paper.
  • Rather than using "material" let's just provide the asset cut-off again.
  • Add the 'not' in does NOT drive our results!

The HBS Working paper version of Lerner et al (in dropbox) on p16 says: "We observe a clear increase in the average number of citations for the patents granted to the private equity-backed firms. In part, this may reflect the increasing importance of patents in later years, but it may also reflect two other changes. As the pace of patenting world-wide accelerates, the frequency of patent citations has increased. Furthermore, as private equity investments in hightechnology industries become more common, the representation of patents in technologically dynamic industries has grown. Figure 3 captures these trends, and this figure shows a clear increase in the average number of citations, as well as the dispersion of citations, for the matching patents. Hence, it is important to control for the timing of the patent grant and its technology class. To address this concern, Table 2 also reports the relative citation counts. These are calculated as the number of citations in the calendar year of the grant and the three calendar years thereafter (citation count) less the average number of citations during this period to matching patents, which have the same grant year and primary USPTO class."

Version 2

How to do a run

Before you begin

All of the following assumes that you're working on Ed's infrastructure. If you're not, you'll need to set up a dbase server, and load up the lbo dbase, either by restoring it from a backup (see below) or by running Buyouts.sql and having the appropriate source files available (in the zip "").

To restore the backup to a postgres server

createdb lbo
pg_restore -Fc -d lbo lbo_Fc.dump

To build it from source files:

createdb lbo
psql lbo
\i Buyouts.sql

A run

Buyouts.sql produces Masterv2-2.txt. If you have this in hand, you'll need the following to do a run:

  • LBOmatchscript.jl and LBOmatching.jl
  • The lbo database loaded up on a postgres dbase server
  • PatentMeasures.sql
  • Some static STATA files: CPI.dta, OldData.dta.

The sequence is then:

1 Run

do Datasetup

2 Do the matching

julia -i LBOmatchscript.jl
 Note that this is going to throw lots of warnings about various methods being deprecated, but it works just fine. 
 MatchInput2V2.txt must be in the same folder as the script. Edit lines 61-66 to change the matching criteria. 
 Matching takes place at t-1!
 Also -i dumps you to an interactive prompt for debugging.
println("Number of LBOs with complete data available for matching ", size(LBOs,1))
 Should be 173
 Should be 0×9 DataFrame, but otherwise will show unmatched

3 Create the patent measures

psql lbo
 Note that you will need to open the connection from a directory containing matchresults.txt and PatentMeasures.sql
\i PatentMeasures.sql

4 Run the analysis

 Make sure the following files are all in the directory ready to go:
  matchresults.txt -- from 2
  MatchedMeasuresPortfolioV2-2.txt -- from 3
  PreMergeRelevantV2-2.dta -- from 1
  CPI.dta -- static file
  OldData -- static file

5 Wonder whether this was all worth it.

6 Edit the excel files (some are xml files that have to be opened from inside of Excel), optionally moving columns up to create alignment and using regexes to get rid of unwanted rows. Create and apply a chart template to simplify the standardized production of figures. Use the excel2latex plugin to export tables to LaTeX.

Build Notes

The Plan

The plan is to build a new database made from the following components:

  • LBO data with GVKEYs, preferably cleaned to deal with durations (etc.), imported directly from a spreadsheet
  • New draw of the universe COMPUSTAT data, with all appropriate variables, covering 1975-2017 (or 2018 if possible), processed to remove duplicates etc.
  • Imports of patenthistwoprebuys and gvkeyyearpatentdata, which are available in allpatentsprocessed
  • Import of sicnaicsconversion, as we need the lookup table(s)
  • Imports of patents, citations, and HistMasterFile, all of which are available in allpatentsprocessed -- we can then run the code from PatentMeasures.sql to generate the merge data once we've done the matching.
  • Also import assignees, even though we won't be using it directly

Note that citations_cleaned (74,164,171) appears to be just citations (97,680,838) restricted to integer patent numbers but we are going to rebuild it anyway.


The COMPUSTAT query will be (bold vars are those important to this paper):

  • Jan 1975 to July 2019 (latest allowed)
  • GVKEY, Entire Dbase
  • Fields (Note that it was dvt not dt):
    • Company Name, Ticker, CUSIP, Stock Exchange Code, state
    • naics, sic,
    • addzip, city, loc, prcc_f, ipodate, fyear
    • act, artfs, at, ceq, che, csho, dlc, dltt, dvt, dvc, dvp, ebitda, emp, esubc, ibc, intan, lct, lt, ni, oibdp, opiti, re, revt, sale, txditc, txt, xidoc, xint, xrd, xsga, xstfws

This was pulled twice, with OPITI included the second time (which also went to July 2019). The results are in e28f8123921e8343.txt which are saved to Z:\lbo. The request summary was saved to E:\projects\lbo.

Data Request ID	e28f8123921e8343
Libraries/Data Sets	compa/funda  /
Frequency/Date Range	ann / 01Jan1975 - 31Jul2019
Search Variable	GVKEY
Input Codes 
all item(s)	
Conditional Statements	n/a
Output format/Compression	tab /
Extra Variables and Parameters Selected	C  INDL FS  STD

LBO data

The LBO data is being drawn from a spreadsheet provided by Brian. The latest version is 484_LBOs_Outcomes_Master_Table.xlsx

Ed took the following columns:

dealnumb	fyear	gvkey	name	goprdt	extdt1	exttype1	sic

Other columns were excluded because they were either constant across records or derivative. In addition, one record, Aurora Electronics Inc, was excluded as it didn't have a GVKEY.

The GVKEYs have been through Excel, which treated them as numbers and omitted their leading zeros. They will have to loaded as text and then reprocessed as follows:

SELECT LPAD(gvkey::text, 6, '0') as gvkey

There are currently supposed to be 521 LBOs, and we have data on 483 (these are exported to Z:\lbo\LBOBase.txt). We are therefore missing data on 521-483=38. Their names and dealnums (including for Aurora Electronics Inc) into E:\projects\lbo\MissingLBOFirmNames.txt. The DISTINCT conm (names) and GVKEYs of all COMPUSTAT firms are in DistinctCSconm.txt (37759). These were matched using the Matcher. The results are in MissingLBOFirmNames.txt-DistinctCSconm.txt.matched, which is also a sheet called Matching in 484_LBOs_Outcomes_Master_Table.xlsx, as well as posted below.

File1:Coname          |File1:Dealnu|File2:conm               |File2:gvkey|Normalization
Baldwin Piano & Organ |          19|BALDWIN PIANO & ORGAN CO |      12656|Hall
Pamida Inc            |          70|PAMIDA INC               |       8306|Hall-Warning:Multiple
Pamida Inc            |          70|PAMIDA INC-DEL           |      14838|Hall-Warning:Multiple
Boys Markets Inc      |          71|BOYS MARKETS INC         |      13739|Hall
Budget Rent-A-Car Corp|          77|BUDGET RENT-A-CAR CORP   |      13394|Hall
Amphenol Corp         |          97|AMPHENOL CORP            |      14282|Hall
Eagle Food Centers    |         114|EAGLE FOOD CENTERS INC   |      15424|Hall
Grand Union Co        |         132|GRAND UNION CO           |      61094|Hall-Warning:Multiple
Grand Union Co        |         132|GRAND UNION CO  -OLD     |       5261|Hall-Warning:Multiple
Grand Union Co        |         208|GRAND UNION CO           |      61094|Hall-Warning:Multiple
Grand Union Co        |         208|GRAND UNION CO  -OLD     |       5261|Hall-Warning:Multiple
Advantage Health Corp |         225|ADVANTAGE HEALTH CORP    |      24927|Hall
Gruntal Financial     |         250|GRUNTAL FINANCIAL CORP   |      11710|Hall
Constar International |         294|CONSTAR INTERNATIONAL INC|       4049|Hall
Woodstream Corp       |         393|WOODSTREAM CORP          |      11578|Hall
Loehmanns Holdings Inc|         406|LOEHMANN'S HOLDINGS INC  |      15346|Hall
Collins Industries Inc|         491|COLLINS INDUSTRIES INC   |       3178|Hall
EMPIRE INC.           |         541|EMPIRE CO LTD  -CL A     |      14312|Hall
PANNILL KNITTING      |         551|PANNILL KNITTING CO INC  |      12325|Hall
GENERAL HOST CORP     |         565|GENERAL HOST CORP        |       5060|Hall
SBARRO INC            |         568|SBARRO INC               |       9441|Hall

There were 18 matches (counting both Grand Union Co), but Pamida matches to two GVKEYs and we need to work out which Grand Union Co is which. Both Pamida GVKEYs rightly belong to Pamida (they cover different time periods for the same firm), so the latter one was eliminated from the lbo dbase:

UPDATE COMPUSTAT SET gvkey='008306' WHERE GVKEY='014838';

Likewise, both Grand Unions appear to be the same firm, so their GVKEYs were merged:

UPDATE COMPUSTAT SET gvkey='061094' WHERE GVKEY='005261';

The GoPrvDate appears to be 1997 (from the spike in debt in the COMPUSTAT data), so dealnum 132 was kept (gopvt in 1998) and dealnum 208 was dropped (gopvt 1999).

There are some other issues with the data too. Two firms don't have gopvtdates, and so I can't create the last Fyear field for them or use them...

Dealnum  CoName
568      SBARRO INC

And several firms only have data way before or somewhat after their GoPvtDate:

dealnumb	fyear	gvkey	name	goprdt	extdt1	exttype1
250	1986	11710	Gruntal Financial	3/31/1997	4/15/2002	Strategic Buyer
393	1987	11578	Woodstream Corp	6/12/2003	4/15/2012	Still Private
19	1985	12656	Baldwin Piano & Organ Co	6/19/1984	7/15/1986	IPO
114	1988	15424	Eagle Food Centers	11/10/1987	2/15/2000	Bankruptcy

After this name-based matching process, there are 20 LBOs still unmatched. They are (also in StillMissingLBOs.txt in the dropbox and E:\lbo):

LBOName	DealNum	
--------       -------
AME Inc	196
American Electric Auto Co Inc	358
Anderson Tully Co	440
Becor Western Inc	94
Borg-Warner Corp	98
Camelot Music Holdings	234
Colt Industries Inc	140
Condec Corp	14
Consolidated Hydro Inc	165
EFTC Corp	342
Finl Institutions Ins Grp Ltd	242
Florida Steel Corp	166
Genesis Health Ventures Inc	318
International Controls Corp	100
Katz Capital Corp	237
Research Cottrell Inc	107
Steel Heddle Manufacturing Co	179

There are also 498 'good' LBO records, which are in E:\projects\lbo\LBOBase.txt, as well as the 'Export to Dbase' sheet of the spreadsheet. Note that GENERAL HOST CORP and SBARRO INC aren't currently included because we don't have LBO dates for them. Of the 498, 490 have CS data in t=0, t=-1 or t=-2. Extending this to t=-3 adds just a single LBO and so isn't worth it. Moving to just t=0 and t=-1 reduces the count to 434, which was a big loss.

Multiple LBOs

We can't use all of the LBOs because sometimes they are the same firm having more than one LBO. The following records have that issue:

 gvkey	count
013657	2
029909	2
013394	2
005976	2
009502	2
SELECT * FROM LBO WHERE GVKEY IN ('013657', '029909', '013394', '005976', '009502') ORDER BY GVKEY; 
dealnum	fyear	gvkey	name	gopvtdate	gopvtyear	exitdate	exityear	lbodurdays	lboduryears	exittype	exitcode	sic
281	1997	005976	Insilco Corp	1998-08-17	1998	2002-12-15	2002	1581	4.3315067	Bankruptcy	5	3678
178	1987	005976	Insilco Corp	1988-12-14	1988	1991-01-15	1991	762	2.0876713	Bankruptcy	5	3678
47	1984	009502	SCOA Industries Inc	1985-12-10	1985	2012-07-15	2012	9714	26.613699	Still Private	4	5331
555	1984	009502	SCOA	1985-12-01	1985	1997-03-01	1997	4108	11.254794	Financial Buyer	3	5331
77	1986	013394	Budget Rent-A-Car Corp	1986-09-30	1986	1987-04-15	1987	197	0.539726	IPO	1	
191	1988	013394	Budget Rent-A-Car Corp	1989-03-30	1989	1996-08-15	1996	2695	7.3835616	Strategic Buyer	2	7510
561	1996	013657	Alliance Imaging Inc acquired by Apollo	1997-07-15	1997	1999-11-03	1999	841	2.3041096	Financial Buyer	3 ...8071
185	1987	013657	Alliance Imaging Inc	1988-11-16	1988	1991-11-15	1991	1094	2.9972603	IPO	1	8071
347	1999	029909	Petco Animal Supplies Inc	2000-10-02	2000	2002-02-15	2002	501	1.3726027	IPO	1	5990
476	2005	029909	Petco Animal Supplies Inc_2	2006-10-26	2006	2012-07-15	2012	2089	5.7232876	Still Private	4	...5990

We are therefore going to drop the following dealnums, as they represent a second LBO of the same firm:

  • 281 Insilco Corp 1998-08-17
  • 47 SCOA Industries Inc 1985-12-10
  • 191 Budget Rent-A-Car Corp 1989-03-30
  • 561 Alliance Imaging Inc acquired by Apollo 1997-07-15
  • 476 Petco Animal Supplies Inc_2 2006-10-26

Data Issues

We are losing a lot of LBOs because of the exchange code restriction:

SELECT exchg, count(*) FROM (
    SELECT CSwNAICS.gvkey, max(exchg) as exchg FROM CSwNAICS
    JOIN LBOvalid ON LBOvalid.gvkey=CSwNAICS.gvkey

exchg	count
11	148  NYSE
14	142  NASDAQ
1	41   Priv (Can)
3	36   LBO!
0	35   Sub/Priv
12	35   Amex
19	24   Philly
13	22   OTC
7	1    Toronto
20	1    Unlisted Evaluated Equity

Our valid exchange codes are 11, 12, 14. The COMPUSTAT lookup table is below:

Stock Exchange Code
Variable Name = EXCHG
This item identifies the major exchange on which the company's Common/Ordinary Stock (Capital) is traded. Below are some of the common exchange codes. A complete list is available in the Compustat manual.

Code	Description	Exchange Country
0	Subsidiary/Private	Canada
0	Subsidiary/Private	United States
1	Non-traded Company or Security	Canada
1	Non-traded Company or Security	Germany
1	Non-traded Company or Security	Switzerland
1	Non-traded Company or Security	Taiwan
1	Non-traded Company or Security	United States
2	Consolidated Parent or Tracking Stock Company	United States
3	Leveraged Buyout	United States
4	Additional Company Record-PreSFAS 94, ProForma, PreAmended	United States
7	Toronto	Canada
8	Montreal Stock Exchange	Canada
9	Canadian Venture Exchange	Canada
10	Alberta Stock Exchange	Canada
11	New York Stock Exchange	United States
12	American Stock Exchange	United States
13	OTC Bulletin Board	United States
14	NASDAQ-NMS Stock Market	United States
15	Boston Stock Exchange	United States
16	Midwest Exchange (Chicago)	United States
17	Pacific Exchange	United States
18	Philadelphia Exchange	United States
19	Other-OTC	United States
20	Unlisted Evaluated Equity	United States

The problem seems to be that the exchange code is not temporal - it doesn't change over time, instead it is just a single fixed value for the entire COMPUSTAT record. Loc (i.e. country code) is also not temporal. Almost all of our LBOs are in the U.S.:

loc	count
USA	482

I am therefore going to restrict on being U.S. (which will cost us 3 LBOs) but not on exchange. In STATA we restrict on assets, and I'm going to trust Brian that these are all publicly traded firms undergoing LBOs. For the controls, we are just going to have to hope as well...

by gvkey: egen assetfilter = max(at >= 40 & at != .)

Variable Issues

In the initial version of the dataset there was a supposed COMPUSTAT variable named dt which was later labelled debt total in STATA. This variable appears to have been an error. It looks like DVT (dividends total) was pulled and accidentally labelled DT. There does not appear to be a DT variable in COMPUSTAT. Then in STATA, DT was used:

generate debtcapital:"Debt to capital" = dt / (at - lt)

I now load and label a DVT variable and calculate debt to capital as

generate debtcapital:"Debt to capital" = lt / (at - lt)


The new code is in:


Database is lbo and dbase server directory is /bulk/lbo

The COMPUSTAT query is saved on the mcnair (AmoDate) account, and the last result is saved as e28f8123921e8343.txt

The SQL code assembles data for all gvkey year combinations, so many observations have blank data. LBOValid restricts the LBOs to:

  • dealnum NOT IN (281,47,191,561,476)
  • CS data available one or two years before the LBO, and US HQ in CS. Note that the new CS data goes up to July 2019.
SELECT (gopvtyear-csfyear) as diff, count(*) FROM lbovalid GROUP BY (gopvtyear-csfyear);
 diff	count
 1	422
 2	56

The final data contains 478 LBOs, 180 of which have patents at some point in their lives.

The output of this SQL code is:


We then run, which (gvkeys, LBOs, LBOs w/patents ever):

  • Loads Masterv2-2.txt (83 vars, 1,294,171 obs): 30594, 478, 180
  • Restricts to declared assets >= 40m at some point: 17788, 472, 179
  • Requires non-LBO firms to be US non-OTC exchanges: 12131 (was 12106 with NNA exchanges), 472, 179

We find that Tobin's Q, lvg and revgrowth are heavily skewed so we log transform them or use a polynomial

gen tobinql=ln(tobinq)
gen revgrowthl=ln(1+revgrowth)
gen lvgsq=lvg*lvg

We also create markers:

  • LBOmarker, 1/0 for when the LBO takes place
  • LBOregmarker, 1/0 for the last year of CS data (-1 or -2)
  • Somepatenting, 1/0 for patent(s) at some point in history
  • matchfilter for whether the obs is in the match set (at -1)
  • regfilter is a legacy marker, for if you don't want to move accounting variables from t-2 to t-1.
  • savefilter for whether the obs will be saved for the analysis

The propensity score regression is:

. logit lbomatchmarker lvg tobinql revgrowth taxtoassets ebitdatoassets revtl if matchfilter==1, robust

Iteration 0:   log pseudolikelihood = -1027.1539  
Iteration 1:   log pseudolikelihood =  -1008.441  
Iteration 2:   log pseudolikelihood = -988.56148  
Iteration 3:   log pseudolikelihood = -987.85551  
Iteration 4:   log pseudolikelihood =  -986.7324  
Iteration 5:   log pseudolikelihood = -986.63042  
Iteration 6:   log pseudolikelihood = -986.62504  
Iteration 7:   log pseudolikelihood = -986.62501  

Logistic regression                             Number of obs     =     61,090
                                                Wald chi2(6)      =     129.38
                                                Prob > chi2       =     0.0000
Log pseudolikelihood = -986.62501               Pseudo R2         =     0.0395

               |               Robust
lbomatchmarker |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
           lvg |   .2638132    .039153     6.74   0.000     .1870748    .3405516
       tobinql |   -1.41471   .1566152    -9.03   0.000     -1.72167   -1.107749
     revgrowth |  -.0303849   .0741461    -0.41   0.682    -.1757085    .1149387
   taxtoassets |    2.43517   1.126206     2.16   0.031     .2278463    4.642494
ebitdatoassets |   3.148241   .7061438     4.46   0.000     1.764224    4.532257
         revtl |    .063724   .0270392     2.36   0.018     .0107282    .1167198
         _cons |  -6.485486   .2214917   -29.28   0.000    -6.919602    -6.05137
Note: 46 failures and 0 successes completely determined.

With the increased sample, we now find that LBOs are associated with reduced likelihood of patenting overall. However, this results is driven by the LBOs in period 1 from 1980 to 1995. In period 2, from 1995 to 2015, there is no statistically significant association. We also tried using decades but this wasn't instructive.

The matching scripts, LBOmatchingscript.jl and LBOmatching.jl, were rewritten by Ed to run with Julia 1.1.1. They use the LBOmatchmarker and matchfilter -- we are matching at t-1 (obs with logitpreg only at t-2 have it, and the variables used to create it, copied from t-2 to t-1). Note that we only export if matchfilter==1 to MatchInput2V1.txt. All 179 patenting LBOs are in there, however, not all of them have a logitpreg:

. count if hadlbo==1 & regfilter==1
. count if hadlbo==1 & regfilter==1 & n2 <.
. count if hadlbo==1 & regfilter==1 & logitpreg <. & n2 <.

This is because variables just aren't available to calculate them. This was explored and is the best it reasonably can be after applying some manual fixes (see below).

su tobinql if regfilter & hadlbo //175
su revgrowthl if regfilter & hadlbo //176
su ebitdatoassets if regfilter & hadlbo //179
 ...all other vars have full coverage

Note that produces the PatentingLikelihood.xml (and .txt) files and analysis. There is time variation in the results now.

We then match at t-1 using contemporary variables, inluding patentstock. To run the matching script:

julia -i LBOmatchscript.jl
 dumps you to an interactive shell afterwards to do diagnostics:
  println("Number of LBOs with complete data available for matching ", size(LBOs,1))

We also adjusted the bounds on patenting to 25% (and rounded the result to an integer) in the match script(s). This gave 173 matches from 173 candidates (i.e. we caught one unmatched at the 20% bound). See below for more information about catching the one unmatched. We also adjusted the script so that it just outputs the matches and throws out everything else. This can be uncommented if needed.

Now import the matches into the dbase and generate the patent statistics using PatentMeasures.sql. Note that it is not possible to generate the patent measures for all of Compustat -- the data is just too big.

In PatentMeasures.sql

  • We are using all citations, not citations_cleaned, as the all cited and citing are bigint in citations. This means we have higher counts this time around.
  • Originality is based on citations received and generality is based on citations made. There may have been an error in originality in the previous version (hard to tell - misnamed tables)
  • The Match5yrCites table is big: 170,169,703 obs -- it is the explorative/exploitive measures that are very hard to build. The citation measures could likely be built for every CS firm 1974-present.
  • Ultimately creates: MatchedMeasuresPortfolioV2-2.txt

Make sure that there are copies of PatentMeasures.sql and matchresults.txt in Z:/lbo. Then. to run the SQL script, ssh on to mother ( as researcher, and:

cd /bulk/lbo
psql lbo
\i PatentMeasures.sql

Then run The current versions of the datasets with their origins are:

  • -> MatchInput2V2.txt -> LBOmatchscript.jl -> matchresults.txt -> matchresultsV2-2.dta
  • -> MatchInput2V2.txt -> LBOmatchscript.jl -> matchresults.txt -> PatentMeasures.sql (on dbase lbo)-> MatchedMeasuresPortfolioV2-2.txt -> PatentStatisticsV2-2.dta
  • -> PreMergeRelevantV2-2.dta
  • CPI.dta
  • OldData.dta was updated to take into account that the matches are made at t-1. Its preparatory code was also entirely rewritten. Markers for LBOs included in the old data are loaded (as olddata and olddataflag, at the pair level) for reference. None of the old matches are refound in the new data -- these are entirely new matches, likely as the logit specification was changed.

The output of is:

  • Checks.log
  • MainTTexts.xls
  • DIDs.xml
  • BeforeVsAfter.xls

A quick build of the graphs is in MainTTests-Reprocessed.xlsx. Making a chart template is a good idea [1] but it won't let you save everything you'd like to do (especially it won't put on the error bars automatically). Ed's chart template was saved in C:\Users\ed\AppData\Roaming\Microsoft\Templates\Charts\LBOPaper.crtx. From what I recall the spec should be:

  • Times New Roman in a large font size (12pt?)
  • Black and white, with dash lines for controls and low line thickness (1pt for graph, 0.5pt for error bars)
  • Axis title(s) but no title. Legend to say control or LBO on graph itself to reduce footprint.

Catching the one unmatched

173 in, 1 unmatched:

1×9 DataFrame
│ Row │ gvkey │ year  │ lboregmarker │ patentstock │ indu3 │ logitpreg  │ hadlbo │ regfilter │ matchpair │
│     │ Int64 │ Int64 │ Int64        │ Int64       │ Int64 │ Float64    │ Int64  │ Int64     │ Float64   │
│ 1   │ 8367  │ 1983  │ 1            │ 27          │ 3     │ 0.00564947 │ 1      │ 1         │ -0.1      │

This is a firm that went private in 1985 and stayed that way (still private at exit year 2012). We are currently matching +/-1 2yrs around the reg date (i.e., -1 or -2 yrs before).

In regfilter=1, indu3=3, year>=1981 & <=1985, there are 923 observations (multiple per firm), of which 2 are LBOs:

gvkey	year	lboregmarker	patentstock	indu3	logitpreg	hadlbo	regfilter
7298	1983	1	0	3	0.002207293	1	1
8367	1983	1	27	3	0.005649468	1	1

But the nearest patent counts are 17,18,19,20,21...34,35. There are some close logitpregs for those counts:

gvkey	year	lboregmarker	patentstock	indu3	logitpreg	hadlbo	regfilter
4127	1983	0	18	3	0.006133826	0	1
4127	1984	0	20	3	0.006820232	0	1
4127	1985	0	21	3	0.008283807	0	1
4452	1981	0	35	3	0.007059322	0	1
4452	1982	0	40	3	0.00471063	0	1
5137	1981	0	34	3	0.004571791	0	1
10859	1981	0	17	3	0.008149391	0	1
10859	1982	0	17	3	0.010007644	0	1
10859	1983	0	17	3	0.006547268	0	1
10859	1984	0	19	3	0.005754435	0	1
10859	1985	0	19	3	0.005588403	0	1

We can catch several obs if we move to 25% bounds (especially if we round up/down to integers) on patenting. After doing this we have 173 matches!

Notes on manual fixes

First run matches 162 (163 but no match 66). Although the 179 are going in, some are missing indu3 (because they are missing n2) and some are missing logitpreg.

. count if hadlbo==1 & regfilter==1
. count if hadlbo==1 & regfilter==1 & logitpreg <.
. count if hadlbo==1 & regfilter==1 & logitpreg <. & indu3 <.
. count if hadlbo==1 & regfilter==1 & logitpreg <. & n2 <.

One solution was to create an additional sic to naics lookup and then union it with the official one. Because of the courseness of the newly included SIC and so NAICS codes, this will only work when using 2 digit NAICS codes. This got us to 171 in and 170 matched.

julia> diagdf
1×9 DataFrame
│ Row │ gvkey │ year  │ lboregmarker │ patentstock │ indu3 │ logitpreg  │ hadlbo │ regfilter │ matchpair │
│     │ Int64 │ Int64 │ Int64        │ Int64       │ Int64 │ Float64    │ Int64  │ Int64     │ Float64   │
│ 1   │ 8367  │ 1983  │ 1            │ 27          │ 3     │ 0.00616416 │ 1      │ 1         │ -0.1      │

//Diagnostics on logitpreg

su gvkey if regfilter & hadlbo //179
su lvg if regfilter & hadlbo //177
su lvgsq if regfilter & hadlbo //177
su tobinql if regfilter & hadlbo //172
su revgrowthl if regfilter & hadlbo //174
su taxtoassets if regfilter & hadlbo //177
su ebitdatoassets if regfilter & hadlbo //176
su revtl if regfilter & hadlbo //177

revgrowthl first...

tab gvkey if regfilter & hadlbo & revgrowthl==.
  Compustat |
      gvkey |      Freq.     Percent        Cum.
       3051 |          1       20.00       20.00  FIXED -1 is 1992, which has no vals. -2 has vals.
      14276 |          1       20.00       40.00  1987,-2 and has data for just that year. Would need revt for 1986.
      14628 |          1       20.00       60.00  1987,-1 and no data for -2. Could use 0 but some leverage step up there already!
      15309 |          1       20.00       80.00  1988,-1 and had datga for just that year. Would need revt for 1987.
      25389 |          1       20.00      100.00  FIXED -1 is 1995, which has no vals. -2 has vals. 
      Total |          5      100.00

The two marked fixed where fixed in the data:

UPDATE LBOvalid SET CSfyear = 1991 WHERE gvkey='003051';
UPDATE LBOvalid SET CSfyear = 1994 WHERE gvkey='025389';

Next go after Tobin's q. The definition used is:

generate tobinq:"Tobin's Q" = (csho*prcc_f)+lt/at
. tab gvkey if regfilter & hadlbo & tobinq==. & gvkey!=3051 & gvkey!=25389
  Compustat |
      gvkey |      Freq.     Percent        Cum.
       6005 |          1       20.00       20.00 1994 (1992), missing prcc_f 
      10160 |          1       20.00       40.00 1990 (1989), missing prcc_f 
      14628 |          1       20.00       60.00 1988 (1987), missing prcc_f but no prev year anyway... 
      15309 |          1       20.00       80.00 No prev year...
     118321 |          1       20.00      100.00 2007(2005), missing csho. Could use previous year's of 42.266..
      Total |          5      100.00

Ignoring the two with no previous year...

gvkey	year	incs	tic	conm	at	csho	lt	exchg	prcc_f	gopvtyear	exityear	csfyear	hadlbo
6005	1992	1	3IGRR	INTEGRATED RESOURCES INC	1338.19	8.929	2429.61	13		1994	1996	1992	1
10160	1989	1	AG1	SUNBEAM CORP	785.105	.001	444.262	0		1990	1992	1989	1
118321	2005	1	SRNA.1	SERENA SOFTWARE INC	671.61		369.614	14	23.61	2006	2012	2005	1	1.229053

Found SEC order[2] for Sunbeam for artificially inflating its stock price: "From the last quarter of 1996 until June 1998, Sunbeam Corporation's senior management created the illusion of a successful restructuring of Sunbeam in order to inflate its stock price and thus improve its value as an acquisition target..."

Couldn't find prices for either Sunbeam or Intergrated, so used the previous year's csho for Serena and moved on.

UPDATE CS SET csho=42.266 WHERE gvkey='118321' AND fyear=2005;

Review of Old Build

This section was written as review of the old build to assist with the current build. Readers can skip it unless they are interested in legacy construction.

First things

Restore allpatentsprocessed (as researcher):

cd /bulk/mcnair/backups
createdb allpatentsprocessed
pg_restore -Fc -d allpatentsprocessed allpatentsprocessed_Fc_230818.dump

It looks like DataExtension.sql in E:\mcnair\Projects\LBO\Clean\Ed Run used usptopad on the old rdp to make tables for all of COMPUSTAT and import them into allpatentsprocessed. usptopad was not archived in the Big Move -- we might want to back it (and other rdp dbases) up!

It also looks like some development notes are missing. I can't find them in either E:\mcnair\Projects\LBO\Clean\Ed Run or the LBO dropbox, or Dropbox\coauthoredprojects\NonMcNair.

However, loads/merges the following:

  • PreMergeRelevant.dta, which is produced by, after a keep if savefilter. The sole input to is masterv1-0.txt
  • matchresultsV7, which comes from matchresults - V7 nofe lagps 92 R0 indu3 20pc 5yr.txt, which is generated by running LBOmatchscript.jl on MatchInputV7.txt, which in turn is made by
  • PatentStatistics.dta, which needs to be produced using PatentMeasures.sql on allpatentsprocessed once all matches have been identified
  • CPI.dta
export delimited gvkey year lboentry lboexit patentflow patentstock lagpatentstock indu3 indu5 indu6 decade logitpreg logitpregwindu3fe logitpregwfe 
hadlbo matchfilter somepatenting using "E:\McNair\Projects\LBO\Clean\Ed Run\MatchInputV7.txt", delimiter(tab) nolabel quote replace


PatentMeasures.sql takes MatchedGVKeys.txt (184) and builds the patent measures (see below) in allpatentsprocessed, producing MatchedMeasuresPortfolio.txt (8468), which is merged in after being resaved as PatentStatistics.dta.

DROP TABLE MatchedMeasuresPortfolio;
CREATE TABLE MatchedMeasuresPortfolio AS 
SELECT gvkeyint as gvkey, year, avg(exploitive) as avgexploitive, 
avg(Explorative70) as avgexplorative70, avg(Explorative80) as avgexplorative80, avg(Explorative90) as avgexplorative90,
avg(gen) as avggen, avg(genunbiased) as avggenub, avg(org) as avgorg, avg(orgunbiased) as avgorgub,
avg(citesmade) as avgcitesmade, avg(claims) as avgclaims
FROM MatchedMeasuresBase
GROUP BY gvkeyint, year
ORDER BY gvkeyint, year;

Required input tables are:

  • HistMasterFile, which is the NBER historical masterfile FROM 'historical_masterfile.csv'
  • citations
  • patents

All three are in allpatentsprocessed


The patentexpiry table was built in allpatentsprocessed by running PatentExpiry.sql, which is in E:\mcnair\Projects\LBO\Clean\Ed Run and was last updated in March 2018. Most of the script is comment. The key elements are:

The stack is as follows:

  • patents is an input
  • Maintfeeevents2 FROM 'MaintFeeEvents_20170410b.txt' (COUNT 14708262)
  • PatentExpirationRules2 FROM 'patentExpirationRules2.txt' (COUNT 27)
  • Maintfeeevents2,patentexpirationrules -> patentfeebaseevents (4402215)
  • FULL OUTER (patents, patentfeebaseevents) -> patentfeebasefull (5463947) -> PatentExpiryBase (5463947) -> PatentExpiry (5463947)

Note that patentfeebaseevents is made from PatentExpirationRules and not PatentExpirationRules2. Both list the same fee codes and description, but PatentExpirationRules2 has a termlength column whereas PatentExpirationRules has daysincrement and daysduration columns. daysduration is the same as termlength, except when daysduration=7300 then termlength=6205. It seems from the code that PatentExpirationRules is correct, as there is a manual fix when max(daysduration)=7300.

DROP TABLE PatentExpiry;
SELECT  patentnumber, FilingDate, grantdate, maxdaysduration, nomaintenance, maxtermfix1, maxtermfix2,
	--Old patents get: grant plus 17yrs
	WHEN nomaintenance = 1 THEN grantdate + 6205
		-- Patents grandfathered in during transition in 1995 to new file date + 20 year term limit get: Later of issue date + 17 years or file date + 20 years used
		WHEN (maxtermfix1 = 1 AND maxdaysduration = 7300) 
		THEN GREATEST(FilingDate + 7300, grantdate + 6205)
			--Patents after June 8 1995 but before May 28 2000 get: file date + 20 years
			WHEN (maxtermfix2 = 1 AND maxdaysduration = 7300) 
			THEN FilingDate + 7300
				--Patents after May 28 2000 (this might not be automatic and require an appeal though) get: later of issue date + 17 years or file date + 20 years 
				WHEN (maxtermfix2 = 2 AND maxdaysduration = 7300)
				THEN GREATEST(FilingDate + 7300, grantdate + 6205)
	ELSE grantdate + maxdaysduration
	END AS expdate,
	CASE WHEN nomaintenance = 1 OR maxdaysduration = 7300 THEN 1::int ELSE 0::int END AS HeldToTerm
FROM PatentExpiryBase;

Note: It also looks like Ed did the same fixes in: E:\mcnair\Projects\LBO\Clean\Ed Run\Building the Patent Start End Table Revised.sql

Analyzing DataExtension.sql

DataExtension.sql was last saved in July 2018. The output of this script is masterv1-0.txt (last produced in March 2018), which is the sole input to This section therefore traces the construction of masterv1-0.txt.

Key stacks:

  • StataStaging4 is an input
  • gvkeyyearpatentdata (395213)
  • statastaging4, (statastaging4->) LBOSeriesProcessed (556) -> LBODetailsWDurr (1691834)
  • StataStaging4 -> NaicsSICBase (36779) & ExchangeOK (16236)
  • (StataStaging4,LBODetailsWDurr,gvkeyyearpatentdata,NaicsSICBase | ExchangeOK) -> Master (746856)

gvkeyyearpatentdata appears to have been built for all of COMPUSTAT:

--# Build yearslist, GvkeyYear

DROP TABLE yearslist;
CREATE TABLE yearslist AS 
SELECT generate_series(1975,2017) as year;

DROP TABLE Allgvkeys;
SELECT DISTINCT gvkey FROM patenthistorybase ORDER BY gvkey;

SELECT gvkey, year
FROM yearslist, Allgvkeys
ORDER BY gvkey, year;

Key stacks:

  • patentexpiry is an input
  • cspatentbuys FROM 'cspatentbuys.txt' (304864), which came from usptopad
  • cspatentsales FROM 'cspatentsales.txt' (451730), which came from usptopad
  • stdcompnames is an input
  • assignees (is an input) -> Patentsassigneessimple 5426556
  • (Patentsassigneessimple,MatchOrgnameSelf (loaded from orgnamedctmatched.txt 501232), MatchstdcompnamedctOrgname (loaded from Matchstdcompnamedctorgnamestd.txt 8361,stdcompnames) -> patentgrantgvkey
  • patentgrantgvkey (1329006), cspatentbuysutlgvkey (272620) ... -> patentgrantbuy
  • (patentgrantbuy,patentsale,patentexpired)-> patenthistorybase (1513045)
  • patenthistorybase ...,GvkeyYear -> gvkeyyearpatentdata (395213)

Therefore needed inputs to DataExtension.sql are:

  • patentexpiry
  • stdcompnames
  • assignees
  • StataStaging4

Note that Patentsassigneessimple is not in allpatentsprocessed but assignee is, and it can easy be rebuilt:

SELECT COUNT(*) FROM assignees WHERE orgname IS NOT NULL AND patentnumber IS NOT NULL;

There are two obvious master tables that we could export and reuse. The first is patenthistwoprebuys. This is patenthistorybase (1513045) but WHERE startdate < enddate. It has 1,495,486 records and the unit of observation is a patent associated with a firm. The keys are gvkey and patentnumber.

                Table "public.patenthistwoprebuys"
    Column    |       Type       | Collation | Nullable | Default
 gvkey        | text             |           |          |
 patentnumber | integer          |           |          |
 startdate    | date             |           |          |
 startyear    | double precision |           |          |
 grants       | integer          |           |          |
 buy          | integer          |           |          |
 enddate      | date             |           |          |
 endyear      | double precision |           |          |
 sell         | integer          |           |          |
 term         | integer          |           |          |
 notrenewed   | integer          |           |          |
 unknownexp   | integer          |           |          |

The second master table is gvkeyyearpatentdata. This is an aggregated table, where the unit of observation is gvkey-year, where year goes from 1975-2017 (see GvkeyYear above) and contains zeros. There are 395,213 observations (i.e., one for each year for 9,191 unique GVKEYs).

              Table "public.gvkeyyearpatentdata"
       Column       |  Type   | Collation | Nullable | Default
 gvkey              | text    |           |          |
 year               | integer |           |          |
 nogrants           | bigint  |           |          |
 nobuy              | bigint  |           |          |
 nosell             | bigint  |           |          |
 noterm             | bigint  |           |          |
 nonotnenewed       | bigint  |           |          |
 nounknownexp       | bigint  |           |          |
 portfoliochg       | bigint  |           |          |
 portfolio          | numeric |           |          |
 portfoliocitations | bigint  |           |          |
 citesperpatent     | numeric |           |          |


So the dataset is built from StataStaging4, as well as raw tables.

Variables used include:

  • gvkey, year
  • lboentry,lboexit
  • private,ipodate
  • naics, sic, exchg
  • datadate, fyear, tic, cusip, name_compustat, fyr, act, artfs, at, ceq, che, csho, dlc, dltt, dt, dvc, dvp, ebitda, emp, esubc, ibc, intan, lct, lt, ni, oibdp, opiti, re, revt, sale, txditc, txt, xidoc, xint, xrd, xsga, xstfws,
  • exchg, prcc_f, addzip, city, loc

It therefore seems that statastaging4 is all of COMPUSTAT with some LBO data included... However, I can't find the code that actually built it. There is code to build StataStaging5 in

E:\mcnair\Projects\LBO\Clean\Ed Run\LBO Innovation SQL Script.sql

And the best instructions seem to be to run parts 1, 2, and 3A of that script.

Version 1

Files are in E:\McNair\Projects\LBO\Clean\Ed Run

The main STATA scripts:


The AllResults.xlxs file is the current results set.

Notes from the call:

  • We have 92 LBOs that get matched.
    • Have one or more patents in your lifetime (for both LBOs and non-LBOs)
    • We are matching on a logit without fe. Can match using indu3. Can't use indu5 or indu6.
    • Matches require same indu3, 20pc of the patentstock year0, 5yr window (-2 to +2)

Ed has to fix matches so that they are on patentstock in year -1.

Once Ed has the FINAL dataset built, he can go back and build the following patent statistics:

  • Originality
  • Generality
  • Claims made
  • Citations made

Also consider (from the Manso paper):

  • Exploratory patents (within-firm vs. not paper)

Find out what Lerner did for before vs. after and mimic

Data Processing notes

The process:

  1. In dbase allpatentsprocessed on 181. Local files in Z:\LBO\Clean
    1. Run LBO Innovation SQL Script.sql sections 1,2, and 3A
    2. Run Building the Patent Start End Table Revised.sql
    3. Run LBO Innovation SQL Script.sql sections 3B, 3C, and 4. This produces statastaging5.txt.
  2. In STATA
    1. Run It takes statastaging5.txt and produces STATAdataset5wvars.dta
    2. Run It takes STATAdataset5wvars.dta and produces STATApredictLBOclean.dta and STATApredictLBOclean5.txt
  3. In Julia
    1. Run LBOmatchscript.jl. It takes STATApredictLBOclean5.txt and produces matchresults.txt
  4. In STATA
    1. Run It tales matchresults.txt and produces MatchResults.dta
    2. Run It takes STATApredictLBOclean.dta and merges it with MatchResults.dta. It then does the analysis.

See also: James New Run Documentation.txt This provides an overall summary, though some of the details are a little off.

Adding Patent assignment data

The patent assignement data is loaded into usptopad on the rdp (not the dbase server). The load script and processing script are in:

E:\McNair\Projects\USPTO Patent Assignment Dataset

The process script (ProcessAssignments.sql) was modified for this project:

  • Create list of distinct or_name (from assignors) for inter firm assignments
  • Create list of distinct ee_name (from assignees) for inter firm assignments

Build the patent data:

  • DataExtension.sql

New Version

Matching LBOs (Julia)

Working Notes

James' rough notes are in E:\McNair\Projects\LBO\Working Notes. These are date ordered and more for James than anyone else.

Files are generally in


Separate code to build the patent stocks base type (start date, expiry date) is in:

Building the Patent Start End Table.sql

Note that this code is out of date and needs to be replaced.

The dbase is allpatentsprocessed

LBO Innovation SQL Script.sql is the sql script to process the data

  • Has comments on matcher use
  • Note - Can't run this file!
  • Bbase txt files (in and out) in Z:\LBO\Clean
  • statastaging4.txt (current final dataset) is exported from allpatentsprocessed

To work with the data in STATA:

  • Run
    • Takes statastaging4.txt (currently as dta)
    • Makes that statastaging4.dta
  • Run
    • Takes statastaging4.dta
    • Makes STATApredictLBOcleanb.dta (which has a bug fix over STATApredictLBOclean.dta)
    • Also exports STATApredictLBOclean.txt, which contains just the variables relevant for the matching

The matching scripts (in Julia - run as admin to access modules) are in folder

E:\McNair\Projects\LBO\New matching 

Note that the instructions below are for the old version. James has now fixed the matching script to export tab-delimited text of matches that can be merged back into a STATA dataset, so the diff-in-diff can be done in STATA.

The documentation for the new version can be found at Matching_LBOs_(Julia)

The old version instructions are:

  • From terminal in E:\McNair\Projects\LBO\New matching
  • Launch Julia: julia
  • Run script: include("LBODID.jl")

The old files are

 Loads data from E:\McNair\Projects\LBO\Clean\STATApredictLBOclean.txt
 Adds col, removes dead cases, breaks data into LBO & non-LBO
 Line 55: mscore = :logitpw;
 Lines 89-96: Define criteria for what constitutes a match - store as string evaluated for truth
 Line 101: include("LBOmatching.jl") #Loading in user-defined methods for matching
 Line 104: Do the matching!
 Beyond line 105 are doing the diff-in-diff


This is a clean reboot of the LBO project with clearer documentation. The purpose of the project is to analyze the innovative activity of firms after they undergo LBOs. For counterfactuals in the diff-in-diff framework, we match LBO firms to non-LBO using decade, industry, patenting history and propensity score.

See the Construction of Dataset section for the steps taken in SQL to build the dataset.

Key Tables

In E:\McNair\Projects\LBO:

  • LBO_list_and_data.xlsx: Dataset from Ayash and Schuett (2016) (Hand-cleaned list of LBOs from SDC)
    • first sheet = main dataset
    • second sheet = raw SDC data, dealnumb to GVkey linkage (all public LBOS, including divisional buyouts)
    • third sheet = COMPUSTAT lookup (unused)
    • fourth sheet = Private-to-private transactions
  • failedLBOs2.xlsx: list of withdrawn LBOs from SDC
    • first sheet = data
    • second sheet = query specifications

In Z:\LBO\Clean:

  • compustatnames.txt: List of distinct company names from COMPUSTAT pull. Used to generate stdcompnames.txt
  • LBO COMPUSTAT PULL 4.txt: Result of COMPUSTAT pull
  • LBOdata_sheetX.txt: See LBO_list_and_data.xlsx in E:\McNair\Projects\LBO
    • LBOdata_sheet1.txt: First sheet of LBO_list_and_data.xlsx as tab-delimited txt (NOTE: we might have included entries marked as excluded; this is dealt with in SQL code)
    • LBOdata_sheet2.txt: Second sheet of LBO_list_and_data.xlsx as tab-delimited txt
    • LBOdata_sheet3.txt: Third sheet of LBO_list_and_data.xlsx as tab-delimited txt
    • LBOdata_sheet4.txt: Fourth sheet of LBO_list_and_data.xlsx as tab-delimited txt
  • stdcompnames.txt: Standardized COMPUSTAT company names (and non-standardized name lookup, gvkey). Used for matching to LBO and patent data.
  • matchorgname.txt: Cross-ref between standardized assignee names and raw assignee names
  • assigneexgvkey.txt: Name-based match between COMPUSTAT gvkey and raw assignee names
  • LBOmatchednames.txt: Name-based match between standardized LBO names and COMPUSTAT names; used to include additional matches of LBOs to COMPUSTAT companies

In allpatentsprocessed database:

  • lbocompustat4: See LBO COMPUSTAT PULL 4.txt
  • compustatnames: See compustatnames.txt
  • stdcompnames: See stdcompnames.txt
  • LBOdata: First sheet of Brian's data; main dataset
  • LBOsheet2: Second sheet of Brian's data; used to construct gvkey-to-dealnumb link
  • yearsbygvkeys: Composite primary key (all years 1970 to 2015 cross-producted with unique gvkeys); all other data is joined onto this
  • gvkeyxdealnumbfinal: Final gvkey to dealnumb lookup table. Includes results of name-based matching.
  • yearsbygvkeysfilled2: Compustat dataset with LBO events flagged

Construction of Dataset (SQL)

See documentation in E:\McNair\Projects\LBO\Clean\LBO Innovation SQL Script.txt

NOTE: All following work is done in the postgresql database "allpatentsprocessed"

Basic outline is as follows:

  • Imported COMPUSTAT Data
    • Built composite primary key from unique gvkeys cartesian-producted with years 1970 to 2015 (all other datasets will be joined onto this)
      • This is yearsbygvkeys in SQL database
    • Stripped out list of unique company names from COMPUSTAT data
    • Used matcher to construct standardized company names, with lookup table to original names
    • Joined gvkeys back into lookup table, result exported as stdcompnames.txt
  • Imported Brian's hand-cleaned SDC data on LBOs
    • Sheet 1 contains main list of LBOs
    • Sheet 2 contains gvkey to dealnumb link
    • Sheet 3 contains COMPUSTAT output (unused)
    • Sheet 4 contains private-to-private LBOs
  • Used SDC data to flag LBO events for COMPUSTAT data
    • Goal is to create gvkey lookup for each dealnumb, then query for number of deals associated with each gvkey in each year
    • First used appropriate columns from sheet 2 to make preliminary lookup table
  • Note: This left many LBOs unmatched. Applied name-based matching to include more deals (esp. all private-to-private)
    • Copied dealnumb and company name from LBO_List_and_data.xlsx into textpad
    • Manually removed extraneous symbols and words, saved result as LBO_Cleaned_data.xlsx
    • Performed matching by company name with stdcompnames.txt (which included gvkeys) to get alternative gvkey-dealnumb lookup table
    • Appended non-contradictory results to previous gvkey-dealnumb lookup table (to make SQL table gvkeyxdealnumbfinal)
    • Used this table to flag LBO events in Compustat dataset for appropriate gvkeys in appropriate year(s)
      • Result is SQL table yearsbygvkeysfilled2
  • Used USPTO patents database to tag yearly patent stocks and flows for all companies in COMPUSTAT
    • Matched the list of standardized patent assignee names with list of standardized COMPUSTAT company names (links gvkey to patent assignee names); result is cross-ref table assigneexgvkey
    • Used table patentstartend (which has start and end dates of when patent was active; see subsection below) and assigneexgvkey to determine, for each gvkey-year combination, the number of active patents
    • Result is SQL table yearsbygvkeysfilled3
  • Final dataset is exported as tab-delimited file STATAstaging4.txt

Patent start-end table

To construct the table of all patents and their grant dates and expiration dates (i.e., start-end of being an active patent), we did the following:

(code/detailed documentation is in E:\mcnair\projects\LBO\Clean\Building the Patent Start End Table.txt)

  • Imported all patent maintenance fee events from the file MaintFeeEvents_20160613.txt
  • Imported lookup table patentExpirationRules.txt containing patent lifetime implied by each maintenance event
  • For each patent, found the latest expiration date implied by any recorded maintenance event.

STATA analysis

Data Pulls


(Query saved in WRDS as LBO COMPUSTAT PULL 4)


Date Range: 1970-01 to 2016-10
Selection: Entire Database
Output: Tab-delimited
Date Format: YYMMDDs10
Saved in: Z:\LBO\Clean\LBO COMPUSTAT PULL 4.txt


Company Name
Ticker Symbol
Stock Exchange Code
Fiscal Year-End
ADDZIP -- Postal Code
CITY -- City
IPODATE -- Company Initial Public Offering Date
LOC -- Current ISO Country Code - Headquarters
NAICS -- North American Industry Classification Code
SIC -- Standard Industry Classification Code
STATE -- State/Province
FYEAR -- Data Year - Fiscal
ACT -- Current Assets - Total
ARTFS -- Accounts Receivable/Debtors - Total
AT -- Assets - Total
CEQ -- Common/Ordinary Equity - Total
CHE -- Cash and Short-Term Investments
DLC -- Debt in Current Liabilities - Total
DLTT -- Long-Term Debt - Total
INTAN -- Intangible Assets - Total
LCT -- Current Liabilities - Total
LT -- Liabilities - Total
RE -- Retained Earnings
TXDITC -- Deferred Taxes and Investment Tax Credit
DVC -- Dividends Common/Ordinary
DVP -- Dividends - Preferred/Preference
EBITDA -- Earnings Before Interest
NI -- Net Income (Loss)
OIBDP -- Operating Income Before Depreciation
OPITI -- Operating Income - Total
REVT -- Revenue - Total
SALE -- Sales/Turnover (Net)
TXT -- Income Taxes - Total
XINT -- Interest and Related Expense - Total
XRD -- Research and Development Expense
XSGA -- Selling, General and Administrative Expense
XSTFWS -- Staff Expense - Wages and Salaries
ESUBC -- Equity in Net Loss Earnings
IBC -- Income Before Extraordinary Items (Cash Flow)
XIDOC -- Extraordinary Items and Discontinued Operations (Cash Flow)
CSHO -- Common Shares Outstanding
DT -- Total Debt Including Current
EMP -- Employees
PRCC_F -- Price Close - Annual - Fiscal


There are three distinct pulls from SDC. Session details should be included as the second sheet of the respective Excel worksheets.

  • Brian Ayash's hand-cleaned list of LBOs (with gvkeys). This exists as the list_and_data.xlsx table.
  • List of failed LBOs. This exists as the failedLBOs2.xlsx table.

Old Version


This academic paper investigates the recent trend of leveraged buyouts (LBOs) in the US financial market and assesses their impact on innovative activities within firms that undergo LBOs. To do this, we match firms that undergo LBOs with similar firms that do not undergo LBOs using a dynamic hazard rate model. Using these matched firms as synthetic controls, we perform a difference in difference analysis to compare the trajectory of innovative activity of LBO firms with that of non-LBO firms (e.g., patenting, intangible assets).

See Leverage Buyout Innovation (Clean) for reboot of this project with clearer documentation.

Research Outline

How to measure innovation/effect on innovation

  • Patents:
    • Grants - Flow vs. stock
    • Patents sold (re-assignments)
    • Patents securitized?
  • Accounting variables:
    • R&D spending (spotty coverage, Brian's data and COMPUSTAT)
    • Intangible assets (Brian's data and COMPUSTAT)

Factors influencing LBO impact

  • Industry (note - depends on distribution)
    • Manufacturing
    • Services
    • High tech?
  • Type:
    • Public to private (main focus, ~330obs)
    • Private to private (23 in final STATA set for analysis)
      • Only 28 of ~60 private deals in SQL have patents
  • Time period
    • 1980s, 90s, 00s, and teens (currently teens in with 00s)

Note: Buyout type (MBO/LBO), Exit type (IPO vs. Secondary vs. Strategic Buyer, Financial Buyer, Bankruptcy, Still Private, Out of Court Rest.), Innovator vs. Flipper (how do we decide?) and other categorizations might be useful?

Work Scripts


See LBO Data Description for detailed work.

  • E:\McNair\Projects\LBO\LBO academic paper script pre compustat.txt
  • E:\McNair\Projects\LBO\LBO academic paper script post compustat.txt


See LBO Data Description for detailed work.

  • E:\McNair\Projects\LBO\
    • Script for generating data labels, compound variables and lagged variables, then setting dataset as panel survival analysis data
  • E:\McNair\Projects\LBO\
    • Script for computing summary statistics and regressions (eventually)

Work Description

LBO Data and Patent Data

  • Get original LBO Data, clean it, and match it against itself
  • Get patent data, draw the distinct assignees, and match it against itself. Note - Redo this step with new patent data
  • Match the LBO data to the patent data
  • Pull patent numbers and grant dates and join those variables to matched LBO patent data
  • Final table includes standardized names, number of patents, minimum grant years, maximum grant years, average grant years, and whether the LBO was public to private or private to private

Introduction of Compustat and Preparation for Stata

  • Pull Compustat data, match Compustat names against themselves to standardize
  • Connect Compustat data to LBO Data using the GVKeys in the SDC sheet of the original LBO excel file
  • Connect Compustat data to LBO data and Patent data in a single table where each company has 47 rows of data, one for each year 1970 - 2016
  • Final table includes standardized names, GVKeys, years, number of patents per year, LBO entries/exits in each year, and all pulled Compustat variables

Stata Work

  • Generate variables, compound and lagged
  • Set data up as panel
  • Generate hazard model
  • Conduct statistical analysis

Project Sub-Pages

LBO Lit Review

LBO Data Description