Silicon Valley Bank Data Project

In January 2011, two representatives of Silicon Valley Bank (SVB) gave a presentation at the Haas business school, organized by Jerry Engel of the Lester Center for Entrepreneurship and Innovation. The purpose of this presentation was to explore the possibility of having Haas PhD students and faculty conduct research with SVB's data. This page details findings from that meeting and the follow-up on-site meeting between SVB and a Haas representative, Ed Egan.

The January 18th Meeting (at Haas)


From Haas:

  • Jerry Engel, Faculty Director, Lester Center for Entrepreneurship and Innovation
  • Toby Stuart, Visiting Faculty from Harvard Business School
  • Javed Ahmed, PhD Candidate in finance
  • Ron Berman, PhD Student in marketing
  • Ed Egan, PhD Student in BPP
  • Orie Shelef, PhD Student in BPP

The following are interested in this data but were unable to attend the meeting:

  • Sharat Raghavan, PhD Student in BPP
  • Neil Thompson, PhD Candidate in BPP

From SVB:

  • Michael Graham, Senior Managing Director of SVB Analytics
  • Dave Krimm, Head of Strategy and Research for SVB Analytics

Where: Haas School of Business

Notes From the Meeting

Joining Data

The meeting opened with an impassioned plea from Toby, echoed immediately by the other academics, to be allowed to join the data to other datasets. This could be accomplished in a number of ways, including leaving 'identity identifiers', such as names or numbers that are linked to names, in the data. SVB did not seem averse to this.

Obvious examples of datasets that would be joined to the data include:

  • Thomson VentureXpert
  • SDC Mergers and Acquisitions
  • Global New Issues data
  • The NBER Patent Data, or other patent data
  • Bankruptcy data

Generally, joined data would be available to Haas students/faculty through our library licenses, but not to SVB. SVB has a license to Dow Jones' VentureSource database, though their license does not permit them to see the identities of firms.
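
If names are left in the data, a join of the kind discussed above might look like the following minimal sketch. The normalization rules, the field names, and the sample records are all hypothetical; nothing here reflects an agreed format for SVB's data or for any of the datasets listed.

```python
import re

# Hypothetical legal suffixes to strip before matching; a real list would be longer.
SUFFIXES = {"inc", "incorporated", "corp", "corporation", "llc", "ltd", "co"}

def normalize_name(name):
    """Lower-case a firm name, drop punctuation and trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def join_on_name(left, right, left_key="name", right_key="name"):
    """Inner-join two lists of dicts on the normalized firm name."""
    index = {}
    for row in right:
        index.setdefault(normalize_name(row[right_key]), []).append(row)
    joined = []
    for row in left:
        for match in index.get(normalize_name(row[left_key]), []):
            merged = dict(match)
            merged.update(row)  # fields from the left-hand row win on conflict
            joined.append(merged)
    return joined

# Made-up records, for illustration only.
svb_rows = [{"name": "Acme Widgets, Inc.", "sales": 120}]
vx_rows = [{"name": "ACME WIDGETS INC", "round": "Series A"}]
joined = join_on_name(svb_rows, vx_rows)
```

Exact matching on a normalized name will miss renamed firms and may conflate distinct firms with similar names, so a numeric identifier linked to names would likely be more reliable where one exists.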

SVB Datasets

SVB has three datasets that they are considering sharing with us in some fashion. These are:

  • The Valuations Data
  • The Benchmarking Data
  • The CapMx data

The Valuations Data

Michael estimated that SVB has 'valuations' data on 2600 early stage firms, with many firms having multiple valuations conducted over time. Internal Revenue Code 409A requires that there be no discrepancy between an option's value and the value of common stock, in part to prevent issues with the backdating of options, and that valuations be conducted by a third party at arm's length from both the 'service provider' (i.e. employee/executive/etc) and the 'service recipient' (i.e. the firm). Thus firms may have approached SVB to provide them with valuations of the firm's common stock potentially every time that there is an event, such as a stock option issue, which would require compliance. SVB has been collecting this data for approximately 4 years.

SVB is interested in (co-)authoring an article on valuation for publication in a (trade) journal. Michael has noticed that option pricing models (i.e. Black-Scholes models) lead to over-valuation of the stock, especially for non-participating preferred stock, and that 'multiple models' (i.e. those that use simple 1x-2x, 2x-3x, 4+ x valuations) yield far more accurate valuations. He would like assistance in exploring this.

The Benchmarking Data

SVB Analytics was looking to answer "what is the next business for SVB?" Their conclusion was that they should launch a benchmarking service for their clients. They have assembled a large proprietary database of financial statements for privately-held, predominantly venture capital-backed, firms in certain sectors (specifically life science, software and cleantech). They estimated that they have one or more financial statements for approximately 50% of the venture capital-backed firms in these sectors that were active in 2010.

A primary service for SVB is lending, whether in the form of loans, credit cards, or other credit, to venture capital backed startups. These loans are essentially guaranteed by the reputation of the investing venture capitalists, and data on these loans was not discussed or offered to us. However, each time that a firm applies for a loan, or has some other credit-event, this triggers a request for a full set of financial statements. The bank has been collecting these statements and is now rolling out a 'benchmarking' service, which allows firms to compare their performance on various financial measures against aggregate data on their 'peers'.

The SVB VentureSource license allows SVB to provide aggregate venture capital data to their clients, and integration of this data into the benchmarking data is being considered by SVB.

The CapMx Data

One core service that the bank offers to its clients is the management of their firm's capital tables. The CapMx database was estimated to have approximately 2000 users, and contains details on the capital structure of the firm including common stock outstanding, preferred stock outstanding, warrants and stock options, employee share option plans, and liquidation preferences. The users of this data are both the firms themselves, and accounting/law firms that work for these firms. The tables are 'initialized by agents' and then accessed by the firms.

The February 4th Meeting (at SVB)


This write-up contains some subjective interpretation of the facts, because I believe that this is useful. However, it is possible that I am in error in my understanding, and where this is relevant I note it with the bracketed moniker [EJE].


From Haas:

  • Ed Egan, PhD Student in BPP

From SVB:

  • Michael Graham, Senior Managing Director of SVB Analytics
  • Dave Krimm, Head of Strategy and Research for SVB Analytics
  • Dan Zaelit, SVB Analytics
  • Jan, a manager partially responsible for the CapMx data entry

Where: SVB San Francisco office - 8th Floor, 555 Mission St, San Francisco.

Background on SVB Analytics

SVB Analytics is a 'non-bank affiliate' of SVB. It runs two for-profit services: the valuation services, and the equity compensation management services (CapMx). It also provides the benchmarking service (which is currently being rolled out [EJE]), and other advisory/industry reporting services, which are run primarily to increase awareness of the bank and its other services [EJE]. Crucially, the SVB Analytics group directly 'controls' the valuation data, whereas the data for the benchmarking service (the financial reports) is generated and maintained elsewhere in the bank. The CapMx data appears to fall between these two, with the R&D group running the underlying databases and SVB Analytics providing the service [EJE].

The organizational hierarchy is: Michael <- Dave <- Dan. Michael has the authority to distribute the data to, or to otherwise engage in a relationship with, Haas [EJE]. Michael may report to Iris Hit-Shagir, the president of SVB Analytics [EJE]. A new individual has been/is being hired, possibly with the title of Director of Technology, to work with Dave, but has not yet started [EJE].

The Benchmarking Data

The bank's loan clients must file either monthly or quarterly financial statements (depending on how often they issue statements) with the bank. These are aggregated into quarterly statements for all loan recipients. From 2004 forward, these statements have had various financial variables extracted from them and stored in a centralized electronic repository. Starting in 2008 the 'granularity' of the data, that is, the number and fineness of financial variables, was dramatically increased. Nevertheless the major variables (Sales, Total Assets, Net Income, etc) are available back to 2004. Expense variables such as Sales and Marketing (S&M) expenses, R&D expenses, and General and Administrative (G&A) expenses are broken out starting in 2008.

Aside from financial statement data, the data also includes:

  • A proprietary but extremely fine-grained industry classification schema that has been mapped to VentureSource's classification schema. The schema has a three level depth: Segment, Area, Niche.
  • The location of the firm: ZIP Code, State and Region.
  • Year founded
  • Stage of development

The data exists in two databases: an operational database, and a development database which is populated by quarterly draws from the operational database. Both databases run on an Oracle platform and the SVB Analytics group uses the Toad client to run SQL statements on them. The development data is validated and cleaned manually by SVB staff, and we could expect draws from this source. The data lives in a series of (flat) tables, with the main table containing the financials keyed as CompanyName-DateOfFinancials, and other tables, such as for the year of founding, keyed by CompanyName.
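
The table layout described above can be sketched as follows. This is only an illustration: SQLite stands in for the Oracle platform, and every table and column name other than CompanyName and DateOfFinancials is a guess, as are the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for the Oracle platform
conn.executescript("""
    -- Hypothetical table and column names, apart from the two key fields.
    CREATE TABLE financials (
        CompanyName      TEXT NOT NULL,
        DateOfFinancials TEXT NOT NULL,
        Sales            REAL,
        PRIMARY KEY (CompanyName, DateOfFinancials)
    );
    CREATE TABLE founding (
        CompanyName TEXT PRIMARY KEY,
        YearFounded INTEGER
    );
""")

# Made-up rows, for illustration only.
conn.executemany("INSERT INTO financials VALUES (?, ?, ?)", [
    ("ExampleCo", "2009-12-31", 1.5),
    ("ExampleCo", "2010-03-31", 1.8),
])
conn.execute("INSERT INTO founding VALUES (?, ?)", ("ExampleCo", 2006))

# Attach the firm-level attribute to every statement for that firm.
rows = conn.execute("""
    SELECT f.CompanyName, f.DateOfFinancials, f.Sales, y.YearFounded
    FROM financials AS f
    LEFT JOIN founding AS y ON y.CompanyName = f.CompanyName
    ORDER BY f.CompanyName, f.DateOfFinancials
""").fetchall()
```

A left join keeps statements for firms that have no entry in the firm-level table, which seems the safer default when coverage of the secondary tables is unknown.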

SVB takes data from the validated and cleaned development database and uploads it onto their online Birst-based web-platform, to provide their benchmarking service. Companies accessing this data through the Birst interface are restricted to selecting aggregate benchmarking portfolios to compare their firm against. The interface allows them to construct portfolios on the basis of geography, industry, comparability in terms of financial ratios, and so forth, and reports the size of the comparison portfolio in bands (5-30 firms, 31-100 firms, and so forth). Firms are restricted to seeing comparison portfolios composed of at least 5 firms. SVB is trying to advance this service into a CEO desktop tool, which will report things like Josh James' Magic Number. This requires fine-grained data as well as uninterrupted sequential financial statements, which is surely good news for researchers going forward.
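
The size restriction and banded reporting described above amount to a simple rule, sketched here. The first two bands and the minimum of 5 firms come from the meeting; the label for portfolios above 100 firms is an assumption, since the larger bands were not stated.

```python
MIN_PORTFOLIO_SIZE = 5  # from the text: at least 5 firms

# The first two bands come from the text; anything larger is reported
# here as "101+ firms", which is an assumption.
BANDS = [(5, 30, "5-30 firms"), (31, 100, "31-100 firms")]

def portfolio_size_label(n_firms):
    """Return the size band to display, or None if the portfolio is too small to show."""
    if n_firms < MIN_PORTFOLIO_SIZE:
        return None
    for low, high, label in BANDS:
        if low <= n_firms <= high:
            return label
    return "101+ firms"
```

Reporting a band rather than an exact count makes it harder for a client to work out an individual peer's figures by adding or removing one firm at a time, which is presumably the point of the restriction.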

The process of uploading the data to Birst is as follows:

  1. Within one month of the quarter end, financial statements are sent to India.
  2. The contractor in India 'converts' the data into electronic format within one month.
  3. SVB Analytics staff clean and validate the data.
  4. Within three months of the quarter end, the data is uploaded to the Birst web-platform.

The first two steps above should be elaborated. The CRM (Customer Relationship Management) team gets the filings and notes the stage of their relationship with the customer on them. The 'Credit Team' (which is partly in the US and partly in India [EJE]) then uses Moody's Risk Analyst to 'spread' the financial statements into a database. Data is then 'reshaped' to clean it up and to remove "prospects" sections and other data that are restricted for contractual reasons; the spreading and the reshaping combined result in a loss of granularity but a gain in standardization. Firms are tagged with a CIF number, which is unique to the firm, as well as a Customer ID. There may be a one-to-many relationship between Customer IDs and firms, as some customers may be responsible for many firms; however, most of these are apparently 1-1. Each statement is then assigned a Statement ID. In addition, the credit team assigns a "risk-stage" code to each statement.

When the SVB Analytics staff get the data back they get a table called "T_Moodys" which contains all of the financials keyed by both CIF and StatementID. They then clean this up further, removing duplicate statements and errors, and add extra codes, including a "development stage code".
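
As a rough illustration of the de-duplication step, the sketch below keeps one row per (CIF, StatementID) pair. How SVB actually identifies duplicates was not discussed, so treating a repeated key as a duplicate is an assumption, and the sample rows and the Sales field are made up.

```python
def drop_duplicate_statements(rows):
    """Keep the first row seen for each (CIF, StatementID) pair."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["CIF"], row["StatementID"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Made-up rows, for illustration only.
t_moodys = [
    {"CIF": 1001, "StatementID": 1, "Sales": 2.0},
    {"CIF": 1001, "StatementID": 1, "Sales": 2.0},  # duplicate of the row above
    {"CIF": 1001, "StatementID": 2, "Sales": 2.4},
]
cleaned = drop_duplicate_statements(t_moodys)
```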

There is a plan in progress to shorten this process and to move to monthly financial data. Specifically, SVB are considering allowing/facilitating input from QuickBooks and other accounting systems, to get electronic data in predetermined formats directly from their firms.

It seems possible that Haas could enter into an agreement to get a feed of this data simultaneous with the upload to the Birst platform [EJE].

Financial statements can also be accumulated by SVB for reasons other than credit issuance. These include the Entrepreneur Services group that matches (apparently with a 20% success rate [EJE]) VCs to Entrepreneurs, as well as the Emerging Technologies Team that examines sectors for growth. At present these financial statements do not make it into the system described above. However, the Emerging Technologies Team does use the development database, and does mark it with potentially useful codes.

At present SVB gets draw downs of data from VentureSource every six months. They use this data to:

  • Validate their own coverage
  • Provide reports to clients (they have 328 charts made every time by a group in India for use in presentations)
  • Potentially to enhance their benchmarking service (though this has not been fully implemented as yet [EJE]).

The CapMx Data

The CapMx data is an electronic collection of firms' capital tables. In theory it contains the fully detailed capital structure for each firm in the database. The bank works with various law firms, including Orrick and Gunderson, who 'stored' this data in Excel sheets, as well as venture capital funds and the companies themselves (who 'stored' this data on paper). The bank responded to this situation by creating and maintaining the CapMx database, to store the data on behalf of their various clients, and to facilitate various transactions on this data.

You can fill out a form to view a demo of the online interface to the CapMx data. This online interface provides users with access to their data, and allows the users to conduct a limited number (and range) of transactions on it.

The database can accommodate data on the authorized capital of the firm, all transactions performed on this capital, individuals/entities associated with this capital and/or these transactions, and details on the firm itself. However, there is little to no incentive for the firms or representatives of the firms to fill out any data beyond that which is required to record and use the capital structure at a particular moment in time.

The original capital structure of the firm is inputted by staff at SVB. There is an 'upload template' to guide the staff in this process, and data is sourced from the certificate of incorporation (or certificate of authorization for later issues). However, the input suffers from a number of issues:

  • There is free text input for the class of stock, so that COMMON, Common, COM, Common1, etc, are all possible ways of recording a Common Stock entry.
  • It is not possible to close out a series of stock
  • There is no standardization on the naming of series (SERIES A, etc), and in the case of recapitalizations or reorganizations it is possible to have two series named say SERIES A and SERIES 1.
  • Preferences may not reflect the most recent series
  • Preferences are often recorded as a static value (taken from the certificate)
  • Likewise, the conversion ratio is recorded as a number, rather than a formula (1:1 conversions are less problematic)
  • There may be issues with the recording of preference priorities, participation, and payment-in-kind clauses
  • It would be highly problematic to record capital structures for firms organized as LLCs or LPs (or anything other than a C-CORP or S-CORP).
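
The free-text problem in the first item above would need some cleaning before analysis. A minimal sketch, assuming only the spellings mentioned in the meeting plus a guess at how preferred stock might be entered:

```python
import re

def normalize_stock_class(raw):
    """Map free-text class entries such as 'COM' or 'Common1' to a canonical label."""
    # Drop digits, spaces and punctuation. Note this would also merge
    # genuinely distinct classes such as 'Common1' and 'Common2'.
    text = re.sub(r"[^a-z]", "", raw.lower())
    if text in {"com", "common", "commonstock"}:
        return "COMMON"
    if "pref" in text:  # assumption: preferred entries contain 'pref'
        return "PREFERRED"
    return "UNKNOWN"
```

Anything mapped to UNKNOWN would need manual review, and the rule set would have to be extended once the real entries can be inspected.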

Records are updated in two ways:

  • At every financing round the bank updates the underlying capital structure
  • By conducting transactions on the underlying structure through the online interface the structure can be automatically updated. (Note that in order to conduct a transaction on the capital structure, the components of the capital structure relevant to the transaction must be up to date. Thus transactions can act to validate the underlying structure data.) A full history of every transaction ever conducted is maintained throughout the firm's lifetime.

Firms may conduct transactions related to:

  • Stock
  • Options
  • Warrants
  • Convertible Promissory Notes (CPN)
  • SPN
  • Rollbacks

Transactions might include:

  • Issuances
  • Redemptions
  • Cancellations
  • Exercisings
  • Repricings

Firms may also examine issues for Rule 701 Compliance, conduct (one-stage) anti-dilution analysis, conduct (one-stage) liquidation preference analysis, compute the firm's fair market value, and build a wide variety of reports on the capital structure or transactions conducted upon the capital structure. (Note that by "one-stage" I mean that the data can be taken from the system and a single hypothetical analysis can be performed, and that it is not possible to temporarily store the changed capital structure that would result from the hypothetical analysis in order to conduct further hypothetical analysis. This does limit the usefulness of the tool with regard to simulation of the capital structure over time, such as for an exit analysis. SVB have recognized this limitation and are considering potential improvements [EJE]). No permanent record is kept of the one-stage analyses, or the reports produced.

The database contains 4032 firm records, of which 924 are 'blanks' which are created when people test or demo the system. There are additional firms named "X - In Conversion", "X - Expensing Only", and "X - Dissolved" (where X is a string) that should be discarded. Thus there are a little under 3000 active firms on the system. Of these only 339 have completed some information on the firm beyond that which is required to maintain and use a record of the firm's capital structure. The database uses its own IDs (separate from the CIF IDs mentioned above), but the firm names are recorded directly from the Articles of Incorporation and should be sufficient to fully identify the firms.
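
The discard rule above could be applied with a filter like the one sketched below. The three suffixes are taken from the meeting; how the 'blank' records can be recognized was not stated, so treating an empty name as a blank is an assumption. For reference, 4032 records less 924 blanks leaves 3108, so a figure of a little under 3000 active firms implies that more than 108 records carry one of the suffixes.

```python
DISCARD_SUFFIXES = (" - In Conversion", " - Expensing Only", " - Dissolved")

def is_active_firm(name):
    """True unless the record is blank or carries one of the discard suffixes."""
    cleaned = name.strip()
    if not cleaned:  # assumption: test/demo 'blanks' have an empty name
        return False
    return not cleaned.endswith(DISCARD_SUFFIXES)

# Made-up names, for illustration only.
names = ["ExampleCo Inc", "OldCo - Dissolved", "", "NewCo - In Conversion"]
active = [n for n in names if is_active_firm(n)]
```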

The database runs on an Oracle platform, but the actual structure of the data in the database was not discernable. The R&D group of SVB is responsible for the operation and maintenance of the underlying data. However, it seems very likely that a SQL script could pull all relevant data into a single flat file for analysis.

The Valuations Data

Providing valuations to clients is a major business task for SVB Analytics. Valuations of portfolio companies by funds are required to be done annually, at arm's length and using 'fair market values', to comply with FAS 157. Likewise, annual valuations are required for the firms themselves to comply with Internal Revenue Code 409A. Valuations are also required whenever there is a 'trigger event', such as the issuance of stock options. However, clearly not all clients of the bank (or all firms in the industry) use SVB's valuation services.

Valuations are done using templated Excel spreadsheets. The sheets have the following elements:

  • The header: States the firm's name, industry and other information
  • Financial Info:
    • Actual - Taken from past annual statements
    • Calculated - Ratios and other calculated variables
    • Forecast - Annual forecasts for the next 1-3 years (typically). These forecasts are certified as accurate by the CEO.
  • Capital Structure of the Firm (may be drawn from CapMx)
  • Public Comparables: Lists comparable publicly held firms. The CEO acknowledges the comparability.
  • Summary of Valuation: Details the methods tried, and which were used (and in which blend). Gives the equity value and hence the price per share.
  • Valuation Methods: Details the calculations of the various valuation methods. Methods include:
    • CVM (Current Value Method), a.k.a. Multiples Analysis (uses M&A Comparables)
    • Net Present Value (NPV) of future discounted cash flows
    • Weighted Average Cost of Capital (WACC)
    • PWERM (Probability Weighted Expected Return Method)
    • OPM (Option Pricing Method)
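
For readers unfamiliar with the discounted cash flow entry in the list above, the textbook calculation is sketched below. This is the generic formula, not SVB's template: the end-of-year timing and the single constant discount rate (which could be a WACC) are assumptions.

```python
def net_present_value(cash_flows, discount_rate):
    """Discount cash flows received at the end of years 1, 2, ... back to today."""
    return sum(cf / (1.0 + discount_rate) ** t
               for t, cf in enumerate(cash_flows, start=1))
```

For example, a single cash flow of 110 received one year from now, discounted at 10%, has a present value of 100.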

At present there are approximately 450 'valuations' in the database, where each valuation was converted from a spreadsheet into a series of entries in database tables. There are approximately another 1500 valuations in the pipeline waiting to be extracted from spreadsheets into the database. It is anticipated that this set will be complete in about 3 months [EJE]. The current valuations are mainly from late 2009 and early 2010. The pipeline contains all valuations performed since 2008. Again, the data input/conversion process relies on outsourcing to India. It is unclear what format the final database will be in, but a SQL-compliant database seems almost a certainty.

A Relationship with SVB

SVB accumulated this data for three reasons:

  1. Helping their clients make better decisions (though they are careful not to breach confidentiality requirements with the data) is a service business to them
  2. To enhance internal decision making, particularly with regard to business risk lending, but also to establish best practices within the bank
  3. To use it as a component of a brand building strategy

SVB believe [EJE] that Haas could contribute to the brand-building strategy by producing (practitioner-relevant) research that would be published in (trade) journals and/or presented at (trade) conferences. It seems possible that we could also contribute to the first two reasons directly. Specifically, PhD students, in the course of accessing and processing data for research, could provide recommendations for commercial applications of the data that are currently unrecognized or unexploited.

A request was made for examples of journals and conferences that SVB would find useful/relevant. It seems likely [EJE] that suitable journals/publications include:

  • Business Valuation Review
  • The Journal of Business Valuation
  • The Valuation Law Review
  • The Journal of Private Equity
  • Inc Magazine
  • The Wall Street Journal

SVB currently envisage the following access protocol:

  1. Haas researchers pitch research projects to SVB personnel
  2. Projects are examined on a case by case basis
  3. Researchers are given on-site data access under an employee style contract to conduct approved research.

Going forward, we should contact Dan with questions regarding data points (I am happy to do this if further information is needed), and probably Michael for faculty-to-bank relationship questions.