Changes

*Replaced AllRealMatchKeysC20Code with ComboKeys_Code20, and renamed the realmatch variable to isreal.
**Note that ComboKeys_Code20 (591,299 with 554,561 synths) is much smaller than AllRealMatchKeysC20Code (1,631,896 with 1,599,427 synths). The older table allowed (almost) any other firm from RLJoinerFF that had done a deal in that code20-year, whereas ComboKeys_Code20 only allows another real match from the same code20-year.
*Rebuilt the Super tables (PortCoSuper, DealSuper, FirmSuper). Note that FirmSuper is now restricted to US firms only. Matches were already US-US, because MatchMostNumerous was constrained to US firms and portcos only (with state != 'UN'). However, it is not clear that this was true in the past.
*Rebuilt the AllMatchc20 tables. New names are Combo..._Code20.
**Crucial difference: Only investments within ComboKeys_Code20 are included in the history counts. Previously, anything in RLJoinerFF or even RoundlineBase was included, which created the impression of overcounts.
**The vast majority of missing distances were caused by missing firm addresses for just 391 VCs (some of these were state UN, and were later removed). However, we have zip codes for almost all (?) of them. I built firmbogeoplus (see [[VCDB20]]) to add in ZCTA centroids from the U.S. Census gazetteer where available. However, 47 zips weren't in the ZCTA lookup, and they account for 113 firms that participated in 3,997 pairs (real and synth). So I also ran GeocodeOneKey.py with a (no header) Zip\tZip input (MissingFirmZips.txt), manually added three records (02801, 85292, 91399), and loaded the result as zipgeoaddon (from MissingFirmZips-Geocoded.txt) in Load.sql.
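
The zip fallback described in the last item can be sketched in Python as follows. This is only an illustration of the lookup order (gazetteer centroid first, geocoded add-on second), not the actual firmbogeoplus build or the code in Load.sql. The function names, the three-column zip/lat/lon layout, and every coordinate value are assumptions made for the example; the coordinates are placeholders, not real locations.

```python
import csv
import io


def load_centroids(tsv_text):
    """Parse headerless tab-separated rows of zip, lat, lon into a dict.

    The three-column layout is an assumption for this sketch.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return {row[0]: (float(row[1]), float(row[2])) for row in reader if row}


def lookup_centroid(zip_code, zcta, addon):
    """Prefer the gazetteer (ZCTA) centroid; fall back to the geocoded add-on.

    Returns None when neither source has the zip.
    """
    if zip_code in zcta:
        return zcta[zip_code]
    return addon.get(zip_code)


def write_zip_input(zips):
    """Build a headerless Zip<TAB>Zip file body for the zips still missing."""
    return "".join(f"{z}\t{z}\n" for z in sorted(set(zips)))


# Placeholder data: the coordinates below are not real locations.
zcta = load_centroids("10001\t1.0\t2.0\n")
addon = load_centroids("02801\t3.0\t4.0\n")

missing = [z for z in ["10001", "02801", "85292"] if z not in zcta]
print(write_zip_input(missing), end="")
print(lookup_centroid("02801", zcta, addon))
```

Keeping zip codes as strings matters here, because a code such as 02801 loses its leading zero if it is parsed as a number.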
