Changes

Jump to navigation Jump to search
1,294 bytes added ,  17:51, 6 August 2018
Since the count of firmbasecore and the DISTINCT query are the same, the firm table is now clean.
 
===Cleaning Branch Offices===
When cleaning the branch offices, I had to remove all duplicates in the table. This is because the table is so sparse that often the only data in a row would be the fund name the branch was associated with. Thus, I couldn't filter based on dates as I had been doing previously for firms and funds. The primary key is firm name.
 
DROP TABLE bonound;
CREATE TABLE bonound AS
SELECT *, CASE WHEN firmname LIKE '%Undisclosed Firm%' THEN 1::int ELSE 0::int END AS undisclosedflag
FROM branchoffices;
--10353
 
SELECT COUNT(*) FROM(SELECT DISTINCT firmname FROM bonound)a;
--10042
 
Since these counts aren't the same, we will have to work a little more to clean the table. As stated above, I did this by excluding the firm names that were duplicated.
 
DROP TABLE branchofficecore;
CREATE TABLE branchofficecore AS
SELECT A.* FROM bonound AS A JOIN (
SELECT bonound.firmname, COUNT(*) FROM bonound GROUP BY firmname
HAVING COUNT(*) =1
) AS B
ON A.firmname=B.firmname WHERE undisclosedflag=0;
--10032
 
SELECT COUNT(*) FROM (SELECT DISTINCT firmname FROM branchofficecore)a;
--10032
 
Since these counts are the same, we are good to go. The count is 10 lower because we completely removed 10 firmnames from the listing by throwing out the duplicates.
==Instructions on Matching PortCos to Issuers and M&As From Ed==
158

edits

Navigation menu