Changes

Jump to navigation Jump to search
1,068 bytes added ,  12:48, 27 June 2017
iponoduplicates is now the master ipo table that contains unique keys which are concatenated from issuer, issuedate and prinicpalamt. It should contain 9481 lines.
==Cleaning MAsmas table==There were duplicate records in this file because the original text file from SDC contained addresses that were on two rows for some companies. Therefore the easiest way to get rid of these records is to remove lines in textpad and reimport into the mas table. Remove lines in mas textpad using regex: ^#.*\nand replacing with nothing.After reimporting there will still be duplicate rows. Slap an id number on them and take the min as shown below. CREATE TABLE mas1 AS SELECT *, concat(targetname::text, targetstate::text, announceddate::text) AS pkey FROM mas; ALTER TABLE mas1 ADD COLUMN id SERIAL PRIMARY KEY; --114890  DROP TABLE masminid; CREATE TABLE masminid AS SELECT targetname, targetstate, announceddate, MIN(id) FROM mas1 GROUP BY targetname, targetstate, announceddate; --114825  DROP TABLE masnodupes; CREATE TABLE masnodupes AS SELECT mas1.* FROM mas1 JOIN masminid ON mas1.id = masminid.min; \COPY masnodupes TO 'masnodupes.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV masnodupes should have no duplicates based on targetname, targetstate, announceddate.

Navigation menu