Hi all,
I created a table and filtered out the rows that have blanks in the columns gvkey, ibtic, and isin. Now I want to delete all the duplicate rows that have identical values in column gvkey. In other words, I would like to keep just the rows with unique values for gvkey.
My code so far:
PROC SQL;
  CREATE TABLE WORK.COMP_DataUS AS
  SELECT gvkey, ibtic, isin, sedol
  FROM COMP.SECURITY;
QUIT; /* PROC SQL runs each statement immediately, so no RUN is needed */
data WORK.COMP_DataUS;
  set COMP_DataUS;
  /* each variable needs its own missing() test */
  where not missing(gvkey) and not missing(ibtic) and not missing(isin);
run; /* Output: 24,936 rows */
This is what my dataset looks like:
Basically I would like to have a table without the rows highlighted in blue (but for the whole table and not just for these two examples).
Thanks in advance for the support.
Best regards
Jorge
I'm hoping this will work for you!
proc sort data=COMP_DataUS nodupkey out=want; /* nodupkey keeps one row per gvkey; nodup only drops fully identical rows */
  by gvkey;
  where gvkey is not missing;
run;
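Untested code. It assumes the data set COMP_DATAUS is sorted by GVKEY. Note that it removes every GVKEY that occurs more than once, so only the GVKEYs that appear exactly once remain.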
proc freq data=comp_dataus;
table gvkey/noprint out=_a_;
run;
data want;
merge comp_dataus _a_;
by gvkey;
if count>1 then delete;
run;
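The question can be read two ways: keep one row per gvkey, or keep only the gvkeys that occur exactly once. Here is a rough, untested sketch that produces both results in a single DATA step using the FIRST. and LAST. BY-group flags. The data set names sorted, keep_first and only_singles are made up for illustration.

```sas
/* Sort by the key so BY-group processing works. */
proc sort data=COMP_DataUS out=sorted;
  by gvkey;
run;

/* keep_first:   one row per gvkey (the first row of each BY group) */
/* only_singles: only the gvkeys that occur exactly once            */
data keep_first only_singles;
  set sorted;
  by gvkey;
  if first.gvkey then output keep_first;
  if first.gvkey and last.gvkey then output only_singles;
run;
```

PROC SORT with the NODUPKEY option should give the same rows as keep_first, since it also keeps the first row of each BY group. Which row counts as "first" depends on the incoming order of the data, so add more BY variables to the sort if you care which duplicate survives.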