Hi all,
I created a table and filtered out the rows that have blanks in the columns gvkey, ibtic, or isin. Now I want to delete all the duplicate rows that have identical values in the column gvkey. In other words, I would like to keep just the rows with unique values for gvkey.
My code so far:
PROC SQL;
  CREATE TABLE WORK.COMP_DataUS AS
  SELECT gvkey , ibtic , isin , sedol
  FROM COMP.SECURITY;
QUIT;
data WORK.COMP_DataUS;
  set COMP_DataUS;
  /* keep only rows where all three identifiers are present */
  where not missing(gvkey) and not missing(ibtic) and not missing(isin);
run; /* Output: 24,936 rows */
This is what my dataset looks like:
Basically I would like to have a table without the blue columns (but for the whole table and not just for these two examples).
Thanks in advance for the support.
Best regards
Jorge
I'm hoping this will work for you!
proc sort data=COMP_DataUS nodup out=want;
  by gvkey;
  where gvkey is not missing;
run;
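A note on the NODUP option in the step above: as far as I know, NODUP is an alias for NODUPRECS, which only drops rows that are identical in every variable. If the goal is one row per gvkey regardless of what is in ibtic, isin, and sedol, NODUPKEY is the option that compares only the BY variables. A minimal sketch, assuming the data set name from the question; the output names want and dups are placeholders:

```sas
/* Keep one row per gvkey. Which row survives depends on the input order. */
/* DUPOUT= writes the discarded rows to a separate data set for review.  */
proc sort data=COMP_DataUS out=want nodupkey dupout=dups;
  by gvkey;
  where not missing(gvkey);
run;
```

Checking dups afterwards shows which rows were removed, which helps confirm that dropping them is really what you want.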
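UNTESTED CODE
Assumes the data set COMP_DATAUS is sorted by GVKEY.
/* count how many rows each gvkey has */
proc freq data=comp_dataus;
  table gvkey / noprint out=_a_;
run;
/* attach the counts and drop every gvkey that occurs more than once */
data want;
  merge comp_dataus _a_;
  by gvkey;
  if count>1 then delete;
run;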
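If the data set is not already sorted by gvkey, the same idea (keep only the gvkey values that occur exactly once, and drop every row of a repeated gvkey) can probably be written in PROC SQL without a sort. This is an untested sketch under that assumption; the table name want is a placeholder:

```sas
/* Keep only rows whose gvkey appears exactly once in the table.      */
/* SAS remerges the group count back onto the detail rows and writes  */
/* a note about remerging to the log; that note is expected here.     */
proc sql;
  create table want as
  select *
  from COMP_DataUS
  group by gvkey
  having count(*) = 1;
quit;
```

Unlike the PROC SORT approach, this removes all rows of a duplicated gvkey rather than keeping one of them, so pick whichever matches what you need.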