Hi all,
I created a table and filtered out the rows that have blanks in the columns gvkey, ibtic, and isin. Now I want to delete all the duplicate rows that have identical data in the column gvkey. In other words, I would like to keep just the rows with unique values for gvkey.
My code so far:
PROC SQL;
  CREATE TABLE WORK.COMP_DataUS AS
  SELECT gvkey, ibtic, isin, sedol
  FROM COMP.SECURITY;
QUIT; /* PROC SQL runs each statement immediately, so no RUN is needed */
data WORK.COMP_DataUS;
  set COMP_DataUS;
  /* Keep only rows where all three identifiers are populated */
  where not missing(gvkey) and not missing(ibtic) and not missing(isin);
run; /* Output: 24,936 rows */
This is what my dataset looks like:
Basically I would like to have a table without the blue columns (but for the whole table and not just for these two examples).
Thanks in advance for the support.
Best regards
Jorge
I hope this works for you!
/* NODUPKEY keeps one row per gvkey; NODUP would only drop rows that are identical in every variable */
proc sort data=COMP_DataUS nodupkey out=want;
  by gvkey;
  where gvkey is not missing;
run;
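For what it's worth, the two PROC SORT options behave differently: NODUPKEY keeps one row per BY value, while NODUPRECS (alias NODUP) only drops a row that repeats the previous row in every variable. A minimal sketch on toy data follows; the data set names have, one_per_key, and no_identical_rows and all values are made up for illustration and are not taken from COMP.SECURITY.

```sas
/* Toy data: hypothetical values, for illustration only */
data have;
  input gvkey $ ibtic $ isin $;
  datalines;
001 AAA US0001
001 AAB US0002
002 BBB US0003
003 CCC US0004
003 CCC US0004
;
run;

/* NODUPKEY: keeps the first row of each gvkey BY group */
proc sort data=have out=one_per_key nodupkey;
  by gvkey;
run;

/* NODUPRECS: drops a row only when it matches the previous row in every variable */
proc sort data=have out=no_identical_rows noduprecs;
  by gvkey;
run;
```

On this toy input, one_per_key should end up with one row per gvkey, while no_identical_rows still contains both rows for gvkey 001 because they differ in ibtic and isin.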
UNTESTED CODE
Assumes the data set COMP_DATAUS is sorted by GVKEY
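/* Count how many rows each gvkey has */
proc freq data=comp_dataus;
  tables gvkey / noprint out=_a_;
run;
/* Attach the counts and delete every gvkey that occurs more than once */
data want (drop=count percent);
  merge comp_dataus _a_;
  by gvkey;
  if count > 1 then delete;
run;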
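If sorting first is inconvenient, the same idea (keep only the gvkey values that occur exactly once) could probably also be written in PROC SQL, which does not need sorted input. This is an untested sketch; the output name want_sql is just a placeholder.

```sas
proc sql;
  create table want_sql as
  select *
  from comp_dataus
  group by gvkey
  having count(*) = 1;  /* keep only gvkeys that appear on a single row */
quit;
```

SAS will likely write a note to the log about remerging summary statistics with the original data; that is expected for this kind of query. If the goal is instead to keep one row for every gvkey, including the repeated ones, this is not the right step and a PROC SORT with NODUPKEY is the simpler choice.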