Hi, I am new to sas and I would like some help. I need to delete data that's outdated for the same user. I wanted to keep the record that the user is registered/most recent data. below is a sample data
Userid UserName Registered effective_date expiration_date
123 Mary Jane Y 1/1/2009 1/1/2014
123 Mary Jane N 1/1/2008 1/1/2014
234 John Doe Y 3/21/2010 1/1/2015
245 Jane Doe N 4/1/2007 1/1/2014
653 John Smith N 8/5/2010 1/1/2014
653 John Smith Y 9/14/2011 1/1/2014
900 Mike Adams N 1/1/2012 1/1/2015
This is the desired results
Userid UserName Registered effective_date expiration_date
123 Mary Jane Y 1/1/2009 1/1/2014
234 John Doe Y 3/21/2010 1/1/2015
245 Jane Doe N 4/1/2007 1/1/2014
653 John Smith Y 9/14/2011 1/1/2014
900 Mike Adams N 1/1/2012 1/1/2015
Thanks.
Step one: sort by userid and descending expiration_date
Step two: proc sort nodupkey by userid
Step one: sort by userid and descending expiration_date
Step two: proc sort nodupkey by userid
Just to clarify further since you are new to SAS and I think effective_date may be what you want in the first sort:
/*Sort the data in the right order */
Proc sort data=have ; by userid descending effective_date; run;
/*Allows proc sort to eliminate the duplicates and keeps the first row it finds (which is why you needed to sort it to begin with)*/
/** this sorts the above data and creates two new datasets: one with the deduped results (have2) and one with the rows that were removed (dups)**/
proc sort data=have out=have2 dup=dups nodupkey; by userid; run;
Hope this helps!
EJ
To keep it simple, based on the data you have provided try the following:
DATA HAVE;
LENGTH USERNAME $10;
INFILE DATALINES DLM=',';
INPUT USERID USERNAME $ REGISTERED $ EFFECTIVE_DATE:MMDDYY8. EXPIRATION_DATE:MMDDYY8.;
FORMAT EFFECTIVE_DATE EXPIRATION_DATE DATE9.;
DATALINES;
123,MARY JANE,Y,1/1/2009,1/1/2014
123,MARY JANE,N,1/1/2008,1/1/2014
234,JOHN DOE,Y,3/21/2010,1/1/2015
245,JANE DOE,N,4/1/2007,1/1/2014
653,JOHN SMITH,N,8/5/2010,1/1/2014
653,JOHN SMITH,Y,9/14/2011,1/1/2014
900,MIKE ADAMS,N,1/1/2012,1/1/2015
;
RUN;
PROC SORT DATA=HAVE OUT=SORTHAVE;
BY USERID EFFECTIVE_DATE;
RUN;
DATA WANT;
SET SORTHAVE;
BY USERID EFFECTIVE_DATE;
IF LAST.USERID;
RUN;
Thank you all for your help!
hi ... though you are a beginner, you can venture into PROC SQL ...
proc sql;
create table new as
select * from old
group userid
having effective_date = max(effective_date);
quit;
Should be group by userid
hi ... GROUP works without the BY (as does ORDER)
I suppose it might not be good practice to leave out the BY
Hi Mike,
It is good to know. Thank you!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.