Hello team,
I have a project that I need the most recent date. How can I use proc sort nodupkey take care of this matter?
What other approaches we have for keep only the most recent date?
If our date to be character date, does proc sort sort our data?
Your help is greatly appreciated.
blue blue
@GN0001 IF you would have posted sample data with the date variable containing a proper SAS date value then I assume the discussion here would have been much shorter/quicker.
IF you've got a date variable that sorts properly then a double Proc Sort is one way to go.
The first Proc Sort to order your data with the newest date per member id on top, the 2nd Proc Sort with a NODUPKEY to then pick the first row per member id (same that what a data step if first.member_id would do).
data have;
input MemberID $ date1:ddmmyy10.;
format date1 date9.;
datalines;
1 04/02/2021
1 04/03/2021
1 05/03/2019
;
proc sort data=have out=want;
by MemberID DESCENDING date1;
run;
proc sort nodupkey out=want;
by MemberID;
run;
Character dates in SAS are just text strings so they will sort in alphabetical order. You need proper SAS dates which are numeric variables with SAS date formats applied if you want your data to sort correctly in date order.
Please provide a sample of your data if you want to confirm your date selection rules. What happens with duplicate dates? Which date record will you choose?
@SASKiwi wrote:
Character dates in SAS are just text strings so they will sort in alphabetical order. You need proper SAS dates which are numeric variables with SAS date formats applied if you want your data to sort correctly in date order.
Please provide a sample of your data if you want to confirm your date selection rules. What happens with duplicate dates? Which date record will you choose?
@SASKiwi I think you meant "calendar order" or "chronological order" as they will sort in alphabetical order which often does not match calendar order.
Character "dates" in ISO8601 format would at least sort correctly, but (@GN0001) they are still useless for any type of calculation.
Hello team member,
This is my code:
Data test1; Input MemberID $ date1; datalines 1 04/02/2021 1 04/03/2021 1 05/03/2019 ; Run;
I want to use proc sort to keep one meberID with the max date. If I sort the data with proc sort with nodupkey, how can I get to the point that I can keep one memberId with Max Date?
Please let me know.
Respectfully,
blue blue
@GN0001 wrote:
Hello team member,
This is my code:
Data test1; Input MemberID $ date1; datalines 1 04/02/2021 1 04/03/2021 1 05/03/2019 ; Run;I want to use proc sort to keep one meberID with the max date. If I sort the data with proc sort with nodupkey, how can I get to the point that I can keep one memberId with Max Date?
Please let me know.
Respectfully,
blue blue
Run this code, inspect the log, and fix it, as a simple first lesson.
The first step, though, is to fix the text date to a number.
you can use proc sort with nodupkey and last. variable.
data have;/* sample data */
do id=5 to 1 by -1;
do i=20010 to 20000 by -1;
dt=i+int(ranuni(i)*100);
output;
end;
end;
format dt yymmdd10.;
drop i;
run;
proc sort data=have nodupkey;
by id dt;
run;
data want;
set have;
by id dt;
if last.id;
run;
or
Use proc sql
proc sql;
select * from have as a
where dt=(select MAX(dt) from have as b
WHERE a.id=b.id);
quit;
Of course they are dates in SAS. Dates are counts of days, with 1960-01-01 as day zero. Do this for reference:
data _null_;
date = today(),
put date=;
run;
and look at the log.
SORT on its own can't give you the max value, you need a follow-up step that selects this observation.
This syntax didn't work:
A subquery can not select more than one column... usually a subquery is used incorrectly in an expression
proc sql; select * from have as a where dt=(select MAX(dt) from have as b WHERE a.id=b.id); quit;
How can I fix this issue?
Regards,
Blueblue
Character dates are useless, you have to convert them first.
Do you you only want the date for later use, or the observation(s) with the latest date?
Hello,
I need a unique observation with a max date?
The date is already a number data type. I posted a dataline statement that shows date is a number. I do understand the date as a character is useless.
Respectfully,
Blueblue
Sort by descending date, and keep the first observation in a data step.
Or run a PROC SQL with
having date = max(date)
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.