BookmarkSubscribeRSS Feed
yellowredyellow
Calcite | Level 5

I have dataset in the form

DATA details;
input id doj$;
datalines;
1 11/06/2020
1 18/07/2020
1 18/07/2020
2 12/06/2020
2 19/06/2020
3 13/06/2020
4 14/06/2020
5 15/06/2020
5 15/06/2020
;


Run;

(this dataset is sorted by ID and doj).

I would like an output table that gives:

IDdoj
118/07/2020
118/07/2020
219/06/2020
313/06/2020
414/06/2020
515/06/2020
515/06/2020

 

It looks at each unique ID, and then takes the latest one based on "doj" and takes multiple records if the "latest" is repeated. This is a simple example, but there can be more columns in the source table, and also many more dates for each id.

3 REPLIES 3
SASKiwi
PROC Star
DATA details;
input @1 id $2. @3 doj ddmmyy10. ;
format doj ddmmyy10.;
datalines;
1 11/06/2020
1 18/07/2020
1 18/07/2020
2 12/06/2020
2 19/06/2020
3 13/06/2020
4 14/06/2020
5 15/06/2020
5 15/06/2020
;
Run;

proc sql;
  create table Want as
  select A.*
  from details as A
  inner join
  (select  id
          ,max(doj) as doj_max
   from details
   group by id
  ) as B
  on A.id = B.id
  and A.doj = B.doj_max
  ;
run; 
Kurt_Bremser
Super User

Do not store dates as character strings, store them as SAS date values.

DATA details;
input id doj :ddmmyy10.;
format doj yymmdd10.;
datalines;
1 11/06/2020
1 18/07/2020
1 18/07/2020
2 12/06/2020
2 19/06/2020
3 13/06/2020
4 14/06/2020
5 15/06/2020
5 15/06/2020
;

data want;
do until (last.id);
  set details;
  by id;
  maxdate = max(doj,maxdate);
end;
do until (last.id);
  set details;
  by id;
  if doj = maxdate then output;
end;
drop maxdate;
run;
ballardw
Super User

@yellowredyellow wrote:

I have dataset in the form

DATA details;
input id doj$;
datalines;
1 11/06/2020
1 18/07/2020
1 18/07/2020
2 12/06/2020
2 19/06/2020
3 13/06/2020
4 14/06/2020
5 15/06/2020
5 15/06/2020
;


Run;

(this dataset is sorted by ID and doj).

I would like an output table that gives:

ID doj
1 18/07/2020
1 18/07/2020
2 19/06/2020
3 13/06/2020
4 14/06/2020
5 15/06/2020
5 15/06/2020

 

It looks at each unique ID, and then takes the latest one based on "doj" and takes multiple records if the "latest" is repeated. This is a simple example, but there can be more columns in the source table, and also many more dates for each id.


Did you look at the result from running your example data step code? Proc print for that data set yields:

id doj
1 11/06/20
1 18/07/20
1 18/07/20
2 12/06/20
2 19/06/20
3 13/06/20
4 14/06/20
5 15/06/20
5 15/06/20

 

So, not only are you using CHARACTER values for something that should be a SAS date value, you don't even read the entire value so there is no way, with your data step, to generate the desired output as you have lost the last to characters of the year.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 897 views
  • 0 likes
  • 4 in conversation