Extract observations if it's the latest date

yellowredyellow · Posted 02-21-2023 10:15 PM

I have dataset in the form

DATA details;
input id doj$;
datalines;
1 11/06/2020
1 18/07/2020
1 18/07/2020
2 12/06/2020
2 19/06/2020
3 13/06/2020
4 14/06/2020
5 15/06/2020
5 15/06/2020
;


Run;

(this dataset is sorted by ID and doj).

I would like an output table that gives:

ID	doj
1	18/07/2020
1	18/07/2020
2	19/06/2020
3	13/06/2020
4	14/06/2020
5	15/06/2020
5	15/06/2020

It looks at each unique ID, and then takes the latest one based on "doj" and takes multiple records if the "latest" is repeated. This is a simple example, but there can be more columns in the source table, and also many more dates for each id.

SASKiwi · Posted 02-21-2023 10:31 PM

DATA details;
input @1 id $2. @3 doj ddmmyy10. ;
format doj ddmmyy10.;
datalines;
1 11/06/2020
1 18/07/2020
1 18/07/2020
2 12/06/2020
2 19/06/2020
3 13/06/2020
4 14/06/2020
5 15/06/2020
5 15/06/2020
;
Run;

proc sql;
  create table Want as
  select A.*
  from details as A
  inner join
  (select  id
          ,max(doj) as doj_max
   from details
   group by id
  ) as B
  on A.id = B.id
  and A.doj = B.doj_max
  ;
run;

Kurt_Bremser · Posted 02-22-2023 01:15 AM

Do not store dates as character strings, store them as SAS date values.

DATA details;
input id doj :ddmmyy10.;
format doj yymmdd10.;
datalines;
1 11/06/2020
1 18/07/2020
1 18/07/2020
2 12/06/2020
2 19/06/2020
3 13/06/2020
4 14/06/2020
5 15/06/2020
5 15/06/2020
;

data want;
do until (last.id);
  set details;
  by id;
  maxdate = max(doj,maxdate);
end;
do until (last.id);
  set details;
  by id;
  if doj = maxdate then output;
end;
drop maxdate;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

ballardw · Posted 02-22-2023 04:57 AM

@yellowredyellow wrote:

I have dataset in the form
DATA details;
input id doj$;
datalines;
1 11/06/2020
1 18/07/2020
1 18/07/2020
2 12/06/2020
2 19/06/2020
3 13/06/2020
4 14/06/2020
5 15/06/2020
5 15/06/2020
;


Run;
(this dataset is sorted by ID and doj).

I would like an output table that gives:

ID doj

1 18/07/2020

1 18/07/2020

2 19/06/2020

3 13/06/2020

4 14/06/2020

5 15/06/2020

5 15/06/2020

It looks at each unique ID, and then takes the latest one based on "doj" and takes multiple records if the "latest" is repeated. This is a simple example, but there can be more columns in the source table, and also many more dates for each id.

Did you look at the result from running your example data step code? Proc print for that data set yields:

id	doj
1	11/06/20
1	18/07/20
1	18/07/20
2	12/06/20
2	19/06/20
3	13/06/20
4	14/06/20
5	15/06/20
5	15/06/20

So, not only are you using CHARACTER values for something that should be a SAS date value, you don't even read the entire value so there is no way, with your data step, to generate the desired output as you have lost the last to characters of the year.

Extract observations if it's the latest date

Re: Extract observations if it's the latest date

Re: Extract observations if it's the latest date

Re: Extract observations if it's the latest date

Registration is open

Extract observations if it's the latest date

Re: Extract observations if it's the latest date

Re: Extract observations if it's the latest date

Re: Extract observations if it's the latest date

Registration is open

SAS Training: Just a Click Away