BookmarkSubscribeRSS Feed
newuser2022
Calcite | Level 5

I have a big dataset and which contain id's and daily values for it for a few years. I want to select for each id's the maximum day for each month for this few years. 

ID DATE

2  01JAN202

2  02JAN2020

2  03JAN2020

2  22FEB2020

2  24FEB2020

3  01JAN2020

3 02JAN2020

3 22FEB2020

3 24FEB2020

4 01JAN2020

4 02JAN2020

4 22FEB2021

4 23FEB2021

4 25MAR2021

4 26MAR2021

4 27MAR2021

 

For the above sample I need to get the data as below:

2  02JAN2020

2  24FEB2020

3 02JAN2020

3 24FEB2020

4 02JAN2020

4 23FEB2021

4 27MAR2021

 

 

6 REPLIES 6
PaigeMiller
Diamond | Level 26

Assuming the data are numeric and not character (are they? you didn't say) this is a job for PROC SUMMARY

 

proc summary data=have nway;
    class id date;
    format date monyy7.;
    var date;
    output out=want max= ;
run;
proc datasets library=work nolist;
    modify want;
    format date date9.;
run;
--
Paige Miller
PaigeMiller
Diamond | Level 26

@Kurt_Bremser wrote:
proc sql;
create table want as
  select id, date
  from have
  group by id, month(date)
  having date = max(date)
;
quit;

Doesn't this fail if you have dates in January (or any month) in two different years?

--
Paige Miller
mkeintz
PROC Star

Apparently your data is sorted by ID/DATE, and you want the latest date in each month.

Then:

 

data have;
  input ID DATE :date9. ;
  format date date9. ;
datalines;
2  01JAN2020
2  02JAN2020
2  03JAN2020
2  22FEB2020
2  24FEB2020
3  01JAN2020
3 02JAN2020
3 22FEB2020
3 24FEB2020
4 01JAN2020
4 02JAN2020
4 22FEB2021
4 23FEB2021
4 25MAR2021
4 26MAR2021
4 27MAR2021
run;
data want (drop=nxt_date);
  set have (keep=id);
  by id;
  merge have    have(firstobs=2 keep=date rename=(date=nxt_date));
  if last.id=1 or intck('month',date,nxt_date)>0;
run;

 

The self-merge in the merge statement provides a way to look ahead one observation (using the FIRSTOBS=2 option), and compare the current date to the following date (nxt_date).

 

BTW, you included a 03jan2020 in your data for ID=1.  So shouldn't that be the expected value for your result?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Patrick
Opal | Level 21

Here a SAS SQL option.

data have;
 input ID DATE:date9.;
 format date date9.;
 datalines;
2 01JAN2020
2 02JAN2020
2 03JAN2020
2 22FEB2020
2 24FEB2020
3 01JAN2020
3 02JAN2020
3 22FEB2020
3 24FEB2020
4 01JAN2020
4 02JAN2020
4 22FEB2021
4 23FEB2021
4 25MAR2021
4 26MAR2021
4 27MAR2021
;

proc sql;
  create table want as
  select id, max(date) as date format=date9.
  from have
  group by id, intnx('month',date,0,'b')
  ;
quit;

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1168 views
  • 0 likes
  • 5 in conversation