for rows age >= 24 months, sort by AGE and KEY, keep record closest to Month 24. if mutiple KEYS at same time point, keep highest KEY.
ID | A | B | Key | D | Age |
103 | 53 | 1 | 0 | 88 | 2 |
103 | 19 | 1 | 0 | 88 | 2 |
103 | 53 | 2 | 1 | 88 | 9 |
103 | 53 | 4 | 5 | 88 | 17 |
103 | 19 | 3 | 5 | 88 | 17 |
103 | 97 | 3 | 7 | 88 | 48 |
103 | 97 | 4 | 9 | 88 | 60 |
103 | 97 | 6 | 9 | 88 | 96 |
103 | 105 | 1 | 9 | 88 | 108 |
103 | 97 | 7 | 9 | 88 | 132 |
103 | 97 | 8 | 9 | 88 | 156 |
Yeah, missed that.
So we sort with a WHERE first:
proc sort
data=have (where=(age ge 24))
out=sorted
;
by id age descending key;
run;
data want;
set sorted;
by id;
if first.id;
run;
What does your desired result look like here?
First, I'd build a value that is
diff = abs(24 - age)
;
Then, I would sort
by id descending diff descending key;
and, in a final data step, do
by id;
if first.id;
@PeterClemmensen @Kurt_Bremser
I have picked up a better example, use this data pls. (this applies to records >=24 only )
There are two rows with age 48, however I should keep one record close to 24 (so both rows with age 48 satisfy this condition) but then I have multiple keys (7,9) at same time point (48), so I need to keep the record with highest key value.
if its same key at different time points like 48 and 96 it doesnt really matter as I am going to keep the closest to 24 which still has the same key as the next.
ID | A | B | Key | D | Age |
103 | 53 | 1 | 0 | 88 | 2 |
103 | 19 | 1 | 0 | 88 | 2 |
103 | 53 | 2 | 1 | 88 | 9 |
103 | 53 | 4 | 5 | 88 | 17 |
103 | 19 | 3 | 5 | 88 | 17 |
103 | 97 | 3 | 7 | 88 | 48 |
103 | 97 | 4 | 9 | 88 | 48 |
103 | 97 | 6 | 9 | 88 | 96 |
103 | 105 | 1 | 9 | 88 | 108 |
103 | 97 | 7 | 9 | 88 | 132 |
103 | 97 | 8 | 9 | 88 | 156 |
Those 48 values are irrelevant, as 17 is closer to 24.
Yeah, missed that.
So we sort with a WHERE first:
proc sort
data=have (where=(age ge 24))
out=sorted
;
by id age descending key;
run;
data want;
set sorted;
by id;
if first.id;
run;
Please post your source data in usable form (data step with datalines, do not skip this!), the code you ran, and indicate where the result does not meet your expectations.
Thank you very much.
This logic can actually be applied in either case if records < 24 or > 24 just by tweaking it a bit.
Thank you again.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.