I have a data set of the following type:
I want to have a single row per ID in such a way that it picks the maximum value of the status at its first occurrence and the month corresponding to the row that has max value of status.
The values of the remaining variables should be picked from row 1.
The new data set should look like
ID Status Amount Month
125 2 1200 2
127 2 1234 2
130 0 212 5
Thanks!
Thanks!
A datastep works well in this scenario.
data want;
set have;
by id;
retain new_status;
if first.id then do;
new_amount=amount;
new_status=status;
new_month=month;
end;
if status > new_status then do;
new_status=status;
new_month=month;
end;
if last.id then output;
drop status account month;
run;
Here is a SQL approach:
Proc sql;
create table want as
select * from
(select * from have group by id having status=max(status))
group by id, status having month=min(month)
;
quit;
Haikuo
PROC MEAN/SUMMARY has features that are specifically designed for this problem.
borrowed DN's data. the code bellow is inefficient, but it is easy to understand:
data odd;
input id $ status amount month;
cards;
125 1 1200 1
125 2 1300 2
125 1 1300 3
125 2 1300 4
127 0 1234 1
127 2 1000 2
127 2 1001 3
127 0 999 4
130 0 1212 5
130 0 1200 6
;;;;
data temp1(keep=id amount);
set odd;
by id notsorted;
if first.id;
run;
proc sort data=odd out=temp2;
by id descending status month;
data temp3(drop=amount);
set temp2;
by id;
if first.id;
run;
data want;
merge temp1 temp3;
by id;
proc print;run;
Obs id amount status month
1 125 1200 2 2
2 127 1234 2 2
3 130 1212 0 5
Thanks!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.