Hi All,
I am looking for a solution to this problem using Datastep. Using SAS transform have dataset to want dataset. Output all observations with maximum price by item (do not remove duplicates).
have:
id |
item |
price |
1 |
Apple |
98 |
1 |
Apple |
99 |
1 |
Apple |
100 |
1 |
Apple |
100 |
2 |
Banana |
30 |
2 |
Banana |
45 |
2 |
Banana |
48 |
3 |
Orange |
76 |
3 |
Orange |
78 |
3 |
Orange |
78 |
3 |
Orange |
83 |
3 |
Orange |
83 |
want:
id |
item |
price |
1 |
Apple |
100 |
1 |
Apple |
100 |
2 |
Banana |
48 |
3 |
Orange |
83 |
3 |
Orange |
83 |
Dataset code:
Data have;
input id item $ price;
datalines;
1 Apple 98
1 Apple 99
1 Apple 100
1 Apple 100
2 Banana 30
2 Banana 45
2 Banana 48
3 Orange 76
3 Orange 78
3 Orange 78
3 Orange 83
3 Orange 83
;
run;
proc print; run;
Solution using:
PROC SQL:
proc sql;
create table want as
select * from have as a
where id in (
select id from have as b
group by id
having max(b.price) = a.price
);
quit;
proc print; run;
I am looking for a solution using Datastep.
You could also use Proc Rank.
Data have;
input id item $ price;
datalines;
1 Apple 98
1 Apple 99
1 Apple 100
1 Apple 100
2 Banana 30
2 Banana 45
2 Banana 48
3 Orange 76
3 Orange 78
3 Orange 78
3 Orange 83
3 Orange 83
;
proc rank data=have out=want(where=(rank=1)) ties=dense descending;
by id;
ranks rank;
var price;
run;
proc print data=want;
run;
Why a data step? "Groups of related values" really isn't the main strength of the data step. Data step is more for record by record processing.
It would likely be easier to use a Proc Sort before the data step. Note that "group by" is an implied sort behind the scenes in SQL.
proc sort data=have;
by id item descending price;
run;
data want;
set have;
by id item;
retain temp;
if first.item then temp=price;
if price=temp;
proc print;run;
You could also use Proc Rank.
Data have;
input id item $ price;
datalines;
1 Apple 98
1 Apple 99
1 Apple 100
1 Apple 100
2 Banana 30
2 Banana 45
2 Banana 48
3 Orange 76
3 Orange 78
3 Orange 78
3 Orange 83
3 Orange 83
;
proc rank data=have out=want(where=(rank=1)) ties=dense descending;
by id;
ranks rank;
var price;
run;
proc print data=want;
run;
Use a double DO loop:
data want;
do until (last.item);
set have;
by id item;
maxprice = max(maxprice,price);
end;
do until (last.item);
set have;
by id item;
if price = maxprice then output;
end;
drop maxprice;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.