Obsidian | Level 7

## Dataset Transformation Problem

Hi All,

I am looking for a solution to this problem using Datastep. Using SAS transform have dataset to want dataset. Output all observations with maximum price by item (do not remove duplicates).

have:

 id item price 1 Apple 98 1 Apple 99 1 Apple 100 1 Apple 100 2 Banana 30 2 Banana 45 2 Banana 48 3 Orange 76 3 Orange 78 3 Orange 78 3 Orange 83 3 Orange 83

want:

 id item price 1 Apple 100 1 Apple 100 2 Banana 48 3 Orange 83 3 Orange 83

Dataset code:

Data have;

input id item \$ price;

datalines;

1 Apple 98

1 Apple 99

1 Apple 100

1 Apple 100

2 Banana 30

2 Banana 45

2 Banana 48

3 Orange 76

3 Orange 78

3 Orange 78

3 Orange 83

3 Orange 83

;

run;

proc print; run;

Solution using:

PROC SQL:

proc sql;

create table want as

select * from have as a

where id in (

select id from have as b

group by id

having max(b.price) = a.price

);

quit;

proc print; run;

I am looking for a solution using Datastep.

1 ACCEPTED SOLUTION

Accepted Solutions
Opal | Level 21

## Re: Dataset Transformation Problem

You could also use Proc Rank.

``````Data have;
input id item \$ price;
datalines;
1 Apple 98
1 Apple 99
1 Apple 100
1 Apple 100
2 Banana 30
2 Banana 45
2 Banana 48
3 Orange 76
3 Orange 78
3 Orange 78
3 Orange 83
3 Orange 83
;
proc rank data=have out=want(where=(rank=1)) ties=dense descending;
by id;
ranks rank;
var price;
run;
proc print data=want;
run;``````

4 REPLIES 4
Super User

## Re: Dataset Transformation Problem

Why a data step? "Groups of related values" really isn't the main strength of the data step. Data step is more for record by record processing.

It would likely be easier to use a Proc Sort before the data step. Note that "group by" is an implied sort behind the scenes in SQL.

Lapis Lazuli | Level 10

## Re: Dataset Transformation Problem

``````proc sort data=have;
by id item descending price;
run;
data want;
set have;
by id item;
retain temp;
if first.item then temp=price;
if price=temp;
proc print;run;
``````
Opal | Level 21

## Re: Dataset Transformation Problem

You could also use Proc Rank.

``````Data have;
input id item \$ price;
datalines;
1 Apple 98
1 Apple 99
1 Apple 100
1 Apple 100
2 Banana 30
2 Banana 45
2 Banana 48
3 Orange 76
3 Orange 78
3 Orange 78
3 Orange 83
3 Orange 83
;
proc rank data=have out=want(where=(rank=1)) ties=dense descending;
by id;
ranks rank;
var price;
run;
proc print data=want;
run;``````

Super User

## Re: Dataset Transformation Problem

Use a double DO loop:

``````data want;
do until (last.item);
set have;
by id item;
maxprice = max(maxprice,price);
end;
do until (last.item);
set have;
by id item;
if price = maxprice then output;
end;
drop maxprice;
run;
``````
Discussion stats
• 4 replies
• 817 views
• 5 likes
• 5 in conversation