Hi,
My requirement is i want to display max1 and max2 values from each group using grouping in datastep. Here is my dataset:
data devil;
infile datalines dlm=' ' dsd;
input pincode order num_total ;
datalines;
12345 1 99
23451 1 1
34512 1 10
45123 2 10
51234 2 13
67890 2 10
78901 3 11
89012 3 10
90123 3 20
;
run;
Expected output:
pincode order num_total
12345 1 99
34512 1 10
51234 2 13
45123 2 10
67890 2 10
90123 3 20
78901 3 11
Here is one way. But why do you want to use a data step?
data devil;
infile datalines dlm=' ' dsd;
input pincode order num_total ;
datalines;
12345 1 99
23451 1 1
34512 1 10
45123 2 10
51234 2 13
67890 2 10
78901 3 11
89012 3 10
90123 3 20
;
run;
proc rank data=devil descending out=temp ties=dense;
by order;
var num_total;
ranks rank;
run;
data want(drop=rank);
set temp;
where rank in (1,2);
run;
proc sort data=devil;
by order descending num_total;
run;
data want;
set devil;
by order;
if first.order then seq=0;
seq+1;
if seq<=2;
run;
This may or may not handle ties the way you want, but you didn't really say what you want to do in the case of ties.
Here is one way. But why do you want to use a data step?
data devil;
infile datalines dlm=' ' dsd;
input pincode order num_total ;
datalines;
12345 1 99
23451 1 1
34512 1 10
45123 2 10
51234 2 13
67890 2 10
78901 3 11
89012 3 10
90123 3 20
;
run;
proc rank data=devil descending out=temp ties=dense;
by order;
var num_total;
ranks rank;
run;
data want(drop=rank);
set temp;
where rank in (1,2);
run;
Another Data Step Version. This solution can work for more than 3 observations in each Group. Sorting is made to sort by ORDER and in descending order of NUM_TOTAL. It leads to choosing the first two observations giving the First Max and Second Max.
data devil; input pincode order num_total ; datalines; 12345 1 99 23451 1 1 34512 1 10 45123 2 10 51234 2 13 67890 2 10 78901 3 11 89012 3 10 90123 3 20 ; run; proc sort data = devil out = temp; by order descending num_total ; run; data want; do i = 1 by 1 until(last.order); set temp; by order; if i in(1,2) then output; end; drop i; run; Obs pincode order num_total 1 12345 1 99 2 34512 1 10 3 51234 2 13 4 45123 2 10 5 90123 3 20 6 78901 3 11
Hi,
I think that to handle the "order=2" case you have to change the datastep a little bit:
data want;
_tmp_ = 0;
do i = 1 by 1 until(last.order);
set temp;
by order descending num_total;
_tmp_ + last.num_total;
put _all_;
if _tmp_ = 2 then _leave_ = i;
end;
put;
put _all_;
do i = 1 by 1 until(last.order);
set temp;
by order;
if i <= _leave_ then output;
end;
drop i _tmp_ _leave_;
run;
all the best
Bart
Hi Bart,
Your suggestion gets 3 rows for ORDER = 2. In this case, we get MAX1 = 13 and MAX2 = 10. The next value which is MAX3, is also 10.
The OP is interested to get MAX1 and MAX2. Hence I stopped with the 2 values for each order.
If I missing to see what you are telling, kindly explain.
Regards,
DataSP
Hi DataSP,
I suggested the extension looking at OP's expected result. It looks like OP wants all results in case of ties.
all the best
Bart
Hi Bart,
Somehow I missed OP's requirement for ties. The OP may use your suggested code for his requirements.
I tried to use one DO-LOOP to achieve such a result. So I have modified test data set to check for tie-conditions.
data devil; input pincode order num_total ; datalines; 12345 1 99 23451 1 1 34512 1 10 45123 2 10 51234 2 13 67890 2 10 78901 3 15 89012 3 15 90123 3 10 90123 3 10 90123 3 5 ; run; proc sort data = devil out = temp; by order descending num_total ; run; data want; t = 0; do until(last.order); set temp; by order; if t = 0 then do; t = num_total; count = 1; output; end; else if t = num_total then output; else if t > num_total then do; t = num_total; if count > 2 then leave; else if count = 1 then do; count + 1; output; end; end; end; drop count t; run; Obs pincode order num_total 1 12345 1 99 2 34512 1 10 3 51234 2 13 4 45123 2 10 5 67890 2 10 6 78901 3 15 7 89012 3 15 8 90123 3 10 9 90123 3 10
Best regards,
DataSP
Apparently, I missed the requirement about handling ties, which was not stated, but appears in the output clearly.
In that case, I endorse the solution by @PeterClemmensen using PROC RANK. I would not advise programming this yourself in a data step, it's already done for you in PROC RANK.
data devil;
infile datalines dlm=' ' dsd;
input pincode order num_total ;
datalines;
12345 1 99
23451 1 1
34512 1 10
45123 2 10
51234 2 13
67890 2 10
78901 3 11
89012 3 10
90123 3 20
;
run;
proc sort data=devil out=temp;
by order descending num_total;
run;
data want;
set temp;
by order descending num_total notsorted;
if first.order then n=0;
if first.num_total then do;n+1;if n in (1 2) then output;end;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.