Hello
Lets say that have data set has wide structure than I know how to calculate -First_Smallest,Second_Smallest,Third_Smallest..
Let's say that have data set has ling structure.
What is the way to calculate-First_Smallest,Second_Smallest,Third_Smallest from this data set?
(Without change the structure from long to wide)?
Note-
First_Smallest is the value in smallest observation
Second_Smallest is the value in second smallest observation
Third_Smallest is the value in third smallest observation
data have1;
input CustID VAR $ mon1 mon2 mon3 mon4 mon5 mon6;
cards;
1 x 10 20 30 40 50 60
1 W 60 50 40 30 20 10
1 Z 30 30 30 0 0 0
2 x 30 20 10 10 40 0
;
Run;
data want1;
set have1;
First_Smallest=smallest(1,mon1,mon2,mon3,mon4,mon5,mon6) ;
Second_Smallest=smallest(2,mon1,mon2,mon3,mon4,mon5,mon6) ;
Third_Smallest=smallest(3,mon1,mon2,mon3,mon4,mon5,mon6) ;
Run;
data have2;
input CustID VAR $ month $ value;
cards;
1 x mon1 10
1 x mon2 20
1 x mon3 30
1 x mon4 40
1 x mon5 50
1 x mon6 60
1 w mon1 60
1 w mon2 50
1 w mon3 40
1 w mon4 30
1 w mon5 20
1 w mon6 10
1 z mon1 30
1 z mon2 30
1 z mon3 30
1 z mon4 0
1 z mon5 0
1 z mon6 0
2 x mon1 30
2 x mon2 20
2 x mon3 10
2 x mon4 10
2 x mon5 40
2 x mon6 0
;
run;
Answered in your earlier thread https://communities.sas.com/t5/SAS-Programming/calculate-second-smallest-via-proc-summary/m-p/932430...
Also PROC RANK (which you also know about)
Terminology: "First_Smallest is the value in smallest observation" is wrong usage; it is the value of a particular variable that is smallest. There is no such thing as a "smallest observation".
How about this?
proc sort data=have2;
by CustID Var;
run;
proc rank data=have2 out=want(where=(rank <= 3)) /* keep top 3 ranks */
ties=low;
by CustID Var;
var value;
ranks Rank;
run;
proc sort data=want;
by CustID Var Rank;
run;
proc print data=want;run;
If
Then: there is a single DATA step solution:
data want2 (drop=v);
array tmpval {6} _temporary_ (6*%sysfunc(constant(big)));
/*First pass */
do v=1 by 1 until (last.var);
set have2 ;
by custid var notsorted;
tmpval{v}=value;
end;
call sortn(of tmpval{*});
/*Second pass */
do until (last.var);
set have2 ;
by custid var notsorted;
rank=whichn(value,of tmpval{*});
output;
tmpval{rank}=constant('big');
end;
run;
This code accommodates up to 6 obs per custid/var. The initialization of the TMPVAL array to a set of %sysfunc(CONSTANT(BIG)) values helps with underpopulated (i.e. N<6) groups, because the sorting of values (between the first pass and second pass) keeps all observed values as the leftmost array elements (unlike initializing the array to missing values). That permits the WHICHN function to correctly determine the rank. That is also the reason why, once a rank is determined, the corresponding array element is reset to CONSTANT('BIG'), in preparation for the next group.
For the sake of future readers who come to this thread, you can read about three different methods for finding the largest (or smallest) values in each group: Display the largest values for each group - The DO Loop (sas.com)
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.