Statistical Procedures

Programming the statistical procedures from SAS
Toni2
Lapis Lazuli | Level 10

I have 2 variables, OURBAND and SURBAND, and want to use PROC RANK on them.

Below are the PROC FREQ results for the variables and their ranks.

I don't understand why PROC RANK creates only 1 rank for OURBAND (which has 5 different values), while SURBAND, with 9 different values, gets 5 ranks.


OURBAND
OURBAND  Frequency  Percent  Cumulative Frequency  Cumulative Percent
1        20         0.67     20                    0.67
2        92         3.07     112                   3.73
3        42         1.4      154                   5.13
4        7          0.23     161                   5.37
5        2839       94.63    3000                  100

SURBAND
SURBAND  Frequency  Percent  Cumulative Frequency  Cumulative Percent
1        998        33.27    998                   33.27
2        13         0.43     1011                  33.7
3        32         1.07     1043                  34.77
4        1102       36.73    2145                  71.5
5        185        6.17     2330                  77.67
6        248        8.27     2578                  85.93
7        330        11       2908                  96.93
8        37         1.23     2945                  98.17
9        55         1.83     3000                  100

Rank for Variable OURBAND
rank1    Frequency  Percent  Cumulative Frequency  Cumulative Percent
0        3000       100      3000                  100

Rank for Variable SURBAND
rank2    Frequency  Percent  Cumulative Frequency  Cumulative Percent
0        998        33.27    998                   33.27
3        1147       38.23    2145                  71.5
7        433        14.43    2578                  85.93
8        330        11       2908                  96.93
9        92         3.07     3000                  100

 

proc rank data=test_1 group=10 out=check_rank ties=low;
var &varnum;
ranks rank1-rank&vmcnt; 
run;

proc freq data=check_rank;
table ourband surband rank1 rank2;
run;
PaigeMiller
Diamond | Level 26

I don't know why PROC RANK is doing this either. You didn't show us the original data in data set TEST_1. A useful debugging technique is for you to look at data set TEST_1 with your own eyes, and see if you can figure it out.

 

If that doesn't help, then please show us (a portion of) the original data in data set TEST_1 as SAS data step code, which you can type in yourself, or by following these instructions.

 

@Toni2 don't make us ask to see your data. From now on, please just show it to us without us asking, in the form requested above. Thanks!

--
Paige Miller
Toni2
Lapis Lazuli | Level 10

Thanks. I can't see anything strange in TEST_1, which contains the raw data.

The data step I used to create TEST_1 is below. I have not made any calculations.

 

data test_1;
set input.raw_data;
keep &listvar;
run;

Below is a small part of TEST_1:

 

SURBAND OURBAND
9 3
6 5
4 5
4 5
9 5
4 5
4 5
7 5
5 5
7 5
7 5
6 3
4 5
6 5
6 2
5 5
3 3
7 5
7 5
7 5
6 5
7 5
5 5
9 3
5 5
5 5
PaigeMiller
Diamond | Level 26

PROC RANK is usually for continuous values. You have discrete values of OURBAND and SURBAND. It really doesn't make sense to try to rank these.

 

If you want to know if there are more 5s than 1s in a variable that takes discrete values, you would use PROC FREQ and not PROC RANK.

--
Paige Miller
Rick_SAS
SAS Super FREQ

I think you need to get rid of the GROUPS=10 option. That option is telling the procedure to combine values that have different ranks into a single group, which is not what you want.

 

For more about the GROUPS= option, see https://blogs.sas.com/content/sgf/2019/07/19/how-the-rank-procedure-calculates-ranks-with-groups-and...

 

Look at the following example to see what the GROUPS=10 option is doing for your data:

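data A;
/* Expand each (value, count) row into COUNT observations. */
/* The data lines use spaces as delimiters; pasted tab characters */
/* would need INFILE DATALINES EXPANDTABS to be read correctly.   */
input outband count;
do i = 1 to count;
   Cnt + 1;   /* running observation number, used as the X axis below */
   output;
end;
drop i;
datalines;
1 20
2 92
3 42
4 7
5 2839
;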

proc rank data=A out=rankLow ties=LOW;
   var outband;
   ranks rankLow;
run;
Title "Ranking";
proc sgplot data=rankLow;
scatter x=Cnt y=rankLow;
run;

proc rank data=A out=rankLowGroup ties=LOW GROUPS=10;
   var outband;
   ranks rankLow;
run;
Title "GROUPS=10 Option";
proc sgplot data=rankLowGroup;
scatter x=Cnt y=rankLow;
run;
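
Applied to the program in the original post, dropping the option might look like the sketch below. It assumes the macro variables &varnum and &vmcnt are already defined as in that post; the LIST cross-tabulation is only an added convenience for checking which rank each value received.

```sas
/* Sketch only: same call as the original post, minus GROUPS=10.      */
/* &varnum and &vmcnt are the macro variables from that post and are  */
/* assumed to be defined before this step runs.                       */
proc rank data=test_1 out=check_rank ties=low;
   var &varnum;
   ranks rank1-rank&vmcnt;
run;

/* Show each original value next to the rank it was assigned. */
proc freq data=check_rank;
   tables ourband*rank1 surband*rank2 / list;
run;
```

With TIES=LOW and no GROUPS= option, each distinct value should get its own rank (the smallest rank in its block of ties), so OURBAND would show 5 distinct ranks and SURBAND 9.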

 

Toni2
Lapis Lazuli | Level 10
Thanks for your response. I tried to run it but got the error below (sorry, I am not very experienced).

ERROR: Invalid DO loop control information, either the INITIAL or TO expression is missing or the BY expression is missing, zero, or invalid.
Rick_SAS
SAS Super FREQ

I assume you tried to cut and paste my example and accidentally did not copy all of the program. Try it again and be more careful.

StatDave
SAS Super FREQ
Intuitively speaking, only 161 of the 3000 observations have an OURBAND value less than 5. The 161st value (in order) is only at about the 5th percentile. If you run: proc univariate; var ourband; run; you will see that the 10th percentile is already 5. So all 161 of those observations fall in the first of the deciles you requested with GROUPS=10 (which is given rank value 0). Observations 162 through 300 (which have value 5) are also in the first decile, and since those values are tied with all of the remaining observations (also with value 5), everything gets put in the first decile by GROUPS=10.
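
The same result can be checked by hand from the frequency tables. The sketch below assumes the rule documented for the GROUPS= option, group = FLOOR(rank*k/(n+1)), where rank is the TIES=LOW rank of the value, k=10 and n=3000. The data set and variable names (freqs, groups, varname, lowrank) are made up for illustration; the counts are copied from the PROC FREQ output in the first post.

```sas
/* Frequency tables from the first post, one row per distinct value. */
data freqs;
   length varname $8;
   input varname $ value count;
   datalines;
OURBAND 1 20
OURBAND 2 92
OURBAND 3 42
OURBAND 4 7
OURBAND 5 2839
SURBAND 1 998
SURBAND 2 13
SURBAND 3 32
SURBAND 4 1102
SURBAND 5 185
SURBAND 6 248
SURBAND 7 330
SURBAND 8 37
SURBAND 9 55
;

data groups;
   set freqs;
   by varname;
   retain n 3000 k 10;            /* n = nonmissing obs, k = GROUPS= value  */
   if first.varname then cum = 0; /* restart the running count per variable */
   lowrank = cum + 1;             /* TIES=LOW: smallest rank in the tie     */
   group = floor(lowrank * k / (n + 1));
   cum + count;                   /* observations seen so far               */
run;

proc print data=groups noobs;
   var varname value count lowrank group;
run;
```

For OURBAND the largest low rank is 162 (the first of the 5s), and floor(162*10/3001) = 0, so every value lands in group 0. For SURBAND the groups come out as 0, 3, 3, 3, 7, 7, 8, 9, 9, which matches the rank2 frequencies in the first post (998, 1147, 433, 330, 92).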
