🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Fluorite | Level 6

## Problem with PROC RANK

Hello,

In the “allvar” dataset, variables divi, rd, and sin take values of either 0 or 1; variable divo takes values -1 or 0. I am using this data set to create portfolios for each date (newdatadate in my case).

I am using PROC RANK and group them into 5 before creating portfolios.

My Code:

OPTIONS MPRINT SYMBOLGEN;

%LET avars = divi divo rd sin ;    * Note: (I have more than 60 variables here);

%LET n_avars=%SYSFUNC(countw(&avars));

%PUT &n_avars;

%LET grp = 5;

PROC SORT DATA = allvar;

RUN;

PROC RANK DATA=allvar OUT=allvar_ranked group=&grp;

BY  newdatadate;  VAR &avars;

RUN;

I think, in the allvar_ranked dataset, variables divi, divo, rd, and sin should have only two different ranked values since they already had taken two values before they were ranked. However, it’s not the case.

When I run the following (before running proc rank):

proc sql; select distinct divi from allvar; quit; * Values: ., 0, 1         ;

proc sql; select distinct divo from allvar; quit; * Values: ., 0,-1         ;

proc sql; select distinct rd   from allvar; quit; * Values: 0, 1            ;

proc sql; select distinct sin  from allvar; quit; * Values: ., 0, 1         ;

When I run the same in the ranked dataset:

proc sql; select distinct divi from allvar_ranked; quit; * Ranked Values: ., 0, 2, 3, 4    ;

proc sql; select distinct divo from allvar_ranked; quit; * Ranked Values: ., 0, 1, 2, 3    ;

proc sql; select distinct rd   from allvar_ranked; quit; * Ranked Values: 1, 2, 4          ;

proc sql; select distinct sin  from allvar_ranked; quit; * Ranked Values: ., 2, 4          ;

I was using this command previously and have never encountered this problem.

How can I fix this?

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Fluorite | Level 6

## Re: Problem with PROC RANK

My goal is to create portfolios for each data date based on these rankings/groups so I will be able to compare the best group and the worst group for each variable.

Hence, for each date, I will assign 5 ranks to each variable separately.

I just wanted to handle the ranking procedure in one step, but I guess I should run proc rank for the rest of my variables with 5 groups.

Then, I  can use proc rank again on the ranked dataset with two groups for only divi, divo, rd, and sin.

7 REPLIES 7
Opal | Level 21

## Re: Problem with PROC RANK

I don't get it. How would you want proc rank to generate 5 groups out of two values?

PG
Fluorite | Level 6

## Re: Problem with PROC RANK

Hi,

I have other variables along with these four. These are the only variables that take two values.

Super User

## Re: Problem with PROC RANK

@elif wrote:

Hi,

I have other variables along with these four. These are the only variables that take two values.

I think that you may want to include your BY variable in the summary because that effects things;

Example of some data ranked with and without by variable.

```data junk;
do i= 1 to 50;
byvar = mod(i,10);
x= rand('integer',2) -1;
y= rand('integer',2) -1;
z= rand('integer',2) -2;
r= rand('integer',2) -2;
output;
end;
run;

proc sort data=junk;
by byvar;
run;

proc freq data=junk;
title "Before ranking";
tables x y z r;
run;

proc rank data=junk out=ranked1 group=5;
var x y z r;
run;
proc freq data=ranked1;
title "After ranking without by variable";
tables x y z r;
run;

proc rank data=junk out=ranked2 group=5;
by byvar;
var x y z r;
run;

proc freq data=ranked2;
title "After ranking with by variable";
tables byvar *(x y z r) / list;
run;title;
```

Snap shot result: Without BY variable the ranks are two values for each variable.

With By variable the ranks vary across the entire data set BUT within each by group there are only two values but the actual rank values can vary for different values of the by variable.

From the same starting data.

Opal | Level 21

## Re: Problem with PROC RANK

But proc rank treats each variable separately. It doesn't combine variables to create groups.

What is your goal? Maybe you should be looking for clustering procedures.

PG
Fluorite | Level 6

## Re: Problem with PROC RANK

My goal is to create portfolios for each data date based on these rankings/groups so I will be able to compare the best group and the worst group for each variable.

Hence, for each date, I will assign 5 ranks to each variable separately.

I just wanted to handle the ranking procedure in one step, but I guess I should run proc rank for the rest of my variables with 5 groups.

Then, I  can use proc rank again on the ranked dataset with two groups for only divi, divo, rd, and sin.

Opal | Level 21

## Re: Problem with PROC RANK

Suppose that variable X is your performance criteria for determining which item (stock?) belongs to the best group and the worst group, seems to me that X is the variable that should be ranked

``````proc rank data=myData out=myRanks fraction;
by date;
var X;
ranks Xrank;
run;``````

then you could explore the relationship between Xrank and your explanatory variables. Proc hpsplit could help in that exploration, too determine which variable values are associated with low and high ranks.

hth

PG
Fluorite | Level 6

## Re: Problem with PROC RANK

Update:

I have solved the problem after learning more about this procedure via your help.

Thank you so much!

Discussion stats
• 7 replies
• 1228 views
• 5 likes
• 3 in conversation