BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
elif
Fluorite | Level 6

Hello,

In the “allvar” dataset, variables divi, rd, and sin take values of either 0 or 1; variable divo takes values -1 or 0. I am using this data set to create portfolios for each date (newdatadate in my case).

I am using PROC RANK and group them into 5 before creating portfolios.

 

My Code:

 

OPTIONS MPRINT SYMBOLGEN;

%LET avars = divi divo rd sin ;    * Note: (I have more than 60 variables here);

%LET n_avars=%SYSFUNC(countw(&avars));

 

%PUT &n_avars;

 

%LET grp = 5;

PROC SORT DATA = allvar;

BY newdatadate;

RUN;

 

PROC RANK DATA=allvar OUT=allvar_ranked group=&grp; 

BY  newdatadate;  VAR &avars;

RUN;

 

I think, in the allvar_ranked dataset, variables divi, divo, rd, and sin should have only two different ranked values since they already had taken two values before they were ranked. However, it’s not the case.

 

When I run the following (before running proc rank):

proc sql; select distinct divi from allvar; quit; * Values: ., 0, 1         ;

proc sql; select distinct divo from allvar; quit; * Values: ., 0,-1         ;

proc sql; select distinct rd   from allvar; quit; * Values: 0, 1            ;

proc sql; select distinct sin  from allvar; quit; * Values: ., 0, 1         ;

 

When I run the same in the ranked dataset:

proc sql; select distinct divi from allvar_ranked; quit; * Ranked Values: ., 0, 2, 3, 4    ;

proc sql; select distinct divo from allvar_ranked; quit; * Ranked Values: ., 0, 1, 2, 3    ;

proc sql; select distinct rd   from allvar_ranked; quit; * Ranked Values: 1, 2, 4          ;

proc sql; select distinct sin  from allvar_ranked; quit; * Ranked Values: ., 2, 4          ;

 

I was using this command previously and have never encountered this problem.

How can I fix this?

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
elif
Fluorite | Level 6

My goal is to create portfolios for each data date based on these rankings/groups so I will be able to compare the best group and the worst group for each variable.

Hence, for each date, I will assign 5 ranks to each variable separately.

I just wanted to handle the ranking procedure in one step, but I guess I should run proc rank for the rest of my variables with 5 groups.

Then, I  can use proc rank again on the ranked dataset with two groups for only divi, divo, rd, and sin.

 

 

 

View solution in original post

7 REPLIES 7
PGStats
Opal | Level 21

I don't get it. How would you want proc rank to generate 5 groups out of two values?

PG
elif
Fluorite | Level 6

Hi,

I have other variables along with these four. These are the only variables that take two values.

ballardw
Super User

@elif wrote:

Hi,

I have other variables along with these four. These are the only variables that take two values.


I think that you may want to include your BY variable in the summary because that effects things;

Example of some data ranked with and without by variable.

data junk;
   do i= 1 to 50;
      byvar = mod(i,10);
      x= rand('integer',2) -1;
      y= rand('integer',2) -1;
      z= rand('integer',2) -2;
      r= rand('integer',2) -2;
      output;
   end;
run;

proc sort data=junk;
   by byvar;
run;

proc freq data=junk;
   title "Before ranking";
   tables x y z r;
run;

proc rank data=junk out=ranked1 group=5;
   var x y z r;
run;
proc freq data=ranked1;
   title "After ranking without by variable";
   tables x y z r;
run;

proc rank data=junk out=ranked2 group=5;
   by byvar;
   var x y z r;
run;

proc freq data=ranked2;
   title "After ranking with by variable";
   tables byvar *(x y z r) / list;
run;title;

Snap shot result: Without BY variable the ranks are two values for each variable.

With By variable the ranks vary across the entire data set BUT within each by group there are only two values but the actual rank values can vary for different values of the by variable.

From the same starting data.

PGStats
Opal | Level 21

But proc rank treats each variable separately. It doesn't combine variables to create groups.

What is your goal? Maybe you should be looking for clustering procedures.

PG
elif
Fluorite | Level 6

My goal is to create portfolios for each data date based on these rankings/groups so I will be able to compare the best group and the worst group for each variable.

Hence, for each date, I will assign 5 ranks to each variable separately.

I just wanted to handle the ranking procedure in one step, but I guess I should run proc rank for the rest of my variables with 5 groups.

Then, I  can use proc rank again on the ranked dataset with two groups for only divi, divo, rd, and sin.

 

 

 

PGStats
Opal | Level 21

Suppose that variable X is your performance criteria for determining which item (stock?) belongs to the best group and the worst group, seems to me that X is the variable that should be ranked

 

proc rank data=myData out=myRanks fraction;
by date;
var X;
ranks Xrank;
run;

then you could explore the relationship between Xrank and your explanatory variables. Proc hpsplit could help in that exploration, too determine which variable values are associated with low and high ranks.

 

hth

PG
elif
Fluorite | Level 6

Update:

I have solved the problem after learning more about this procedure via your help.

Thank you so much!

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1353 views
  • 5 likes
  • 3 in conversation