Ranking by a secondary variable

Reply
Contributor
Posts: 35

Ranking by a secondary variable

I have a question regarding ranking (proc rank) by a secondary variable (i.e. dealing with ties):

I have branches that I want ranked based on a variable (var1), but I want the ties resolved based on a secondary variable (var2). This idea is very similar to sorting by more than one variable (i.e. have two variables in the class statement). However I cannot seem to find something like this with proc rank.

Currently I am using proc rank to rank based on var1, then I have a data step to manually sort out the ties based on the secondary variable.

I asked on the Twitter group, but the guys there don't seem to understand my problem.

I would be glad for any help

Super Contributor
Posts: 644

Re: Ranking by a secondary variable

Proc Rank treats each variable to be ranked as a separate exercise, which is why you are having to do a follow up data step (presumably with a sort as well) to mediate the ties.

If you used a view before running proc rank that created a single variable out of the two you currently use you could rank on that in one pass (always assuming that ties are not a problem in the second variable). Some thing like this (assuming var1 and var2 are numeric and less than 999,999,999,999)

Proc SQL ;

     Create view temp_rank as

          Select  Put (var1, z12.) || ' ' || Put (var2, z12.) as new_var

               ,     *

          From have

               ;

Quit ;

Proc rank data = temp_rank ;

     var new_var ;

run ;

[untested code - add in whatever Rank options you are using]

Richard

Respected Advisor
Posts: 3,799

Re: Ranking by a secondary variable

I'm not sure I understand either.  Show example data, "I have this and I need that" works well usually makes the need pretty clear.

Given what I think I understand it sounds like maybe you should just sort by the two variables and create and index variable.  For example... note that Mary and William are still tied.

proc sort data=sashelp.class out=class;
   by age height;
   run;
data class;
   set class;
   by age height;
  
if first.height then _index_ + 1;
  
run;
proc print;
  
run;
Contributor
Posts: 35

Re: Ranking by a secondary variable

Thnx for the reply. i do appreciate it.

Ok, so lets hope this explains my problem:

data branches;

input branchcode type Volume Rate ;

cards;

1001    1    10    0.1

1012    1    20    0.3

1291    1    15    0.3

2008    2    10    0.2

2343    2    20    0.2

2455    2    20    0.4

;

run;

proc sort data=branches;

by type Branchcode Volume decending Rate;

run;

proc rank data=branches out=branches_ranked ties=low descending;

by type;

var rate;

ranks branch_rank;

run;

proc sort data=branches_ranked;

by type branch_rank decending Volume;

run;

proc print data=branches_ranked;

run;

This gives output :

                branch_
                    ObsbranchcodetypeVolumeRate  rank

                     1    1012     1   20   0.3   1
                     2    1291     1   15   0.3   1
                     3    1001     1   10   0.1   3
                     4    2455     2   20   0.4   1
                     5    2343     2   20   0.2   2
                     6    2008     2   10   0.2   2

So the problem is the ties...What I want is actually to resolve the ties using the Volume column. Hence branch 1291 would get a rank 2, based on the volume and branch 2008 will get a rank 3 based on its volume. This is why I stated that this is similar to sorting based on two variables. I wrote a data step (as I stated before) which does exactly what I want. But from your answers I gather there is no way to do this in proc rank itself.

Thnx for the answers though, I am going to try what you suggested anyway

Ask a Question
Discussion stats
  • 3 replies
  • 695 views
  • 1 like
  • 3 in conversation