BookmarkSubscribeRSS Feed
wh95
Fluorite | Level 6

I am having trouble ranking a variable within groups. 

Here is my code:

proc rank data=MONTHRANK out=FLAGRANK ties=dense; /* Ranking UseFlag within each month with TIES=DENSE */
	by ID TYPE descending MonthGroup NumUseFlag;
    var NumUseFlag;
    ranks FlagRank;
run;

 

Here is the initial table

data MONTHRANK;
  length ID $12 Type $3 AdjustmentCode MonthGroup MonthRank NumUseFlag 8
  ;
  infile datalines dsd truncover;
  input ID -- NumUseFlag;
datalines;
123AAABBBC,001,,14,1,1
123AAABBBC,001,3,13,1,0
123AAABBBC,001,2,13,2,1
123AAABBBC,001,,13,3,1
123AAABBBC,001,26,12,1,1
123AAABBBC,001,,12,2,1
123AAABBBC,001,,11,1,1
123AAABBBC,001,,10,1,1
;

Here is what I want to output

data MONTHRANK_OUTPUT;
  length ID $12 Type $3 AdjustmentCode MonthGroup MonthRank NumUseFlag FlagRank 8
  ;
  infile datalines dsd truncover;
  input ID -- FlagRank;
datalines;
123AAABBBC,001,,14,1,1,1
123AAABBBC,001,3,13,1,0,1
123AAABBBC,001,2,13,2,1,1
123AAABBBC,001,,13,3,1,2
123AAABBBC,001,26,12,1,1,1
123AAABBBC,001,,12,2,1,2
123AAABBBC,001,,11,1,1,1
123AAABBBC,001,,10,1,1,1
;

Sorry if the code isn't exactly correct, I hope this is clear enough.

The error I am getting is "ERROR: Data set WORK.MONTHRANK is not sorted in ascending sequence. The current BY group has NumUseFlag = 1 and the next BY group has NumUseFlag = 0.", which kind of makes sense to me if there were another adjustment line above the line in the 13th month with the NumUseFlag = 1 instead of 0 and the AdjustmentCode = 26 (this is possible within the rest of the data), but I don't really see why that should necessarily matter and I'm unsure of how to get around the issue.

 

I have tried using descending, but I get the same error just saying it's not sorted in descending sequence. I have also tried using CLASS, but I don't think it works with PROC RANK.

 

9 REPLIES 9
quickbluefish
Barite | Level 11
Did you sort your data before using PROC RANK? In general, you cannot use a BY statement in a PROC or a DATA step if the incoming data are not already sorted in a way that matches your BY statement. Just use PROC SORT first.
wh95
Fluorite | Level 6

I'm sorry, I have edited my original post to also include the AdjustmentCode which is sorted in descending order.

proc sort data=MONTHRANK;
	by MBI PBP descending MonthGroup descending AdjustmentReasonCode;
run;

This is how MONTHRANK is sorted prior to attempting PROC RANK. I believe that there is a case in the data where, for example, a line above the UseFlag = 0 line has AdjustmentCode = 26 and NumUseFlag = 1, so that the order of NumUseFlag goes 1,0,1,1 within MonthGroup = 13.

Tom
Super User Tom
Super User

There is nothing to RANK if you include the VARiable to be ranked as part of the BY grouping.

Remove NUMUSEFLAG from the BY statement.

data MONTHRANK;
  input ID :$12. Type :$3. AdjustmentCode MonthGroup MonthRank NumUseFlag;
datalines;
123AAABBBC 001  . 14 1 1
123AAABBBC 001  3 13 1 0
123AAABBBC 001  2 13 2 1
123AAABBBC 001  . 13 3 1
123AAABBBC 001 26 12 1 1
123AAABBBC 001  . 12 2 1
123AAABBBC 001  . 11 1 1
123AAABBBC 001  . 10 1 1
;

proc sort data=MONTHRANK ;
  by ID TYPE descending MonthGroup NumUseFlag;
run;

/* Ranking UseFlag within each month with TIES=DENSE */
proc rank data=MONTHRANK out=FLAGRANK ties=dense; 
  by ID TYPE descending MonthGroup ;
  var NumUseFlag;
  ranks FlagRank;
run;

Tom_0-1738974094863.png

 

wh95
Fluorite | Level 6
Yes, thank you, I was able to find this answer, but it does not satisfy the question unfortunately. I would like to see the FlagRank column output 1,1,1,2,1,2,1,1 in the table you created. I want the grouping to be within each MonthGroup based on the NumUseFlag, but the data must also be sorted in the format you have shown.
Tom
Super User Tom
Super User

I do not understand.  What sorting do you mean?

If you want to sort data use PROC SORT.

wh95
Fluorite | Level 6

Below, I have modified the table from my first post.

Here is what I want to output

data MONTHRANK_OUTPUT;
  length ID $12 Type $3 AdjustmentCode MonthGroup MonthRank NumUseFlag FlagRank 8
  ;
  infile datalines dsd truncover;
  input ID -- FlagRank;
datalines;
123AAABBBC,001,,14,1,1,1
123AAABBBC,001,3,13,1,0,1
123AAABBBC,001,2,13,2,1,1
123AAABBBC,001,,13,3,1,2
123AAABBBC,001,26,12,1,1,1
123AAABBBC,001,3,12,2,0,1 123AAABBBC,001,,12,3,1,2 123AAABBBC,001,,11,1,1,1 123AAABBBC,001,,10,1,1,1 ;

 All I was saying was that I need AdjustmentCode to be sorted in descending order while also creating the FlagRank based on the NumUseFlag in each MonthGroup. I think that what I am asking to do is honestly just not possible in SAS, at least not without some very involved solution that I don't have enough experience to find. At the end of the day, I need to put the most recent data in the top most row in each MonthGroup. Generally, a higher AdjustmentCode means it is more recent, but certain codes, such as 3, mean that I do not want to use the data in that row at all. My idea is that by creating these rankings and flags, I can filter the data to what I actually want to use. The issue that I'm having is that there are cases where the NumUseFlag is not always sorted in a specific ascending/descending order within each MonthGroup. In the example output I gave, this is the case in MonthGroup 12.

Tom
Super User Tom
Super User

I do not follow.

Do you want to rank by the values of NUMUSEFLAG within the groups defined by MONTHGROUP?

You have 5 levels of MONTHGROUP and only 9 observations.  So not much to rank in this example.

Tom_0-1739318516555.png

 

 

Sounds more like the  issue is how to sort by something other than the values in your variables.  The answer to that is to make a new variable (or variables) that can be used to sort. 

 

For example if you want missing values sorted first  but the non missing values sorted descending here are a couple of options.

1) Make a new variable that indicates if the value is missing or not and include both variables when sorting.

 

2) Make a new variable with the negative of the current value and then sort ascending by that variable instead of sorting by the original variable.  So the missing values will still be first.  But since the largest positive value is the smallest negative value then sorting them in ascending order will get what you want.

 

If the goal is not to "rank" the values, but instead to just number them then once you have the observations in the order you want you can just use simple data step to number them.

wh95
Fluorite | Level 6

This is what I have

wh95_0-1739375324656.png

This is what I want

wh95_1-1739375351376.png

Tom
Super User Tom
Super User

But what is the RULE?

 

If the problem is you want to preserve the current order, but do not have vairables you can use to do that with PROC SORT, then add a new variable that does.

data step1;
   row+1;
   set have;
run;

proc sort data=step1;
  by monthgroup ;
run;

proc rank data=step1 out=want ties=dense;
  by monthgroup;
  var numuseflag;
  rank flagrank;
run;

proc sort data=want;
  by row;
run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1215 views
  • 0 likes
  • 3 in conversation