DATA Step, Macro, Functions and more

PROC RANK for mutiple by Values

Reply
Occasional Contributor
Posts: 17

PROC RANK for mutiple by Values

I'm attempting to rank a large dataset by multiple variables. At first I thought it was simple but in reviewing the output, I don't think I'm using the PROC RANK properly. Is it possible that the data having missing values is causing this to happen? I've been searching around and can't find anythat that relates to my specific issue. Any help is greatly appreciated!

 

The data I have is sorted like this.

 

P_IDCLASSR_COUNTAVG_VALCODE1CODE2CODE3CODE4CODE5CODE6CODE7CODE8CODE9CODE10
ID1A08305$33.70EERT01         
ID2A07557$34.28EERT01         
ID3A17039$34.68EERT01         
ID4A25478$34.30EERT01         
ID5A35239$36.66EERT01         
ID6A45213$36.03EERT01         
ID7A55080$33.96EERT01         
ID8A65033$33.70EERT01         
ID9A75031$33.70EERT01         
ID10A84776$33.70EERT01         
ID11A94740$33.70EERT01         
ID12A104505$35.10EERT01         
ID13A114503$34.21EERT01         
ID14A124443$35.54EERT01         
ID15A134416$34.07EERT01         
ID16A144305$33.78EERT01         
ID17A154266$33.70EERT01         
ID18A164233$33.70EERT01         
ID19A173950$34.91EERT01         
ID20A183903$35.95EERT01         
ID21A193890$36.04EERT01         
ID22A203878$33.70EERT01         

 

 

Here's the code I'm using to rank

 

proc rank data= have OUT= want ties=low descending ;
 sysecho 'Ranking p_ID R_COUNT';
 by CODE1-CODE10;
 var R_COUNT;
 ranks P_RANK;
run;

This is the output I get:

 

 

P_IDCLASSR_COUNTAVG_VALCODE1CODE2CODE3CODE4CODE5CODE6CODE7CODE8CODE9CODE10P_RANK
ID1A08305$33.70EERT01         1
ID2A07557$34.28EERT01         1
ID3A17039$34.68EERT01         1
ID4A25478$34.30EERT01         1
ID5A35239$36.66EERT01         1
ID6A45213$36.03EERT01         1
ID7A55080$33.96EERT01         1
ID8A65033$33.70EERT01         63
ID9A75031$33.70EERT01         63
ID10A84776$33.70EERT01         63
ID11A94740$33.70EERT01         63
ID12A104505$35.10EERT01         63
ID13A114503$34.21EERT01         63
ID14A124443$35.54EERT01         63
ID15A134416$34.07EERT01         63
ID16A144305$33.78EERT01         63
ID17A154266$33.70EERT01         63
ID18A164233$33.70EERT01         63
ID19A173950$34.91EERT01         1
ID20A183903$35.95EERT01         1
ID21A193890$36.04EERT01         3
ID22A203878$33.70EERT01         3

 

This is the output I want:

 

P_IDCLASSR_COUNTAVG_VALCODE1CODE2CODE3CODE4CODE5CODE6CODE7CODE8CODE9CODE10P_RANK
ID1A08305$33.70EERT01         1
ID2A07557$34.28EERT01         2
ID3A17039$34.68EERT01         3
ID4A25478$34.30EERT01         4
ID5A35239$36.66EERT01         5
ID6A45213$36.03EERT01         6
ID7A55080$33.96EERT01         7
ID8A65033$33.70EERT01         8
ID9A75031$33.70EERT01         9
ID10A84776$33.70EERT01         10
ID11A94740$33.70EERT01         11
ID12A104505$35.10EERT01         12
ID13A114503$34.21EERT01         13
ID14A124443$35.54EERT01         14
ID15A134416$34.07EERT01         15
ID16A144305$33.78EERT01         16
ID17A154266$33.70EERT01         17
ID18A164233$33.70EERT01         18
ID19A173950$34.91EERT01         19
ID20A183903$35.95EERT01         20
ID21A193890$36.04EERT01         21
ID22A203878$33.70EERT01         22

 

Super User
Posts: 19,789

Re: PROC RANK for mutiple by Values

Posted in reply to chrishull

You don't need PROC RANK but an enumerator. 

 

I'm not sure about your sort order, it doesn't seem to be only CODE1-CODE10. 

Once you've determined your sort the following is what you need. 

 

Proc sort data=have out=sorted;

by <sort variables>;

run;

 

Data want;

set sorted;

by <sort variables>;

 

retain rank 0;

rank+1;

 

run;

Super User
Posts: 11,343

Re: PROC RANK for mutiple by Values

Posted in reply to chrishull

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... shows how to create example data in the form of SAS datastep code that can be pasted here. This works better than pasting some table format that wants to overflow the display area of the forum and allows us to test code agains the data easier.

 

If you have multiple By variables it helps to show some data with more than one of them populated.

Ask a Question
Discussion stats
  • 2 replies
  • 132 views
  • 0 likes
  • 3 in conversation