I'm attempting to rank a large dataset by multiple variables. At first I thought it was simple but in reviewing the output, I don't think I'm using the PROC RANK properly. Is it possible that the data having missing values is causing this to happen? I've been searching around and can't find anythat that relates to my specific issue. Any help is greatly appreciated!
The data I have is sorted like this.
P_ID | CLASS | R_COUNT | AVG_VAL | CODE1 | CODE2 | CODE3 | CODE4 | CODE5 | CODE6 | CODE7 | CODE8 | CODE9 | CODE10 |
ID1 | A0 | 8305 | $33.70 | EERT01 | |||||||||
ID2 | A0 | 7557 | $34.28 | EERT01 | |||||||||
ID3 | A1 | 7039 | $34.68 | EERT01 | |||||||||
ID4 | A2 | 5478 | $34.30 | EERT01 | |||||||||
ID5 | A3 | 5239 | $36.66 | EERT01 | |||||||||
ID6 | A4 | 5213 | $36.03 | EERT01 | |||||||||
ID7 | A5 | 5080 | $33.96 | EERT01 | |||||||||
ID8 | A6 | 5033 | $33.70 | EERT01 | |||||||||
ID9 | A7 | 5031 | $33.70 | EERT01 | |||||||||
ID10 | A8 | 4776 | $33.70 | EERT01 | |||||||||
ID11 | A9 | 4740 | $33.70 | EERT01 | |||||||||
ID12 | A10 | 4505 | $35.10 | EERT01 | |||||||||
ID13 | A11 | 4503 | $34.21 | EERT01 | |||||||||
ID14 | A12 | 4443 | $35.54 | EERT01 | |||||||||
ID15 | A13 | 4416 | $34.07 | EERT01 | |||||||||
ID16 | A14 | 4305 | $33.78 | EERT01 | |||||||||
ID17 | A15 | 4266 | $33.70 | EERT01 | |||||||||
ID18 | A16 | 4233 | $33.70 | EERT01 | |||||||||
ID19 | A17 | 3950 | $34.91 | EERT01 | |||||||||
ID20 | A18 | 3903 | $35.95 | EERT01 | |||||||||
ID21 | A19 | 3890 | $36.04 | EERT01 | |||||||||
ID22 | A20 | 3878 | $33.70 | EERT01 |
Here's the code I'm using to rank
proc rank data= have OUT= want ties=low descending ;
sysecho 'Ranking p_ID R_COUNT';
by CODE1-CODE10;
var R_COUNT;
ranks P_RANK;
run;
This is the output I get:
P_ID | CLASS | R_COUNT | AVG_VAL | CODE1 | CODE2 | CODE3 | CODE4 | CODE5 | CODE6 | CODE7 | CODE8 | CODE9 | CODE10 | P_RANK |
ID1 | A0 | 8305 | $33.70 | EERT01 | 1 | |||||||||
ID2 | A0 | 7557 | $34.28 | EERT01 | 1 | |||||||||
ID3 | A1 | 7039 | $34.68 | EERT01 | 1 | |||||||||
ID4 | A2 | 5478 | $34.30 | EERT01 | 1 | |||||||||
ID5 | A3 | 5239 | $36.66 | EERT01 | 1 | |||||||||
ID6 | A4 | 5213 | $36.03 | EERT01 | 1 | |||||||||
ID7 | A5 | 5080 | $33.96 | EERT01 | 1 | |||||||||
ID8 | A6 | 5033 | $33.70 | EERT01 | 63 | |||||||||
ID9 | A7 | 5031 | $33.70 | EERT01 | 63 | |||||||||
ID10 | A8 | 4776 | $33.70 | EERT01 | 63 | |||||||||
ID11 | A9 | 4740 | $33.70 | EERT01 | 63 | |||||||||
ID12 | A10 | 4505 | $35.10 | EERT01 | 63 | |||||||||
ID13 | A11 | 4503 | $34.21 | EERT01 | 63 | |||||||||
ID14 | A12 | 4443 | $35.54 | EERT01 | 63 | |||||||||
ID15 | A13 | 4416 | $34.07 | EERT01 | 63 | |||||||||
ID16 | A14 | 4305 | $33.78 | EERT01 | 63 | |||||||||
ID17 | A15 | 4266 | $33.70 | EERT01 | 63 | |||||||||
ID18 | A16 | 4233 | $33.70 | EERT01 | 63 | |||||||||
ID19 | A17 | 3950 | $34.91 | EERT01 | 1 | |||||||||
ID20 | A18 | 3903 | $35.95 | EERT01 | 1 | |||||||||
ID21 | A19 | 3890 | $36.04 | EERT01 | 3 | |||||||||
ID22 | A20 | 3878 | $33.70 | EERT01 | 3 |
This is the output I want:
P_ID | CLASS | R_COUNT | AVG_VAL | CODE1 | CODE2 | CODE3 | CODE4 | CODE5 | CODE6 | CODE7 | CODE8 | CODE9 | CODE10 | P_RANK |
ID1 | A0 | 8305 | $33.70 | EERT01 | 1 | |||||||||
ID2 | A0 | 7557 | $34.28 | EERT01 | 2 | |||||||||
ID3 | A1 | 7039 | $34.68 | EERT01 | 3 | |||||||||
ID4 | A2 | 5478 | $34.30 | EERT01 | 4 | |||||||||
ID5 | A3 | 5239 | $36.66 | EERT01 | 5 | |||||||||
ID6 | A4 | 5213 | $36.03 | EERT01 | 6 | |||||||||
ID7 | A5 | 5080 | $33.96 | EERT01 | 7 | |||||||||
ID8 | A6 | 5033 | $33.70 | EERT01 | 8 | |||||||||
ID9 | A7 | 5031 | $33.70 | EERT01 | 9 | |||||||||
ID10 | A8 | 4776 | $33.70 | EERT01 | 10 | |||||||||
ID11 | A9 | 4740 | $33.70 | EERT01 | 11 | |||||||||
ID12 | A10 | 4505 | $35.10 | EERT01 | 12 | |||||||||
ID13 | A11 | 4503 | $34.21 | EERT01 | 13 | |||||||||
ID14 | A12 | 4443 | $35.54 | EERT01 | 14 | |||||||||
ID15 | A13 | 4416 | $34.07 | EERT01 | 15 | |||||||||
ID16 | A14 | 4305 | $33.78 | EERT01 | 16 | |||||||||
ID17 | A15 | 4266 | $33.70 | EERT01 | 17 | |||||||||
ID18 | A16 | 4233 | $33.70 | EERT01 | 18 | |||||||||
ID19 | A17 | 3950 | $34.91 | EERT01 | 19 | |||||||||
ID20 | A18 | 3903 | $35.95 | EERT01 | 20 | |||||||||
ID21 | A19 | 3890 | $36.04 | EERT01 | 21 | |||||||||
ID22 | A20 | 3878 | $33.70 | EERT01 | 22 |
You don't need PROC RANK but an enumerator.
I'm not sure about your sort order, it doesn't seem to be only CODE1-CODE10.
Once you've determined your sort the following is what you need.
Proc sort data=have out=sorted;
by <sort variables>;
run;
Data want;
set sorted;
by <sort variables>;
retain rank 0;
rank+1;
run;
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... shows how to create example data in the form of SAS datastep code that can be pasted here. This works better than pasting some table format that wants to overflow the display area of the forum and allows us to test code agains the data easier.
If you have multiple By variables it helps to show some data with more than one of them populated.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.