I'm attempting to rank a large dataset by multiple variables. At first I thought it was simple but in reviewing the output, I don't think I'm using the PROC RANK properly. Is it possible that the data having missing values is causing this to happen? I've been searching around and can't find anythat that relates to my specific issue. Any help is greatly appreciated!
The data I have is sorted like this.
P_ID | CLASS | R_COUNT | AVG_VAL | CODE1 | CODE2 | CODE3 | CODE4 | CODE5 | CODE6 | CODE7 | CODE8 | CODE9 | CODE10 |
ID1 | A0 | 8305 | $33.70 | EERT01 | |||||||||
ID2 | A0 | 7557 | $34.28 | EERT01 | |||||||||
ID3 | A1 | 7039 | $34.68 | EERT01 | |||||||||
ID4 | A2 | 5478 | $34.30 | EERT01 | |||||||||
ID5 | A3 | 5239 | $36.66 | EERT01 | |||||||||
ID6 | A4 | 5213 | $36.03 | EERT01 | |||||||||
ID7 | A5 | 5080 | $33.96 | EERT01 | |||||||||
ID8 | A6 | 5033 | $33.70 | EERT01 | |||||||||
ID9 | A7 | 5031 | $33.70 | EERT01 | |||||||||
ID10 | A8 | 4776 | $33.70 | EERT01 | |||||||||
ID11 | A9 | 4740 | $33.70 | EERT01 | |||||||||
ID12 | A10 | 4505 | $35.10 | EERT01 | |||||||||
ID13 | A11 | 4503 | $34.21 | EERT01 | |||||||||
ID14 | A12 | 4443 | $35.54 | EERT01 | |||||||||
ID15 | A13 | 4416 | $34.07 | EERT01 | |||||||||
ID16 | A14 | 4305 | $33.78 | EERT01 | |||||||||
ID17 | A15 | 4266 | $33.70 | EERT01 | |||||||||
ID18 | A16 | 4233 | $33.70 | EERT01 | |||||||||
ID19 | A17 | 3950 | $34.91 | EERT01 | |||||||||
ID20 | A18 | 3903 | $35.95 | EERT01 | |||||||||
ID21 | A19 | 3890 | $36.04 | EERT01 | |||||||||
ID22 | A20 | 3878 | $33.70 | EERT01 |
Here's the code I'm using to rank
proc rank data= have OUT= want ties=low descending ;
sysecho 'Ranking p_ID R_COUNT';
by CODE1-CODE10;
var R_COUNT;
ranks P_RANK;
run;
This is the output I get:
P_ID | CLASS | R_COUNT | AVG_VAL | CODE1 | CODE2 | CODE3 | CODE4 | CODE5 | CODE6 | CODE7 | CODE8 | CODE9 | CODE10 | P_RANK |
ID1 | A0 | 8305 | $33.70 | EERT01 | 1 | |||||||||
ID2 | A0 | 7557 | $34.28 | EERT01 | 1 | |||||||||
ID3 | A1 | 7039 | $34.68 | EERT01 | 1 | |||||||||
ID4 | A2 | 5478 | $34.30 | EERT01 | 1 | |||||||||
ID5 | A3 | 5239 | $36.66 | EERT01 | 1 | |||||||||
ID6 | A4 | 5213 | $36.03 | EERT01 | 1 | |||||||||
ID7 | A5 | 5080 | $33.96 | EERT01 | 1 | |||||||||
ID8 | A6 | 5033 | $33.70 | EERT01 | 63 | |||||||||
ID9 | A7 | 5031 | $33.70 | EERT01 | 63 | |||||||||
ID10 | A8 | 4776 | $33.70 | EERT01 | 63 | |||||||||
ID11 | A9 | 4740 | $33.70 | EERT01 | 63 | |||||||||
ID12 | A10 | 4505 | $35.10 | EERT01 | 63 | |||||||||
ID13 | A11 | 4503 | $34.21 | EERT01 | 63 | |||||||||
ID14 | A12 | 4443 | $35.54 | EERT01 | 63 | |||||||||
ID15 | A13 | 4416 | $34.07 | EERT01 | 63 | |||||||||
ID16 | A14 | 4305 | $33.78 | EERT01 | 63 | |||||||||
ID17 | A15 | 4266 | $33.70 | EERT01 | 63 | |||||||||
ID18 | A16 | 4233 | $33.70 | EERT01 | 63 | |||||||||
ID19 | A17 | 3950 | $34.91 | EERT01 | 1 | |||||||||
ID20 | A18 | 3903 | $35.95 | EERT01 | 1 | |||||||||
ID21 | A19 | 3890 | $36.04 | EERT01 | 3 | |||||||||
ID22 | A20 | 3878 | $33.70 | EERT01 | 3 |
This is the output I want:
P_ID | CLASS | R_COUNT | AVG_VAL | CODE1 | CODE2 | CODE3 | CODE4 | CODE5 | CODE6 | CODE7 | CODE8 | CODE9 | CODE10 | P_RANK |
ID1 | A0 | 8305 | $33.70 | EERT01 | 1 | |||||||||
ID2 | A0 | 7557 | $34.28 | EERT01 | 2 | |||||||||
ID3 | A1 | 7039 | $34.68 | EERT01 | 3 | |||||||||
ID4 | A2 | 5478 | $34.30 | EERT01 | 4 | |||||||||
ID5 | A3 | 5239 | $36.66 | EERT01 | 5 | |||||||||
ID6 | A4 | 5213 | $36.03 | EERT01 | 6 | |||||||||
ID7 | A5 | 5080 | $33.96 | EERT01 | 7 | |||||||||
ID8 | A6 | 5033 | $33.70 | EERT01 | 8 | |||||||||
ID9 | A7 | 5031 | $33.70 | EERT01 | 9 | |||||||||
ID10 | A8 | 4776 | $33.70 | EERT01 | 10 | |||||||||
ID11 | A9 | 4740 | $33.70 | EERT01 | 11 | |||||||||
ID12 | A10 | 4505 | $35.10 | EERT01 | 12 | |||||||||
ID13 | A11 | 4503 | $34.21 | EERT01 | 13 | |||||||||
ID14 | A12 | 4443 | $35.54 | EERT01 | 14 | |||||||||
ID15 | A13 | 4416 | $34.07 | EERT01 | 15 | |||||||||
ID16 | A14 | 4305 | $33.78 | EERT01 | 16 | |||||||||
ID17 | A15 | 4266 | $33.70 | EERT01 | 17 | |||||||||
ID18 | A16 | 4233 | $33.70 | EERT01 | 18 | |||||||||
ID19 | A17 | 3950 | $34.91 | EERT01 | 19 | |||||||||
ID20 | A18 | 3903 | $35.95 | EERT01 | 20 | |||||||||
ID21 | A19 | 3890 | $36.04 | EERT01 | 21 | |||||||||
ID22 | A20 | 3878 | $33.70 | EERT01 | 22 |
You don't need PROC RANK but an enumerator.
I'm not sure about your sort order, it doesn't seem to be only CODE1-CODE10.
Once you've determined your sort the following is what you need.
Proc sort data=have out=sorted;
by <sort variables>;
run;
Data want;
set sorted;
by <sort variables>;
retain rank 0;
rank+1;
run;
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... shows how to create example data in the form of SAS datastep code that can be pasted here. This works better than pasting some table format that wants to overflow the display area of the forum and allows us to test code agains the data easier.
If you have multiple By variables it helps to show some data with more than one of them populated.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.