Hello All,
I am trying to rank some data using Proc Rank and a group by clause but to be honest I don't even know where to start.
1. Each of the 4 ranking columns should be ranked within the following group by clause:
rank_ftp > group by system_id, reporting_date, ftp
rank_preprocess > group by system_id, reporting_date, preprocess
rank_calculation > group by system_id, reporting_date, calculation
rank_synchronisation > group by system_id, reporting_date, synchronisation
2. The rank should stay the same for rows where the group by clause returns exactly the same result.
Until now I came up with this (for the ftp part), but it gives me rank 0 for everything.
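/* Sort so that the BY groups are contiguous for PROC RANK */
proc sort data=WORK.LFS_VAA_BC_TMP out=work.sorted;
  by system_id reporting_date;
run;

/* Rank ftp within each system_id / reporting_date group */
proc rank data=work.sorted out=work.rankings;
  by system_id reporting_date;
  var ftp;
  ranks rank_ftp;
run;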
Thanks in advance.
That sounds like enumeration rather than ranking.
Enumeration is when you count records to identify groups.
Ranking is grouping data based on a variable's value so that, for example, the top 10 are together.
https://stats.idre.ucla.edu/sas/faq/how-can-i-create-an-enumeration-variable-by-groups/
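To make the distinction concrete, here is a minimal sketch of enumeration with BY-group processing. The dataset work.demo and the counter name seq are made up for illustration; only system_id, reporting_date and ftp come from the question, and treating ftp as a character file name is an assumption.

```sas
/* Hypothetical toy data, not the poster's real table */
data work.demo;
  input system_id reporting_date :date9. ftp :$10.;
  format reporting_date date9.;
  datalines;
1 01JAN2020 fileA
1 01JAN2020 fileA
1 01JAN2020 fileB
1 02JAN2020 fileC
2 01JAN2020 fileD
;
run;

proc sort data=work.demo out=work.demo_sorted;
  by system_id reporting_date;
run;

/* Enumeration: number the records inside each BY group */
data work.enumerated;
  set work.demo_sorted;
  by system_id reporting_date;
  /* reset at the start of every system_id / reporting_date group */
  if first.reporting_date then seq = 0;
  /* sum statement: seq is retained across rows and incremented */
  seq + 1;
run;
```

The counter depends only on the position of a record inside its group, not on the size of any value, which is what separates it from a rank.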
Can you include some data so we can replicate your issue?
@MihaiViju2 wrote:
Until now I came up with this (for the ftp part), but it gives me rank 0 for everything.
Your results also show a rank of 1/2, so I'm confused. With the GROUPS= option, SAS numbers the ranks from 0 to the number of groups minus 1. You didn't specify a ranking method either: what type of ranks do you want to calculate? Deciles, quartiles, groups of 3?
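If ranks by value are what is wanted, one possible sketch is PROC RANK with the TIES=DENSE option, which gives tied values the same rank and leaves no gaps between ranks. This assumes the four columns are numeric (PROC RANK only ranks numeric variables) and that a larger value means a later attempt; neither is confirmed in the thread.

```sas
/* Sketch only: assumes ftp, preprocess, calculation and        */
/* synchronisation are numeric, and that work.sorted is sorted  */
/* by system_id and reporting_date as in the original post.     */
proc rank data=work.sorted out=work.rankings ties=dense;
  by system_id reporting_date;
  var ftp preprocess calculation synchronisation;
  ranks rank_ftp rank_preprocess rank_calculation rank_synchronisation;
run;
```

Without GROUPS=, the ranks start at 1 inside each BY group, and rows with the same value in a column share the same rank for that column.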
I added the data in the Excel sheet. If you remove the ranking columns, then you have the data I am working on.
A bit of background story:
This is an ETL processing street framework. Basically, a DSR file comes in via ftp, is preprocessed, then calculated, and then synchronized. For every failed attempt to load the file, a new record is created for the new attempt. The failure can occur anywhere in the chain. So in my case I have rank 1 everywhere for ftp because I am always using the same file. If a new file is delivered, then the ranking will be 2 (as long as the system_id and reporting_date are the same).
My unique identifiers are the system_id and reporting_date. The ranking should start again from 1 when a new file is delivered or a new reporting_date is encountered.
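One way to read this requirement is as a counter that goes up by one each time the ftp value changes inside a system_id / reporting_date group. The following is a minimal sketch for the ftp column only, assuming that work.sorted (from the first post) keeps the attempts in arrival order within each group and that a change in ftp means a new file; both are assumptions.

```sas
/* Sketch only: counts changes of ftp within each group.        */
/* NOTSORTED lets first.ftp fire whenever ftp differs from the  */
/* previous row, without requiring ftp to be in sorted order.   */
data work.rankings;
  set work.sorted;
  by system_id reporting_date ftp notsorted;
  /* restart the counter for each system_id / reporting_date */
  if first.reporting_date then rank_ftp = 0;
  /* a new ftp value (a new file) bumps the counter */
  if first.ftp then rank_ftp + 1;
run;
```

With this approach a file that shows up again after a different file would get a new number rather than its old one. The other three columns would need their own counters, for example by comparing each value with the retained value from the previous row.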