Help using Base SAS procedures

Rank on Groups?

Accepted Solution Solved
Reply
Regular Contributor
Posts: 164
Accepted Solution

Rank on Groups?

Hi

I am using Base SAS 9.3 on a Windows 7 computer.

I have become quite adept at creating data sets in SAS but leave them at the record level, export them and report on the data sets with other software.  For this exercise, I need to group and rank the data which is totally out of my depth so hopefully someone can help.

My data is where one record is one hospital visit which could be ED or inpatient.  I've created a field called Type which identifies ED or IP. The unique identifier per patient is chart number.

I need to come up with a single rank per chart number which will consist of the ED visits rank + inpatient visits rank + total length of stay rank.  So I need to summarize the data by each of the criteria (# of ED visits, # of IP visits and # of inpatient days) and then rank each which is simply sorting in ascending order and assigning a rank based on where that chart number appears in the list for that criteria.

So if chart # 12345 is 45 on the ED visits summary, 28 on the inpatient visits summary and 100 on the inpatient days summary then that chart number's total rank is 173.  Ideally I'd like to end up with a list by chart number showing rank of ED visits, rank of inpatient visits and rank of inpatient days plus total rank.

Is this possible?  Thanks for any and all assistance.


Accepted Solutions
Solution
‎02-05-2013 10:00 AM
Regular Contributor
Posts: 164

Re: Rank on Groups?

Hi

Because I had inpatient and ED visits all in one file, my first step was to create an ED only subset with the data elements I wanted.  From there I used proc summary to get total ED visits and then a formula to rank and order them.  Then I combined each subset together for one table showing health card number (this is the more unique value), each of the visits/los, and a rank and order for each element which then also allowed me to sum them.  So I'm not sure it's right but it worked.

Below is my code for ED visits.

/*ED Visits*/
data ED_Activity;
set data_final;
if type = "ED" then output;
run;

proc summary data=ED_activity;
var ed_visits;
class HCN_NoVC;
output out=ed_visits
          sum(ed_visits)=ttl_ed
       ;
run;
proc sort data=ed_visits; by ttl_ed; run;

data ED;
set ed_visits;
  /*by HCN_NoVC descending ttl_ed;*/
ed_order+1-ed_order*first.HCN_NoVC;
ed_rank+ifn(ttl_ed=lag(ttl_ed),0,1)-ed_rank*first.HCN_NoVC;
run;

View solution in original post


All Replies
Frequent Contributor
Frequent Contributor
Posts: 94

Re: Rank on Groups?

You can use proc rank to get ranks based on multiple variables.

See http://support.sas.com/documentation/cdl/en/proc/65145/HTML/default/viewer.htm#n18s1uw2tqvtxdn1chay1...

Super User
Posts: 19,815

Re: Rank on Groups?

Yes it's possible. Smiley Happy

I think  you need to break this down into steps.

1. Calculate the number of visits per chart and type (Proc freq or proc means) - 2 datasets out, one for ED, one for IP

2. Calculate the number of inpatient days per chartno (proc means) 1 dataset out, for LOS

3. Rank them (either a datastep or proc rank) 3 datasets, IP, ED, LOS

4. Merge results together by chartno to calculate the total rank, which will als include your list by chart and ranks. Make sure in the merge you get all the chartno and aren't dropping any because they're not in one file!

Good luck!

Regular Contributor
Posts: 164

Re: Rank on Groups?

Wow this group is fast to respond...thanks so much!

VD, I reviewed that support document prior to posting and my issue is that there isn't a number field for ED or IP visit, just a field of Type which indicates IP or ED.  Of course I guess I could create two number fields for these where if critiera fits then 1 so that is an option.

Reeza:  thanks for breaking it into steps for me, it makes sense.  My only issue is that I have never used Rank or Proc Means or Proc Freq so not sure where to start on that.

Thanks again everyone.

Super User
Posts: 19,815

Re: Rank on Groups?

Search Lexjansen.com for some basic papers on proc freq and proc means would be my suggestion.

Your output is fairly basic so it should be relatively straightforward.

The examples in the documentation for these procedures are also useful.

Regular Contributor
Posts: 164

Re: Rank on Groups?

Hi

Can someone please give me a starter?  Where I'm confused is because I want it to group on chart number and sum ed visits (or use the where statement of where type = ED).  But I don't see how to do that in the samples provided online because their examples don't have many, many extra columns of data like I do, only the ones they will be using in their summary.

Thanks.

Super User
Posts: 19,815

Re: Rank on Groups?

You don't need all your extra columns though, so you could reduce it down Smiley Happy.

Can you post a small sample of what your data would look like in a datastep?

Solution
‎02-05-2013 10:00 AM
Regular Contributor
Posts: 164

Re: Rank on Groups?

Hi

Because I had inpatient and ED visits all in one file, my first step was to create an ED only subset with the data elements I wanted.  From there I used proc summary to get total ED visits and then a formula to rank and order them.  Then I combined each subset together for one table showing health card number (this is the more unique value), each of the visits/los, and a rank and order for each element which then also allowed me to sum them.  So I'm not sure it's right but it worked.

Below is my code for ED visits.

/*ED Visits*/
data ED_Activity;
set data_final;
if type = "ED" then output;
run;

proc summary data=ED_activity;
var ed_visits;
class HCN_NoVC;
output out=ed_visits
          sum(ed_visits)=ttl_ed
       ;
run;
proc sort data=ed_visits; by ttl_ed; run;

data ED;
set ed_visits;
  /*by HCN_NoVC descending ttl_ed;*/
ed_order+1-ed_order*first.HCN_NoVC;
ed_rank+ifn(ttl_ed=lag(ttl_ed),0,1)-ed_rank*first.HCN_NoVC;
run;

Super User
Posts: 19,815

Re: Rank on Groups?

That looks mostly ok, my only question is the sum(ed_visits). Assuming each chartno had 1 visit per line, wouldn't it have been n(ed_visits) instead?  If ed_visits was 1 per count then it would have been fine as well Smiley Happy

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 326 views
  • 6 likes
  • 3 in conversation