uabcms
Calcite | Level 5
I am trying to output the top 80% of the values of a variable
within each by group.
Example: the top 80% of DRGs (by volume) across hospital service lines.
1 REPLY 1
deleted_user
Not applicable
The only tricky part of this process is getting the denominator for each by group and making it available to your unsummarised data. Since you are working with "by groups", I'll assume your data are already ordered in group sequence. If they aren't, sort the data first.
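
If the data do need sorting, a minimal sketch follows. It assumes the table is called SERVICES and the grouping variable is DRG, matching the names used in the rest of this reply.

[pre]
/* Sort the detail table by the grouping key so that By-group */
/* processing (First.DRG, match-merging) works in later steps. */
Proc Sort Data = SERVICES;
By DRG;
Run;
[/pre]

If the rows also need to be ranked within each group, a secondary key could be added to the By statement, for example By DRG Descending VOLUME; where VOLUME is a hypothetical variable name, not one taken from your data.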

Then summarise the data to get a frequency for each group. The Freq procedure suits this approach, and code like the following should suffice.

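[pre]
/* One row per DRG in DRGFREQ, with the row count in COUNT. */
/* NoPrint suppresses the printed frequency table.          */
Proc Freq Data = SERVICES NoPrint;
Tables DRG / Out = DRGFREQ;
Run;
[/pre]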

Now match-merge your two tables on the DRG key. The summary table will have a column called COUNT, which is the frequency of each DRG group. You can rename this on merging if a column of the same name already exists on your unsummarised data.
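
The merge might look something like the sketch below. It assumes both tables are in DRG order and that the merged result overwrites SERVICES, which is my reading of the step rather than something stated outright; write to a new table name if you want to keep the original.

[pre]
/* Assumption: SERVICES and DRGFREQ are both sorted by DRG. */
/* Every detail row picks up the COUNT for its DRG group.   */
Data SERVICES;
Merge SERVICES
      DRGFREQ (Keep = DRG COUNT);
By DRG;
Run;
[/pre]

If SERVICES already has a column called COUNT, the data set option could become DRGFREQ (Keep = DRG COUNT Rename = (COUNT = DRGCOUNT)), where DRGCOUNT is a hypothetical name. Any later step would then need to refer to DRGCOUNT instead of COUNT.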

Now split your data with code like the following:

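[pre]
/* SERVICES must be the merged table here, so COUNT is on every row. */
Data TOP80
     LAST20;
Set SERVICES;
By DRG;
/* Restart the running row count at the start of each DRG group. */
If First.DRG Then GROUPFREQ = 0;
/* Sum statement: adds 1 and retains GROUPFREQ across rows. */
GROUPFREQ + 1;
/* The first 80% of rows in each group go to TOP80, the rest to LAST20. */
If GROUPFREQ / COUNT <= 0.8 Then Output TOP80;
Else Output LAST20;
Run;
[/pre]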

I don't have a sample of your data or its structure, so the code above is provided as an example only.

Good luck.
