Technique that maximizes expenses by group

11-27-2013 08:30 AM

I have a question that I believe has an easy solution, but I need your help to find it!

I have a table with 8 columns. The first is my patientID, the next 6 are patient profile variables with categorical values (e.g. location, gender, age, disease) and the last column is a continuous variable that tells me how much they paid in the hospital. I need to perform two pieces of analysis:

1) Identify the groups (based on all possible combinations) that maximize, by patient, the total spent.

2) Same as 1), but with a rule stating that I only want groups with more than x patients.

How can I do this in SAS EG/EM/Base?

Thanks.

Stu

Posted in reply to Stu1979

11-27-2013 10:10 AM

First, let's define what the problem asks for. I would interpret this as finding the groups with the largest average amount spent per patient. If that's not right, you'll have to explain what you mean by "maximize, by patient, the total spent".

Second, this is an easy task for SAS in theory. You can get the average spent for every possible group easily:

proc summary data=have missing;
   class location gender age disease /* plus 2 more variables not named in the problem */;
   var amount_paid;
   output out=stats mean=avg_paid;
run;

The output data set STATS will even contain _FREQ_, holding the number of patients in the group. So applying rules about minimum group size is easy.
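As a rough illustration of the minimum-group-size rule, here is a small sketch on invented toy data. The data values, the macro variable min_n (standing in for "x"), and the output name STATS_MIN are all assumptions made for the example, and only two of the six class variables are used to keep it short.

```sas
/* Toy data, invented purely for illustration */
data have;
   input patientID location $ gender $ amount_paid;
   datalines;
1 North F 120
2 North F 300
3 North M 80
4 South F 500
5 South M 50
6 South M 70
;
run;

/* Hypothetical threshold: keep groups with more than 1 patient */
%let min_n = 1;

/* Average paid for every combination of the class variables */
proc summary data=have missing;
   class location gender;
   var amount_paid;
   output out=stats mean=avg_paid;
run;

/* Keep only the groups whose patient count exceeds the threshold */
data stats_min;
   set stats;
   where _freq_ > &min_n;
run;
```

Note that _FREQ_ counts every observation in the group, including any with a missing amount. If that matters, adding n=n_paid to the OUTPUT statement would give the count of nonmissing amounts instead.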

The trick will be whether your machine has enough memory to compute statistics for all groups at the same time. If you don't run out of memory, the continuation is easy:

proc sort data=stats;
   by descending avg_paid;
run;

proc print data=stats (obs=50);
run;

You will need to learn a few things about the CLASS statement: how to translate from _TYPE_ to the group definition, and how the CLASS statement handles missing values (assuming that your data actually contains some missing values).
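One way to make _TYPE_ easier to read is the CHARTYPE option, which stores _TYPE_ as a string of 0/1 flags, one per CLASS variable in the order listed. The sketch below assumes the same placeholder names as above (HAVE, amount_paid, and the four class variables that were named); the GROUP_DEF variable and the STATS_LABELED data set are hypothetical names added for illustration.

```sas
/* CHARTYPE makes _TYPE_ a character string such as '1010' */
proc summary data=have missing chartype;
   class location gender age disease;
   var amount_paid;
   output out=stats mean=avg_paid;
run;

data stats_labeled;
   set stats;
   length group_def $60;
   /* Position k of _TYPE_ is '1' when the k-th CLASS variable defines the group */
   if substr(_type_, 1, 1) = '1' then group_def = catx('*', group_def, 'location');
   if substr(_type_, 2, 1) = '1' then group_def = catx('*', group_def, 'gender');
   if substr(_type_, 3, 1) = '1' then group_def = catx('*', group_def, 'age');
   if substr(_type_, 4, 1) = '1' then group_def = catx('*', group_def, 'disease');
   /* All flags are 0 on the overall (all patients) row */
   if missing(group_def) then group_def = '(all patients)';
run;
```

This also helps with missing values. With the MISSING option, a missing class value is treated as a valid group level, so a blank in the output can mean either "this variable is not part of the group" or "this group is the missing level". The _TYPE_ flag tells the two cases apart. Without MISSING, observations that have a missing value in any class variable are left out of the analysis.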

If you do run out of memory, a more complex strategy would be necessary. But this is a good place to start.
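One possible fallback, offered only as a sketch and not necessarily the strategy meant above, is to ask for fewer combinations at a time with the WAYS statement. The choice of 1-, 2- and 3-way combinations and the output name STATS_LOW are assumptions for illustration.

```sas
/* Request only groups defined by 1, 2 or 3 class variables at a time */
proc summary data=have missing;
   class location gender age disease /* plus 2 more variables not named in the problem */;
   ways 1 2 3;
   var amount_paid;
   output out=stats_low mean=avg_paid;
run;
```

The higher-order combinations could then be run in a separate step (for example, ways 4 5 6;) and the results appended, which may help if the full set of groups is too large to handle at once.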

Good luck.