BookmarkSubscribeRSS Feed
Stu1979
Calcite | Level 5

I have a doubt that I believe has an easy solution but need your help to find it!

I have a table with 8 columns. the first is my patientID, the next 6 are profile patient information with categorical values (eg: location, gender, age, disease, etc) and the last column is a continuous vAriable that tells me how much they paid in the hospital. I need to perform two pieces of analysis:

  1) identify those groups (based on all the possible combinations) that maximize, by patient, the total spent

  2) same as 1) but creating a rule stating that i only want groups with more than x patients.

how can i do this in sas eg/em/base?

tks.

Stu

1 REPLY 1
Astounding
PROC Star

First, let's define what the problem asks for.  I would interpret this as finding the groups with the largest average spent.  If that's not right, you'll have to explain what you mean by "maximize, by patient, the total spent".

Second, this is an easy task for SAS in theory.  You can get the average spent for every possible group easily:

proc summary data=have missing;

   class location gender age disease /* plus 2 more variables not named in the problem */;

   var amount_paid;

   output out=stats mean=avg_paid;

run;

The output data set STATS will even contain _FREQ_, holding the number of patients in the group.  So applying rules about minimum group size is easy.

The trick will be whether your machine has enough memory to compute statistics for all groups at the same time.  If you don't run out of memory, the continuation is easy:

proc sort data=stats;

   by descending avg_paid;

run;

proc print data=stats (obs=50);

run;

You will need to learn a few things about the CLASS statement:  how to translate from _TYPE_ to the group definition, and how the CLASS statement handles missing values (assuming that your data actually contains some missing values).

If you do run out of memory, a more complex strategy would be necessary.  But this is a good place to start.

Good luck.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 1 reply
  • 1205 views
  • 0 likes
  • 2 in conversation