Help using Base SAS procedures

MACROs, loops or arrays to automate rates calculated from multiple columns

Reply
Occasional Contributor
Posts: 16

MACROs, loops or arrays to automate rates calculated from multiple columns

[ Edited ]

I'm not sure what the best approach to this would be, but I have a data set with columns named 'c21-c80' that contain denominators and numerators of certain measures of quality ('efficiency','timeliness','cost', etc). 

 

A measure could have multiple components depending on age group for example:

 

Efficiency (age 18-25): c22/c21

Efficiency (age 26-34): c24/c23

Efficiency (all ages): c22+c24/c21+c23

Cost (no age stratification): c26/c25

 

There's no pattern to which measures have multiple components and which are just one column/another column, so I was calculating measures separately by listing the exact columns.  However updates to the source data might render certain columns into different measures, so it gets confusing/inefficient changing each column manually.  

 

data msr_results;
		infile DATALINES;
  input company $ c21 c22 c23 c24 c25 c26;
  datalines;
  0001 100 89  220  180  1000 756
  0002 403 322 800  650  5000 3000
  0003 26  15  50   45   350  180
  0004 30  11  60   57   290  90
;

 

I am using something like this:

 

proc sql;
create table msr_results2 as
select company, (c22/c21*100) as rate, 'EFF' as msr_type, 'EFF_1' as sub from msr_results
union
select company, ((c22+c24)/(c21+c23)*100) as rate, 'EFF' as msr_type , 'EFF_2' as sub from msr_results
; quit;

 

This gives a basic summary data set with the rates and type of measure, and sub measure flags, grouped by company ID.  I'm wondering if there was a better/more efficient way to do this, given that the measure calculations aren't simply c22/21, c24/c23, etc, etc.

 

Capture.PNG

 

I thought of using a separate data set as a 'key' like this and turning some of the values into MACRO variables and then looping over the whole data set but am not sure how to approach that some measures have multiple columns in the calculations, so I thought of setting a level and then having separate statements for each level but that could be inefficient, if there were up to 10 columns, for example.  

 

data msr_key;
infile DATALINES;
input level msr_type $ sub $ order num_key $ num_key2 $ den_key $ den_key2 $;
datalines;
1	EFF EFF_1 1 c22 NA c21 NA 
2 EFF EFF_2 1 c22 c24 c21 c23
1 COS COS_1 2 c26 NA c25 NA 
;
run;

   

Super User
Posts: 21,562

Re: MACROs, loops or arrays to automate rates calculated from multiple columns

Posted in reply to appleorange

I've had to deal with that and chose to make my data in a long format. 

I added in the denominator factor as a Parent column so I could look it up via a join

I added in a GROUP variable to group variables together. 

 

This made it much easier to streamline my analysis and automating it going forward. 

 

This amount of work was only worth it because this data will be used multiple times over the next 6 years.

Occasional Contributor
Posts: 16

Re: MACROs, loops or arrays to automate rates calculated from multiple columns

Can you explain denominator factor and looking it up via a join? Do you mean which columns must be added if a measure has multiple columns?

It DOES make sense to convert the data into long format, I will try that.
Highlighted
Super User
Posts: 21,562

Re: MACROs, loops or arrays to automate rates calculated from multiple columns

Posted in reply to appleorange

c22/21, c24/c23

You stated those were some formulas

 

In this case for the records with C22, I have a column called ParentID which is the total, which would be C21

 

Ie

MeasureID MeasureDesc ParentID1 ParentID2
1 Total Population    
2 Population 0 to 14 1  
3 Population 0 to 4 1 2
4 Population 5 to 10 1 2

 

Ask a Question
Discussion stats
  • 3 replies
  • 172 views
  • 0 likes
  • 2 in conversation