Obsidian | Level 7

## Determine if multiple variables combined statistically differ from each other

Hello,

I'm new to SAS's statistical procedures (I have SAS 9.4). Normally I compare only one variable at a time between a test and a control group to see whether they differ statistically, but now I want to combine multiple variables and see whether, grouped together, they differ between the test and control groups (or even amongst each other within the test group).

My variables are things like Region (West Coast, South, Northeast, etc.), age cohort (<30, 30-39, 40-49, etc.), and ADI (<25, 25-49, etc.). As an example, I want to know whether my population on the West Coast, aged 30-49, with an ADI of 25-49, differs statistically from any other combination in either the control or test group.

What is the best way/procedure to do this?

I thank you for any help/insight you can provide.

Example of data:

```sas
/* Sample data, read with column input (fixed column positions) */
data data_set;
   input Group $ 1-7 Region $ 8-10 Age_Cohort $ 11-16
         Key 17-20 Targeted 21-22 Gap_Closed 23-24;
   datalines;
Control WC 30-39 123 1 1
Target  WC 30-39 456 1 1
Control WC 30-39 789 1 0
Target  WC 30-39 012 1 1
Control WC 40-49 345 1 1
Target  WC 40-49 678 1 0
Control S  50-59 901 1 0
Target  S  50-59 234 1 0
Control S  60-69 567 1 1
Target  S  60-69 890 1 1
;
run;
```

Super User

## Re: Determine if multiple variables combined statistically differ from each other

PROC MULTTEST?

Diamond | Level 26

## Re: Determine if multiple variables combined statistically differ from each other

The problem with PROC MULTTEST here is that only one CLASS variable is allowed, while the problem has three CLASS variables, specifically GROUP, REGION, AGE_COHORT (and the text seems to indicate there are more than three, although the data set only has three).

So, the problem really seems to be a three-way (or higher) ANOVA, which can be run in PROC GLM (assuming certain conditions are met). The ADJUST= option of the LSMEANS statement would allow one of the different multiple comparison methods to be used.
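As a rough illustration of that suggestion, here is a minimal sketch. It assumes the variable names from the sample data step in the original post (`data_set`, `Group`, `Region`, `Age_Cohort`, `Gap_Closed`) and a full data set with enough observations in every cell for the effects to be estimable, which the ten-row sample does not have. `ADJUST=TUKEY` is just one of the available adjustment methods, chosen here as an example. Because `Gap_Closed` is a 0/1 indicator, it is also worth checking whether the usual ANOVA conditions are reasonable.

```sas
/* Sketch only: names follow the sample data in the original post.      */
/* Assumes the real data have enough rows per cell to estimate effects. */
proc glm data=data_set;
   /* Declare the categorical predictors */
   class Group Region Age_Cohort;
   /* The bar operator expands to all main effects and interactions */
   model Gap_Closed = Group|Region|Age_Cohort;
   /* Compare every Group x Region x Age_Cohort combination,            */
   /* with p-values adjusted for multiple comparisons (Tukey, as an     */
   /* example of one ADJUST= method)                                    */
   lsmeans Group*Region*Age_Cohort / pdiff adjust=tukey;
run;
quit;
```

The `PDIFF` option prints the matrix of pairwise p-values, so each combination (for example, Target / WC / 30-39) can be compared against every other combination.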

--
Paige Miller

Obsidian | Level 7

## Re: Determine if multiple variables combined statistically differ from each other

Yes, that is the case, and it's what I was thinking when I was typing the response to Reeza. I will now go research your suggestion: PROC GLM with the ADJUST= option.

Also, I do have more class variables, I was just limiting it for ease of sharing.

Thank you!

Obsidian | Level 7

## Re: Determine if multiple variables combined statistically differ from each other

Hi again,

I got busy with other projects and am finally returning to this one. Below is the code I ended up with. Fingers crossed I am on the right track?!

Based on the results, it looks like only target vs. control is significant (see attachments). I also ran a separate test with PROC FREQ to check region-level significance between target and control, and it looks like the West Coast targets were statistically different from the West Coast control group. So how exactly should I interpret the two different results, from PROC GLM and from the chi-square test in PROC FREQ? Am I not asking the right question with the MODEL statement in PROC GLM?

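```sas
proc glm data=unixwork.col_interim2 outstat=unixwork.stat_sig_testing;
   /* Categorical predictors must be listed in CLASS; otherwise GLM   */
   /* rejects character variables or treats numeric codes as          */
   /* continuous covariates                                           */
   class Bucket Region age_cohort ADI_cohort;
   model Quest_Closed_Gap = Bucket|Region|age_cohort|ADI_cohort / tolerance;
run;
quit;
```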

Bucket values are Target or Control.

Region values are West Coast or South.

Age cohorts are 55-59 or 60-64.

ADI cohorts are quartiles from 0 to 100.

```sas
proc sort data=unixwork.col_interim2;
   by region bucket Quest_Closed_Gap;
run;

/* Sum COL_TARGET for each region / bucket / outcome combination */
proc summary data=unixwork.col_interim2;
   var COL_TARGET;
   by region bucket Quest_Closed_Gap;
   output out=unixwork.region sum=;
run;

/*ods output PdiffCLs=pdiff;*/
/* Region - does target differ from control group? */
proc freq data=unixwork.region;
   by region;
   weight COL_TARGET;
   table bucket * Quest_Closed_Gap / chisq riskdiff;
   output out=unixwork.region_stat chisq;
run;

/* only outputs if statistically differs */
data unixwork.region_stat_sig;
   set unixwork.region_stat;
   by region;
   if P_PCHI < 0.05;
run;
/* west coast targets performed stat better than control; south was similar to control */
```

Thanks again!

Obsidian | Level 7

## Re: Determine if multiple variables combined statistically differ from each other

Thanks for your response. I'm reading up on how exactly to use this procedure, which has given me some follow-up questions.

Based on the way my data is laid out, would region, age cohort, ADI, etc. actually be groups rather than variables? The main measurement I have is gaps closed out of the number targeted (1 = yes, 0 = no), which would then be broken down by region, age, etc. within the test and control groups. The reason I ask is that one article I found states this:

"PROC MULTTEST does not provide closed tests, and therefore, caution is urged, in the following situations:
• Multiple comparisons of means involving three or more groups, using permutation resampling
• Multiple comparisons of binary variables involving three or more groups." https://pdfs.semanticscholar.org/93cd/57288bc50ce9bc99aef89f7cab9f61bc3bbb.pdf

I'm worried my data might fit that, unless I am just thinking about my data backwards. As I said, I'm new to these statistical procedures, so I need to read more about the test itself.

Thanks again.
