Comparing means between several groups
03-29-2017 08:40 AM

Hello,

I am asking for your help.

I have 3 quantitative variables, and the observations are divided into 4 groups.

I intend to check whether there is any difference between the groups for each variable.

My difficulty with ANOVA is that it indicates that one of the groups is different, but it does not indicate which one.

Can you please advise what kind of test to use in order to find out which group differs from the others? Should I use TTEST in the following way: testing one group against the other 3 groups together (treated as a single group)?

I hope my question is clear...

Thanks a lot!


All Replies


Posted in reply to yael

03-29-2017 09:23 AM

Use the MEANS statement in PROC ANOVA or PROC GLM.

--

Paige Miller
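
A minimal sketch of what the MEANS statement suggestion could look like, assuming the data set and variable names that appear later in this thread (`sasuser.sasfile010916`, `stage`, `ROA1`). The choice of Tukey's method is an assumption made here for illustration; the reply does not name a specific comparison method.

```sas
/* One-way ANOVA with pairwise follow-up comparisons */
proc glm data=sasuser.sasfile010916;
  class stage;                  /* grouping variable, values 1-4 */
  model ROA1 = stage;           /* one response variable at a time */
  means stage / tukey cldiff;   /* Tukey-adjusted pairwise differences with confidence limits */
run;
quit;
```

The pairwise output shows which groups differ from which, which the overall F test alone does not.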


Posted in reply to PaigeMiller

03-29-2017 10:17 AM

@PaigeMiller - thanks for your answer!

But if I want to test it with PROC TTEST, I want to compare **1 group** against **ALL 3 OTHER GROUPS TOGETHER**. So, what is wrong with the following code?

proc ttest data=sasuser.sasfile010916;

where stage=1;

var ROA1 ROA2 ROA3 ROA4 ROE1 ROE2 ROIC Q LNQ ;

run;

Note: the groups are defined by the variable STAGE, which takes the values 1, 2, 3, 4. How can I treat groups 2-4 together as one group?

Thanks


Posted in reply to yael

03-29-2017 10:29 AM

You want to use the CONTRAST statement in PROC GLM (not PROC ANOVA).

If you scroll down to the last example in the CONTRAST statement documentation, it shows how to compare a control group to the average of the others, which, if I understand you properly, is what you are trying to do.

--

Paige Miller
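
A sketch of how such a contrast might be written for the four-group case, assuming the levels of `stage` sort as 1, 2, 3, 4 so that the first coefficient belongs to stage 1. The labels and the added ESTIMATE statement are illustrative choices, not taken from the reply.

```sas
proc glm data=sasuser.sasfile010916;
  class stage;
  model ROA1 = stage;
  /* F test of H0: mean(stage 1) = average of the means of stages 2, 3, 4 */
  contrast 'Stage 1 vs mean of 2,3,4' stage 3 -1 -1 -1;
  /* Same comparison reported as an estimated difference; DIVISOR= rescales 3 -1 -1 -1 to 1 -1/3 -1/3 -1/3 */
  estimate 'Stage 1 minus mean of 2,3,4' stage 3 -1 -1 -1 / divisor=3;
run;
quit;
```

Integer coefficients avoid rounding 1/3 by hand; the CONTRAST F test is unchanged when all coefficients are multiplied by the same nonzero constant.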


Solution

03-30-2017 03:55 AM


Posted in reply to yael

03-29-2017 01:58 PM

One way to group values for analysis is to use a custom format. This has an advantage that you do not need to add variables.

Then use the CLASS statement to indicate the variable used to define the classification groups for the test.

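proc format library=work;

value firstgroup 1 = '1' 2,3,4 = '2-4';

value secgroup 2 = '2' 1,3,4 = '1,3,4';

run;

/* Stage 1 vs. stages 2-4 combined */

proc ttest data=sasuser.sasfile010916;

class stage;

format stage firstgroup.; /* originally posted as "format state"; corrected per the follow-up below */

var ROA1 ROA2 ROA3 ROA4 ROE1 ROE2 ROIC Q LNQ;

run;

/* Stage 2 vs. stages 1, 3, 4 combined */

proc ttest data=sasuser.sasfile010916;

class stage;

format stage secgroup.; /* originally posted as "format state" */

var ROA1 ROA2 ROA3 ROA4 ROE1 ROE2 ROIC Q LNQ;

run;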
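
One way to confirm that a format collapses the groups as intended is to tabulate the variable with the format applied. A small sketch, assuming the PROC FORMAT step above has already been run in the same session:

```sas
/* With the format attached, PROC FREQ should report two rows: '1' and '2-4' */
proc freq data=sasuser.sasfile010916;
  tables stage / missing;     /* MISSING lists missing values of stage, if any */
  format stage firstgroup.;
run;
```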


Posted in reply to ballardw

03-30-2017 01:58 AM

@ballardw Thanks for your answer. You meant **format stage firstgroup** and not **format state firstgroup**, right?


Posted in reply to yael

03-30-2017 11:43 AM

yael wrote:

format stage firstgroup and not format state firstgroup, right?

Correct


Posted in reply to ballardw

03-30-2017 10:55 AM

ballardw wrote:

One way to group values for analysis is to use a custom format. This has an advantage that you do not need to add variables.

Then use the CLASS statement to indicate the variable used to define the classification groups for the test.

proc format library=work;

value firstgroup 1 = '1' 2,3,4 = '2-4';

value secgroup 2 = '2' 1,3,4 = '1,3,4';

run;

proc ttest data=sasuser.sasfile010916;

class stage;

format stage firstgroup.; /* originally posted as "format state"; the poster confirmed "stage" was intended */

var ROA1 ROA2 ROA3 ROA4 ROE1 ROE2 ROIC Q LNQ;

run;

proc ttest data=sasuser.sasfile010916;

class stage;

format stage secgroup.; /* originally posted as "format state" */

var ROA1 ROA2 ROA3 ROA4 ROE1 ROE2 ROIC Q LNQ;

run;

Hello, @ballardw, it seems to me that this doesn't give the same answer as a CONTRAST statement in a PROC GLM analysis, because this PROC TTEST uses a different error term. As I understand it, the proper error term is the analysis-of-variance (PROC GLM) error term, not the one derived by this PROC TTEST. The PROC GLM analysis accounts for the group-to-group differences and keeps them out of the error term, whereas the PROC TTEST approach treats the differences among the combined groups as part of the overall error term.

What do you think?

--

Paige Miller


Posted in reply to PaigeMiller

03-30-2017 11:46 AM

@PaigeMiller agreed.

The example was mostly to show the use of a format to create groups. Since the OP mentioned TTEST, I used that as the example.

It is sort of a "know thy data" step before picking the analysis technique.
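
To see the difference between the two approaches on the same data, both can be run side by side for one response variable. This is a sketch under assumptions: `stage` has exactly the four levels 1-4, they sort in that order, and the format name `twovsrest` is made up here for illustration.

```sas
/* Collapse stage into "2" versus everything else without adding a variable */
proc format;
  value twovsrest 2 = '2' other = 'others';
run;

/* Two-group t test: differences among stages 1, 3, 4 end up in the error term */
proc ttest data=sasuser.sasfile010916;
  class stage;
  format stage twovsrest.;
  var ROA1;
run;

/* One-way ANOVA contrast: tested against the within-group error (MSE) only */
proc glm data=sasuser.sasfile010916;
  class stage;
  model ROA1 = stage;
  contrast 'Stage 2 vs mean of 1,3,4' stage -1 3 -1 -1;
run;
quit;
```

Comparing the p-value from the pooled t test with the one from the CONTRAST row shows how much the choice of error term matters for a given data set.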


Posted in reply to PaigeMiller

03-31-2017 08:24 AM

Dear @PaigeMiller

I wonder if you can help me with the contrast statement,

I want to compare stage 2 against stages 1, 3, 4, and 5 together.

The code is:

proc glm data=sasuser.sasfile010916;

class stage;

model ROA1=stage;

means stage/deponly;

contrast 'Compare 2 vs 1,3,4,5 together' stage 1 1 -4 1 1;

run;

Am I right?


Posted in reply to yael

03-31-2017 08:27 AM


Following the example in the CONTRAST statement documentation, you'd want

`contrast 'Compare 2 vs 1,3,4,5' stage -0.25 1 -0.25 -0.25 -0.25;`

--

Paige Miller
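
The coefficients in that statement can be read off from the comparison itself. A short derivation, assuming the five levels of STAGE sort as 1 to 5 (so the second coefficient belongs to stage 2) and writing $\mu_g$ for the mean of stage $g$ (notation introduced here, not from the thread):

$$
L \;=\; \mu_2 - \frac{\mu_1 + \mu_3 + \mu_4 + \mu_5}{4}
  \;=\; -\tfrac14\,\mu_1 + 1\cdot\mu_2 - \tfrac14\,\mu_3 - \tfrac14\,\mu_4 - \tfrac14\,\mu_5,
\qquad \sum_{g=1}^{5} c_g = 0 .
$$

Multiplying every coefficient by the same nonzero constant, for example to `-1 4 -1 -1 -1`, tests the same hypothesis $L = 0$ and gives the same F statistic. What matters is the position of each coefficient: `1 1 -4 1 1` places the $-4$ on the third level, so under the sort order assumed here it would compare stage 3, not stage 2, with the others.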


Posted in reply to PaigeMiller

03-31-2017 09:00 AM

@PaigeMiller Thanks for your fast reply, and Shabbat Shalom.


Posted in reply to yael

04-03-2017 02:27 AM

Dear @PaigeMiller, I understand that I should use PROC GLM instead of TTEST because of the error terms (when I compare more than 2 groups). I really want to understand this in depth. Can you recommend a specific topic that I should look up in order to understand it better? Thanks


Posted in reply to yael

04-03-2017 07:39 AM

The area of study is called "multiple comparisons." I would start with the PROC GLM documentation. If you want a book, I highly recommend "Multiple Comparisons and Multiple Tests Using SAS" by Westfall, Tobias, and Wolfinger.
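
As one entry point into that documentation, the LSMEANS statement in PROC GLM supports multiplicity-adjusted comparisons. A minimal sketch, assuming the data set and variables used earlier in this thread and treating stage 1 as the reference group (an arbitrary choice made here for illustration):

```sas
proc glm data=sasuser.sasfile010916;
  class stage;
  model ROA1 = stage;
  /* Dunnett-adjusted comparisons of every other stage with stage 1 */
  lsmeans stage / pdiff=control('1') adjust=dunnett cl;
run;
quit;
```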


Posted in reply to yael

04-03-2017 08:01 AM

With all due respect to @Rick_SAS, I don't think that book is the place to look for the answer, because I don't consider the question posed in this thread to be a multiple comparison question. I'm not sure what reading I would recommend.

The bottom line is that PROC TTEST treats the analysis as if there are only two groups: the group of interest and all the other groups combined. So the group-to-group variability among "all the other groups combined" is considered by PROC TTEST to be random variation and goes into the error term of the analysis. In PROC GLM, the difference between all 5 groups is specifically accounted for as group-to-group variability, and thus it is not lumped into the error term. The comparison of group 2 to the average of all other groups is tested against this PROC GLM error term, which is typically smaller than the error term used by PROC TTEST, because the PROC GLM error term does not include the group-to-group variability (as the PROC TTEST error term does).

--

Paige Miller
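
The argument above can be written out in a few lines. This is a sketch in notation introduced here (not from the thread): $k$ groups with sizes $n_g$ and sample means $\bar y_g$, $N$ observations in total, and group $j$ compared with the pooled remainder, whose overall sample mean is $\bar y_C$.

$$
\mathrm{SSE} = \sum_{g=1}^{k}\sum_{i=1}^{n_g}\left(y_{gi}-\bar y_g\right)^2,
\qquad
\mathrm{MSE}_{\mathrm{GLM}} = \frac{\mathrm{SSE}}{N-k},
$$

$$
s^2_{\mathrm{pooled}} \;=\; \frac{\mathrm{SSE} \;+\; \sum_{g\neq j} n_g\left(\bar y_g-\bar y_C\right)^2}{N-2}.
$$

The second formula is the pooled (equal-variance) variance of the two-group t test. Its extra sum is exactly the between-group variation among the combined groups, so the more those groups differ from one another, the larger the t test's error term becomes relative to the ANOVA mean squared error. The contrast with coefficients $c_g$ is instead tested with

$$
t = \frac{\sum_{g} c_g\,\bar y_g}{\sqrt{\mathrm{MSE}_{\mathrm{GLM}}\,\sum_{g} c_g^2/n_g}},
\qquad F = t^2 ,
$$

on $N-k$ error degrees of freedom. Note also that the t test compares $\bar y_j$ with the pooled mean $\bar y_C$, which weights each group by its size, while the contrast compares it with the unweighted average of the group means; the two coincide only when the combined groups have equal sizes.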
