canolden
Calcite | Level 5

I have multiple categories (Cat1) with many count data rows categorized by sex. I'd like SAS code to sum the count data by sex for each category and test whether there is a significant difference between Male and Female for the sum of counts for each category.

 

The data look like this:

Cat1  Sex  Count
A     M    1
A     F    0.8
A     F    0.7
A     M    1
B     F    0.9
B     F    1
B     M    0.8
B     M    1
C     F    0.9
C     M    0.7
C     F    1
C     M    1

 

For Cat1 A, is F sum of Counts significantly greater than M sum of Counts? The N for each Sex by Cat1 is over 6 and up to 1200.

 

I've been using a Mann-Whitney U (MWU) test because the distributions are non-normal, but it's restricted to one variable, and I'd rather not run it separately for each category if I don't have to.
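For reference, the summing part of the question can be sketched as follows. This is only a minimal sketch: the dataset name HAVE, the output dataset SUMS, and the variable names sum_count and n_count are assumptions, and the rows are the twelve sample rows shown above. It produces descriptive sums per Cat1-by-Sex cell and does not test anything by itself.

```sas
/* Sample rows copied from the post; the dataset name HAVE is an assumption */
data have;
    input cat1 $ sex $ count;
    datalines;
A M 1
A F 0.8
A F 0.7
A M 1
B F 0.9
B F 1
B M 0.8
B M 1
C F 0.9
C M 0.7
C F 1
C M 1
;
run;

/* Sum and N of COUNT for each Cat1-by-Sex combination.            */
/* NWAY keeps only the full Cat1*Sex cells in the output dataset.  */
/* SUMS, sum_count and n_count are hypothetical names.             */
proc means data=have sum n nway;
    class cat1 sex;
    var count;
    output out=sums sum=sum_count n=n_count;
run;
```

Because the group sizes differ between M and F within a category, the raw sums are not directly comparable, which is one reason the replies compare distributions within each category instead.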

6 REPLIES
PaigeMiller
Diamond | Level 26

I believe this would be a job for PROC TTEST:

 

proc ttest data=have;
    var count;
    class cat1;
run;

Of course, this assumes the variable count is normally distributed, and it probably isn't, but you haven't told us anything about variable COUNT. So, what is the distribution of variable count, and how many data points do you really have?

--
Paige Miller
canolden
Calcite | Level 5
I have over 6000 data lines. This is a clip of the top of one table.

Cat1    N    Sum of Scores  Expected Under H0  Std Dev Under H0  Mean Score
Dp16M   70   327670.00      230895.00          15037.6074        4681.00000
Do15F   89   212015.00      293566.50          16931.3502        2382.19101
Di9F    254  559325.50      837819.00          28238.1450        2202.06890
Dq17M   69   322989.00      227596.50          14930.9533        4681.00000
Dm13F   152  282112.00      501372.00          22019.4135        1856.00000
Dh8M    237  1109397.00     781744.50          27313.3376        4681.00000
PaigeMiller
Diamond | Level 26

So perhaps I didn't word my question properly.

 

How many observations (on average) do you have for each value of CAT1, and how many of those are male and how many of those are female?

--
Paige Miller
canolden
Calcite | Level 5
That is highly variable. There are 48 levels for the class variable, which t-test doesn't handle. The number of male counts and female counts is different for each level, varying from above 6 counts to hundreds for each sex within level.
PaigeMiller
Diamond | Level 26

@canolden wrote:
That is highly variable. There are 48 levels for the class variable, which t-test doesn't handle. The number of male counts and female counts is different for each level, varying from above 6 counts to hundreds for each sex within level.

My mistake, the code ought to look like this:

 

proc ttest data=have;
    by cat1;
    class sex;
    var count;
run;

but probably @PGStats has a better solution.
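One practical note on the BY statement: BY-group processing expects the input to be sorted (or indexed) by the BY variable, so a sort step is usually needed first. A minimal sketch, assuming the dataset is called HAVE as in the code above; HAVE_SORTED and TTEST_RESULTS are hypothetical names chosen for illustration. The ODS OUTPUT statement collects the t-test table for every level of cat1 into one dataset, which is easier to scan than 48 separate printed tables.

```sas
/* BY-group processing expects the data to be sorted by the BY variable */
proc sort data=have out=have_sorted;
    by cat1;
run;

/* Collect the t-test table for every Cat1 level into one dataset. */
/* TTEST_RESULTS is a hypothetical dataset name.                    */
ods output ttests=ttest_results;

proc ttest data=have_sorted;
    by cat1;      /* one analysis per category  */
    class sex;    /* compare the two sex groups */
    var count;
run;
```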

--
Paige Miller
PGStats
Opal | Level 21

This is how to request a Wilcoxon test for each cat1 value.

 

proc npar1way data=have wilcoxon plots=none;
    by cat1;
    class sex;
    var count;
    output out=stats wilcoxon;
run;
PG
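
To review all categories at once, the output dataset can be listed after the procedure has run. A minimal sketch, assuming the step above has created the dataset STATS and that the input was already sorted by cat1. To the best of my knowledge the two-sided p-value for the normal approximation is stored in a variable named P2_WIL, but treat that as an assumption and confirm the names with PROC CONTENTS before relying on them.

```sas
/* Assumes the PROC NPAR1WAY step above has created WORK.STATS */

/* Confirm which variables the OUTPUT statement wrote */
proc contents data=stats;
run;

/* One row per Cat1 level with the Wilcoxon statistics */
proc print data=stats noobs;
run;
```

With 48 categories tested separately, some small p-values are expected by chance alone, so it is worth keeping the number of tests in mind when reading the listing.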

