rob9999
Fluorite | Level 6

Hi,

 

This is a naive question, and it's definitely time for me to revisit stat101...

I collected health utility data (range 0-1) from a cohort of patients; the study has no control group. I would like to compare my data against one published study, controlling for age and sex. However, the publication reports only summary statistics (mean and standard deviation) and gives no information about the shape of its distribution (it could be normal, or perhaps beta).

Setting aside whether such a comparison is appropriate (ideally, a control group would have been recruited at the same time), what approach and which proc should I use to test whether my data differ from the published values?

 

Thank you.

 

example:

 

data sample;
  input ID age sex health_u;
cards;
1 18 1 0.75
2 22 1 0.6
3 40 2 0.88
4 50 1 0.65
5 35 2 0.9
6 51 2 0.6
7 33 1 0.8
;
run;

 

example published literature to compare:

age group "16-25" sex=1 mean_health_u=0.76 stdev=0.1;

age group "16-25" sex=2 mean_health_u=0.71 stdev=0.06;

age group "26-35" sex=1 mean_health_u=0.8 stdev=0.10;

age group "26-35" sex=2 mean_health_u=0.6 stdev=0.08;...

Accepted Solutions
ballardw
Super User

You can test one sample at a time against a hypothetical mean value with proc ttest.

 

Example:

proc ttest data=sample h0=0.76;
  where (16 le age le 25) and sex=1;
  var health_u;
run;

Each hypothesized mean needs its own run, with a WHERE clause selecting the matching age and sex group, because H0= takes a single value and I am not aware of any way to supply a different null mean for each group within a single run of the procedure.
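One possible workaround, offered as a sketch rather than a built-in feature: subtract each group's published mean from the observations and test the differences against zero with BY-group processing. Testing health_u minus the published mean against 0 is arithmetically the same one-sample t-test as testing health_u against that mean. The data set names lit, sample_grp and diffs and the variables agegrp, lit_mean and diff are hypothetical; the four means are the ones quoted in the question.

```sas
/* Hypothetical lookup table holding the published means quoted in the question */
data lit;
  input agegrp :$5. sex lit_mean;
datalines;
16-25 1 0.76
16-25 2 0.71
26-35 1 0.80
26-35 2 0.60
;
run;

/* Assign each patient to a published age band */
data sample_grp;
  set sample;
  length agegrp $5;
  if 16 <= age <= 25 then agegrp = '16-25';
  else if 26 <= age <= 35 then agegrp = '26-35';
  else delete;  /* no published value shown for other ages */
run;

proc sort data=sample_grp; by agegrp sex; run;
proc sort data=lit;        by agegrp sex; run;

/* Attach the published mean to every patient and compute the difference */
data diffs;
  merge sample_grp(in=a) lit(in=b);
  by agegrp sex;
  if a and b;
  diff = health_u - lit_mean;
run;

/* One-sample t-test of the differences against 0, one table per group */
proc ttest data=diffs h0=0;
  by agegrp sex;
  var diff;
run;
```

With only the seven example rows, most groups contain a single observation, so this is meant for the full data set. Like the separate runs, it treats each published mean as a fixed constant.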

 

If your data and the comparison sample each have at least 30 subjects in every age/sex group, the t-test should be reasonably robust to non-normality. Note, though, that the "literature" values you show do not include a sample size, so that might be a concern.
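To see whether your own groups reach that size, a cross-tabulation of counts is a quick check. This is a minimal sketch; the format name ageband is hypothetical and its bands simply mirror the published age groups shown in the question.

```sas
/* Hypothetical format; bands follow the published age groups */
proc format;
  value ageband
    16 - 25 = '16-25'
    26 - 35 = '26-35'
    other   = 'no published value shown';
run;

/* Number of patients in each age band by sex */
proc freq data=sample;
  format age ageband.;
  tables age*sex / norow nocol nopercent;
run;
```

Cells with small counts are the ones where the t-test results deserve the most caution.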

 

Have you looked at your standard deviations by the same groups?

That would be easy, at least for an eyeball comparison, with:

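proc format;
  /* Map ages to the published age bands; ages outside these ranges keep their raw values */
  value agegroup
    16 - 25 = '16 to 25'
    26 - 35 = '26 to 35'
  ;
run;

/* Mean and standard deviation of health_u for each age band and sex */
proc means data=sample mean std;
   class age sex;
   format age agegroup.;
   var health_u;
run;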

Formats create groups that most analysis, reporting and graphing procedures honor, so this should show whether your means and standard deviations are at least in the same ballpark as the published ones.

 

If you want to determine whether your data distribution is the same, you would need much more information from the literature, and that is not likely to be forthcoming.
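What you can do is look at the shape of your own data. A minimal sketch, assuming the sample data set from the question: PROC UNIVARIATE can overlay fitted normal and beta curves on a histogram of health_u. The beta fit here assumes the utilities lie strictly between 0 and 1 (theta=0, sigma=1); observed values of exactly 0 or 1 would need separate handling.

```sas
/* Describe the cohort's own distribution; this says nothing about the published sample */
proc univariate data=sample;
  var health_u;
  histogram health_u / normal beta(theta=0 sigma=1);
run;
```

This only describes your cohort and cannot establish that the published sample had the same shape.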
