Statistical Procedures

Test significant difference in mean across values of categorical variable using summary dataset

annisann · Posted 08-18-2024 06:12 PM

/*
I have a dataset that contains summarized data on a group of people who received
healthcare services. The data is summarized by several characteristics of the people,
such as race, gender, marital status, etc. In my example below, I have created a
fake dataset of the summary race data.

Race is a character variable with different values of race (race A, race B, etc.).
Population_total is the number of individuals (n) in the particular race category.
Service_min is the number of healthcare service minutes summed across all individuals in the race category.
Min_per_pop is the average number of service minutes per individual in the race category:
    min_per_pop = service_min / population_total

What is the best way to determine whether there is a significant difference in service
minutes across the categories of race, using this summary data?
*/

data summarydata;
   length race $6 population_total 8 service_min 8 min_per_pop 8;
   /* INFILE placed ahead of INPUT, the conventional order */
   infile datalines dsd dlm='|';
   input race $ population_total service_min min_per_pop;
   datalines;
race_A|42188|94961594|2250.9148
race_B|13820|32049662|2319.0783
race_C|7062|9109865|1289.9837
race_D|350|516013|1474.3229
;
run;

/* I have tried a one-way ANOVA using both proc glm and proc anova per the following code,
but they do not return any p-value or significance test results. The F value and p-value are blank.
*/

proc glm data=summarydata;
class race;
model min_per_pop = race;
run;
quit;

proc anova data=summarydata;
class race;
model min_per_pop = race;
run;
quit;

/* I have also tried proc logistic, using the counts instead, but it creates this error:

ERROR: No valid observations due either to missing values in the response,
explanatory, frequency, or weight variable, or to nonpositive frequency or
weight values.
*/

proc logistic data=summarydata;
class race;
model service_min/population_total =race;
run;

/* What am I doing wrong? Or what is a better way to test for significant differences?

I also know there is a macro %SUM_GLM that can be used for a one-way ANOVA on summary
data, but it requires the standard deviation, which I do not have. I only have the 3 numeric measures above.
*/

PaigeMiller · Posted 08-18-2024 06:20 PM

If you don't have a standard deviation of the minutes (or the raw data), then you cannot perform a statistical test.

--
Paige Miller
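
For readers wondering why the standard deviation matters here, the textbook one-way ANOVA F statistic can be written in terms of group summaries. This is a sketch using standard formulas, not something taken from the thread; the symbols are assumptions for illustration: k groups, group sizes n_i, group means \bar{x}_i, group standard deviations s_i, and N the total number of individuals.

```latex
\begin{aligned}
\bar{x} &= \frac{1}{N}\sum_{i=1}^{k} n_i \bar{x}_i, \qquad N = \sum_{i=1}^{k} n_i \\
SS_{\text{between}} &= \sum_{i=1}^{k} n_i \left(\bar{x}_i - \bar{x}\right)^2 \\
SS_{\text{within}} &= \sum_{i=1}^{k} (n_i - 1)\, s_i^2 \\
F &= \frac{SS_{\text{between}} / (k - 1)}{SS_{\text{within}} / (N - k)}
\end{aligned}
```

Population_total and min_per_pop supply n_i and \bar{x}_i, so the between-group term can be computed, but s_i is not in the dataset, so the within-group term (and therefore F) cannot. Treating the four summary rows as four observations gives N = k = 4 and zero error degrees of freedom, which is consistent with the blank F value and p-value reported in the question.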

annisann · Posted 08-25-2024 05:51 PM

Thank you!

Ksharp · Posted 08-18-2024 09:10 PM

You need the MEANS statement of PROC GLM to do the ANOVA h-test, and the LSMEANS statement to "Test significant difference in mean across values of categorical variable".

I also noticed that there is only one observation per race in your dataset, so you need to include the Population_total variable in PROC GLM via the FREQ statement.

proc glm data=have ;
class race;
model min_per_pop = race;
means race / hovtest=levene(type=abs) tukey;
freq Population_total ;
quit;

And with the LSMEANS statement added:

data have;
   length race $6 population_total 8 service_min 8 min_per_pop 8;
   /* INFILE placed ahead of INPUT, the conventional order */
   infile datalines dsd dlm='|';
   input race $ population_total service_min min_per_pop;
   datalines;
race_A|42188|94961594|2250.9148
race_B|13820|32049662|2319.0783
race_C|7062|9109865|1289.9837
race_D|350|516013|1474.3229
;
run;
proc glm data=have ;
class race;
model min_per_pop  = race;
means race / hovtest=levene(type=abs) tukey;
lsmeans race/adjust=tukey;
freq Population_total ;
quit;
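
One caveat worth checking before relying on the FREQ approach: the FREQ statement treats each row as if it appeared Population_total times, so every individual within a race carries the same min_per_pop value and the within-group variation is zero by construction. The resulting test would therefore not reflect the real spread of service minutes. A minimal sketch (not from the thread; the output column names ss_between, df_between, and grand_mean are made up for illustration) of what the summary rows do support, namely the between-group sum of squares around the population-weighted grand mean:

```sas
/* Sketch only: between-group sum of squares from the summary rows in HAVE. */
/* The within-group sum of squares cannot be formed without per-race SDs.   */
proc sql;
   select sum(h.population_total * (h.min_per_pop - g.grand_mean)**2) as ss_between,
          count(*) - 1 as df_between
   from have as h,
        /* population-weighted grand mean: total minutes / total people */
        (select sum(service_min) / sum(population_total) as grand_mean
         from have) as g;
quit;
```

Only the numerator of the F ratio comes out of this; the denominator still needs a standard deviation (or the raw data) for each race.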
