/* I have a dataset that contains summarized data on a group of people who
   received healthcare services. The data is summarized by several
   characteristics of the people, such as race, gender, marital status, etc.
   In my example below, I have created a fake dataset of the summary race data.

   race             - character variable with the different values of race
                      (race A, race B, etc.)
   population_total - number of individuals (n) in the race category
   service_min      - healthcare service minutes summed across all individuals
                      in the race category
   min_per_pop      - average number of service minutes per individual in the
                      race category: min_per_pop = service_min / population_total

   What is the best way to determine whether there is a significant difference
   in service minutes across the categories of race, using this summary data? */
data summarydata;
length race $6 population_total 8 service_min 8 min_per_pop 8;
infile datalines dsd dlm='|';
input race $ population_total service_min min_per_pop;
datalines;
race_A|42188|94961594|2250.9148
race_B|13820|32049662|2319.0783
race_C|7062|9109865|1289.9837
race_D|350|516013|1474.3229
;
run;

/* I have tried a one-way ANOVA using both proc glm and proc anova with the
   following code, but neither returns a p-value or any significance test
   results. The F value and p-value are blank. */
proc glm data=summarydata;
class race;
model min_per_pop = race;
run;
quit;
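
/* A possible diagnostic, not part of my original attempt: count the rows per
   race level. If each level has exactly one row, the 4 observations are used
   up by the intercept and the 3 race effects, which leaves 0 error degrees of
   freedom. That would likely explain the blank F value and p-value in both
   proc glm above and proc anova below. */
proc freq data=summarydata;
tables race / nocum nopercent;
run;
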
proc anova data=summarydata;
class race;
model min_per_pop = race;
run;
quit;

/* I have also tried proc logistic, using the counts instead, but it produces
   this error:
   ERROR: No valid observations due either to missing values in the response,
   explanatory, frequency, or weight variable, or to nonpositive frequency or
   weight values. */
proc logistic data=summarydata;
class race;
model service_min/population_total = race;
run;

/* What am I doing wrong? Or what is a better way to test for significant
   differences? I also know there is a macro, %SUM_GLM, that can be used for a
   one-way ANOVA on summary data, but it requires the standard deviation, which
   I do not have. I only have the 3 numeric measures above. */
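
/* A possible diagnostic for the proc logistic error, offered as a sketch
   rather than a fix: as far as I understand, the events/trials syntax expects
   0 <= events <= trials, and rows that violate this are treated as invalid.
   This step writes a note to the log for every row where service_min
   (used as events) exceeds population_total (used as trials). */
data _null_;
set summarydata;
if service_min > population_total then
put 'events > trials for ' race= service_min= population_total=;
run;
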