Excuse my ignorance; I'm new to SAS. Does anyone have a suggestion on how to analyze the following? The data consist of two data sets, one a control and the other representing implementation. A program was implemented to potentially increase attendance in local schools, so each data set is a list of schools and the attendance rate for each school. I took a categorical approach to the data and used PROC FREQ with the WEIGHT statement to generate a chi-square test. I have a feeling my approach is incorrect.
In a pedantic, scholarly mode, I would say that the analysis plan, including the types of tests to be conducted, should have been decided before the data were collected.
The main thing to consider is what you are looking to compare. If it is a rate, then a t-test comparing the mean rate between groups is a likely candidate.
A chi-square test would tell you whether the distribution of responses is similar between groups, which works much better with categories than with attendance rates that are almost certainly different for every school.
I would recommend structuring your data into a single data set with a variable indicating the source (control or test), and starting with tests of normality to see whether a t-test is appropriate or another approach is needed, which is likely if the number of schools sampled is small.
Basic way to combine the data:
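data combined;
  /* in= creates a temporary flag that is 1 when the row comes from ControlData */
  set ControlData (in=incontrol)
      TestData;
  /* label each row with the data set it came from */
  if incontrol then source='Control';
  else source='Test';
run;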
The Source variable could then be used as a grouping or class variable in many procedures.
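As a rough sketch of the normality check and group comparison suggested above, something like the following could be a starting point. It assumes the combined data set built above and a numeric variable holding each school's attendance rate; the name rate is hypothetical, so substitute whatever your variable is actually called.

```sas
/* Assumption: "rate" is a hypothetical name for the attendance-rate variable */

/* Normality tests (Shapiro-Wilk and others) run separately for each source */
proc univariate data=combined normal;
  class source;
  var rate;
run;

/* Two-sample t-test comparing mean rate between Control and Test */
proc ttest data=combined;
  class source;
  var rate;
run;

/* Nonparametric alternative if the rates look clearly non-normal */
proc npar1way data=combined wilcoxon;
  class source;
  var rate;
run;
```

PROC TTEST reports both the pooled and the Satterthwaite (unequal variance) results, so you can pick the one that suits what the equality-of-variances test shows.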
I hope that the weight variable is the basic number of enrolled children in the school.
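If the weight variable really is enrollment, one way to see its effect is to compare enrollment-weighted mean rates by group. This is only a sketch: the variable names rate and enrollment are hypothetical, and the combined data set is the one built above.

```sas
/* Assumptions: "rate" = attendance rate, "enrollment" = number of enrolled
   children per school; both names are hypothetical */
proc means data=combined n mean sumwgt;
  class source;
  var rate;
  weight enrollment;   /* larger schools contribute more to the mean */
run;
```

Note that a WEIGHT statement in PROC FREQ means something different: there the weight is treated as the count of observations each row represents.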
If the schools represent different populations, such as elementary, middle / junior high, and high school, it might be helpful to include that as a category, as you may see different results between the grades.
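Including school level as a category could look roughly like the sketch below. It assumes the combined data set from above plus two hypothetical variables: rate for the attendance rate and level for the school type (for example Elementary, Middle, High).

```sas
/* Assumptions: "rate" and "level" are hypothetical variable names */
proc glm data=combined;
  class source level;
  /* the source*level term checks whether the program effect differs by school type */
  model rate = source level source*level;
  /* adjusted Control vs Test means with a p-value for the difference */
  lsmeans source / pdiff;
run;
quit;
```

With a small number of schools, some source-by-level cells may hold very few observations, so the interaction term may not be estimable with any precision.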
Thank you ballardw for your response. The analysis plan was decided prior to collection but I am an intern and am detached from that process. I found myself questioning the proposed approach and will just leave it at that. Thank you again for your help!
It is not uncommon that something in the results from the original plan raises questions. Sometimes they are additional interesting results, sometimes they are flaws in the data collection. So don't be afraid to experiment, but start with the plan.
Other ways to slice the data for comparisons could be by some category of school size (total enrollment), urban/suburban/rural location, a poverty index if you can get a good one (school Free and Reduced Lunch participation rates may be available), or ethnic make-up.
Of course, slicing the data more ways simultaneously requires a larger sample, so it may not be practical.
Thank you again ballardw!