Aves9019
Obsidian | Level 7

Excuse my ignorance; I'm new to SAS. Does anyone have a suggestion on how to analyze the following? The data consist of two data sets, one being a control and the other representing implementation. A program was implemented to potentially increase attendance in local schools, so each data set is a list of schools with the attendance rate for each school. I took a categorical approach to the data and used PROC FREQ with the WEIGHT statement to generate a chi-square test. I have a feeling my approach is incorrect.

1 ACCEPTED SOLUTION

ballardw
Super User

Speaking in a pedantic, scholarly mode, I would say that the analysis plan, including the types of tests to be conducted with the data, should have been decided upon before the data were collected.

The main thing to consider is what you are looking to compare. If it is a rate, then a t-test comparing the mean rate between the two groups is a likely choice.

Chi-square would tell you whether the distribution of responses is similar between the groups. That works much better with categories than with a rate that is almost certainly different for every school.

I would recommend structuring your data into a single data set with a variable to indicate the source, control or test. Then start with tests of normality to see whether a t-test is appropriate or another approach is needed, which is likely if the number of schools sampled is small.

Basic way to combine the data:

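data combined;
     length source $ 7;               /* long enough to hold 'Control' */
     set
          ControlData (in=incontrol)  /* incontrol is 1 for rows read from ControlData */
          TestData
     ;
     if incontrol then source='Control';
     else source='Test';
run;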

The Source variable could then be used as a grouping or class variable in many procedures.
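
As a rough sketch of that idea, the combined data set could be checked for normality and then compared by group. The variable name attendance_rate is an assumption (the thread never names the rate variable), and the Wilcoxon step is only one possible fallback if the rates look clearly non-normal.

```sas
/* Assumes combined has a numeric variable attendance_rate (hypothetical name) */

/* Normality tests and Q-Q plots within each group */
proc univariate data=combined normal;
     class source;
     var attendance_rate;
     qqplot attendance_rate / normal(mu=est sigma=est);
run;

/* Two-sample t-test of the mean attendance rate, Control vs Test */
proc ttest data=combined;
     class source;
     var attendance_rate;
run;

/* Nonparametric alternative if normality looks doubtful */
proc npar1way data=combined wilcoxon;
     class source;
     var attendance_rate;
run;
```

PROC TTEST reports both the pooled and the Satterthwaite (unequal variance) results, so the equality-of-variances test in its output is worth reading before picking one.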

I hope that the weight variable is the number of children enrolled in each school.

If the schools represent different populations, such as elementary, middle / junior high, and high school, it might be helpful to include that as a category, as you may see different results between the grades.
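
One possible way to include school level as a category is a two-way model. This is only a sketch: school_level and attendance_rate are hypothetical variable names, and PROC GLM is one option among several, not necessarily what the original plan called for.

```sas
/* Assumes combined has attendance_rate (numeric) and school_level     */
/* (e.g. 'Elementary', 'Middle', 'High'); both names are hypothetical. */
proc glm data=combined;
     class source school_level;
     /* The interaction term asks whether the Control/Test difference varies by level */
     model attendance_rate = source school_level source*school_level;
     /* Adjusted group means with pairwise differences and confidence limits */
     lsmeans source / pdiff cl;
run;
quit;
```

If the interaction is clearly not needed, dropping it leaves a simpler model that estimates the Control vs Test difference adjusted for school level.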

4 REPLIES

Aves9019
Obsidian | Level 7

Thank you ballardw for your response. The analysis plan was decided prior to collection but I am an intern and am detached from that process. I found myself questioning the proposed approach and will just leave it at that. Thank you again for your help!

ballardw
Super User

It is not uncommon that something in the results from the original plan raises questions. Sometimes they are additional interesting results, sometimes they are flaws in the data collection. So don't be afraid to experiment, but start with the plan.

Other ways to slice the data for comparisons could be by some category of school size (total enrollment), by urban/suburban/rural location, by a poverty index if you can get a good one (School Free and Reduced Lunch participation rates may be available), or by ethnic make-up.

Of course, slicing the data more ways simultaneously requires a larger sample, so it may not be practical.
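
A quick way to judge whether the sample supports a given slice is to count schools per cell before testing anything. The sketch below bins schools by size; the variable names enrollment and attendance_rate and the cutoffs of 300 and 800 are all assumptions made for illustration.

```sas
/* Hypothetical size categories; choose cutoffs that suit your schools */
proc format;
     value sizefmt
          low -< 300  = 'Small'
          300 -< 800  = 'Medium'
          800 - high  = 'Large';
run;

/* Count of schools in each source-by-size cell */
proc freq data=combined;
     tables source * enrollment / norow nocol nopercent;
     format enrollment sizefmt.;
run;

/* Mean attendance rate within each cell */
proc means data=combined n mean std;
     class source enrollment;
     var attendance_rate;
     format enrollment sizefmt.;
run;
```

Cells with only a handful of schools are a sign that the slice is too fine for a meaningful comparison.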

Aves9019
Obsidian | Level 7

Thank you again ballardw!
