This is probably more of a statistics question than a programming one, but hopefully that's ok. I use SAS Enterprise Guide 7.15 and am looking for the best method(s) for conducting two hypothesis tests on the distribution of a categorical variable in a population over time.
Using a simulated data set (named SIM, provided at bottom), my first question is: Is the distribution of the categorical variable “stable” over the first 12 time points? Is there a way to answer that question statistically?
Though I’m using the same simulated data set for the second question, it can be considered independently from the one above.
Let's say in the study from which this data set was drawn, there was a change applied on Jan. 1, 2022 - after 12 time points/halfway through the time series.
The question is: In what way(s), if any, is the distribution of the categorical variable different, or changing over time, in 2022?
By eye, we might guess that the distribution of the four values is stable in the pre-intervention period but changes over time in the post-intervention period: perhaps A and/or B decrease and C and/or D increase (as proportions of the total).
For this, I am not enthusiastic about a single chi-square test of homogeneity in which I aggregate 2021 and 2022 and analyze the result as a 2x4 contingency table. I have it in my head that an interrupted time series analysis could yield what we want, but I’m unsure, because I’m most interested in detecting or describing a change in the distribution as a whole rather than a change in one individual category alone.
data SIM;
input Month $ CategoricalVar $ Frequency;
datalines;
2021-01 A 152
2021-01 B 289
2021-01 C 193
2021-01 D 103
2021-02 A 145
2021-02 B 250
2021-02 C 193
2021-02 D 101
2021-03 A 178
2021-03 B 312
2021-03 C 248
2021-03 D 117
2021-04 A 174
2021-04 B 309
2021-04 C 238
2021-04 D 135
2021-05 A 184
2021-05 B 339
2021-05 C 234
2021-05 D 116
2021-06 A 180
2021-06 B 340
2021-06 C 241
2021-06 D 113
2021-07 A 203
2021-07 B 370
2021-07 C 241
2021-07 D 109
2021-08 A 185
2021-08 B 345
2021-08 C 252
2021-08 D 134
2021-09 A 198
2021-09 B 333
2021-09 C 252
2021-09 D 130
2021-10 A 207
2021-10 B 378
2021-10 C 233
2021-10 D 127
2021-11 A 168
2021-11 B 298
2021-11 C 223
2021-11 D 127
2021-12 A 172
2021-12 B 308
2021-12 C 260
2021-12 D 127
2022-01 A 122
2022-01 B 290
2022-01 C 247
2022-01 D 144
2022-02 A 151
2022-02 B 287
2022-02 C 218
2022-02 D 107
2022-03 A 170
2022-03 B 316
2022-03 C 276
2022-03 D 162
2022-04 A 150
2022-04 B 325
2022-04 C 277
2022-04 D 119
2022-05 A 148
2022-05 B 289
2022-05 C 287
2022-05 D 134
2022-06 A 148
2022-06 B 238
2022-06 C 252
2022-06 D 154
2022-07 A 130
2022-07 B 258
2022-07 C 241
2022-07 D 153
2022-08 A 135
2022-08 B 235
2022-08 C 300
2022-08 D 140
2022-09 A 152
2022-09 B 229
2022-09 C 280
2022-09 D 172
2022-10 A 154
2022-10 B 330
2022-10 C 315
2022-10 D 187
2022-11 A 130
2022-11 B 278
2022-11 C 312
2022-11 D 179
2022-12 A 135
2022-12 B 267
2022-12 C 299
2022-12 D 175
;
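For reference, the aggregated 2x4 chi-square test of homogeneity mentioned above might look roughly like the sketch below. It assumes the SIM data set as defined here; the data set name SIM_PERIOD and the variable Period are made-up names for illustration only, and this is the approach I would like to improve on, not a proposed solution.

```sas
/* Sketch only: SIM_PERIOD and Period are hypothetical names.          */
/* Derive a two-level period variable from the first four characters   */
/* of Month, giving "2021" (pre-change) or "2022" (post-change).       */
data SIM_PERIOD;
    set SIM;
    length Period $ 4;
    Period = substr(Month, 1, 4);
run;

/* 2x4 contingency table of Period by CategoricalVar.                  */
/* WEIGHT tells PROC FREQ that each row is a cell count, not one       */
/* observation; CHISQ requests the chi-square tests.                   */
proc freq data=SIM_PERIOD;
    weight Frequency;
    tables Period * CategoricalVar / chisq nocol nopercent;
run;
```

Because the 12 months within each year are pooled into one row, this test cannot say anything about change over time within 2021 or within 2022, which is part of why I am not enthusiastic about it.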