I'm working with complex survey data to get annual estimates for a particular dichotomous variable; I want to estimate if there's a trend in prevalence over time.
I'm planning to:
Right now, I don't plan on using any extra control variables like race or age--I'll add those in later if I need to.
I'm still fairly new to certain parts of working with complex survey data. Is there anything in particular I need to be sure to add into my code?
Currently, it looks like this:
PROC SURVEYREG Data=trends Total=TOTALS NOMCAR;
Weight weightvar;
Strata stratavar;
Class year (reference='2013'); *2013 = beginning;
Model y (ref='0') = year / vadjust=none; *Modeling probability for 'yes' outcome;
lsmeans year / diff;
Format year year.;
Run;
The code you have will let you know if any of the years differ. To get at trends, you will need to implement either a CONTRAST statement or an LSMESTIMATE statement. I don't know how many years are in the model, but assume that there are data for 2013, 2014, 2015, 2016, 2017, 2018 and 2019 (7 years).
Using LSMESTIMATE:
lsmestimate year 'Linear time' -3 -2 -1 0 1 2 3;
/* If you also wanted to test both a linear and a quadratic trend, it would be:
lsmestimate year 'Linear time' -3 -2 -1 0 1 2 3,
'Quadratic time' 5 0 -3 -4 -3 0 5;
*/
Alternatively, you could fit time as a continuous variable by deleting the CLASS year statement.
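As a rough sketch of that continuous-time alternative, the call below reuses the data set and variable names from the question (trends, TOTALS, weightvar, stratavar, y, year). It assumes that y is stored as a numeric 0/1 variable and that year is numeric, neither of which is confirmed in the thread. With no CLASS statement, the coefficient on year is a single linear slope, here the estimated change in the proportion of 'yes' per one-unit increase in year.

```sas
/* Sketch only: assumes y is numeric 0/1 and year is numeric. */
proc surveyreg data=trends total=TOTALS nomcar;
   weight weightvar;
   strata stratavar;
   /* No CLASS statement, so year enters as a continuous covariate */
   model y = year / solution vadjust=none;
run;
```

The test of the year coefficient in the parameter estimates table is then the test for a linear trend.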
SteveDenham
The best way to come up with the coefficients for the orthogonal polynomials is to use the ORPOL function in IML.
SAS Help Center: ORPOL Function
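A minimal sketch of that approach, assuming seven equally spaced years as in the example above (the matrix names t, P, lin and quad are only illustrative). ORPOL returns columns scaled to unit length, so the sketch rescales the linear and quadratic columns by the square root of the sum of squared integer coefficients so that they can be compared with the hand-entered values.

```sas
proc iml;
   t = T(1:7);              /* seven equally spaced time points */
   P = orpol(t, 2);         /* columns: degree 0, 1 and 2 polynomials */

   /* Rescale the unit-length columns for comparison with integer contrasts. */
   /* 28 = sum of squares of -3 -2 -1 0 1 2 3                                */
   /* 84 = sum of squares of  5  0 -3 -4 -3 0 5                              */
   lin  = sqrt(28) # P[,2];
   quad = sqrt(84) # P[,3];

   print P, lin quad;
quit;
```

The rescaled columns should be proportional to the integer coefficients used in the LSMESTIMATE statement, although the sign of a column may differ from the hand-written version.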
Some key things to note for trend tests using orthogonal polynomials. The coefficients should sum to zero, so for a linear contrast across 12 equally spaced time points the following would work:
-11 -9 -7 -5 -3 -1 1 3 5 7 9 11
If you are interested in the value of the trend, use the divisor=2 option with these coefficients.
The quadratic coefficients are relatively easy to calculate by hand: square each linear coefficient, take the average of the squares, subtract that average from each squared value, and clear the fractions. Here, 121 + 81 + 49 + 25 + 9 + 1 + 1 + 9 + 25 + 49 + 81 + 121 = 572, and the average is 572/12 = 47.66667 (= 47 2/3). Subtracting the average gives:
73 1/3, 33 1/3, 1 1/3, -22 2/3, -38 2/3, -46 2/3, -46 2/3, -38 2/3, -22 2/3, 1 1/3, 33 1/3, 73 1/3
Multiplying by 3 gives the following, to be used with the divisor=3 option:
220 100 4 -68 -116 -140 -140 -116 -68 4 100 220
These could all be divided by 4 to get to lowest terms (55 25 1 -17 -29 -35 -35 -29 -17 1 25 55), but if you are interested in the actual value, I suggest stopping here. You can check these against the results from the ORPOL function in IML that @SAS_Rob mentioned.
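The hand calculation above can be reproduced with a few lines of IML, which is a convenient way to catch arithmetic slips. This is only a sketch; the matrix names lin, sq and quad are illustrative.

```sas
proc iml;
   lin  = T(do(-11, 11, 2));     /* -11 -9 ... 9 11 (12 time points) */
   sq   = lin##2;                /* square each linear coefficient   */
   quad = 3 # (sq - sq[:]);      /* subtract the mean, then multiply by 3 */
   print lin quad;
quit;
```

The resulting coefficients could then be entered in separate statements inside the survey procedure, for example `lsmestimate year 'Linear time' -11 -9 -7 -5 -3 -1 1 3 5 7 9 11 / divisor=2;`, with a matching statement for the quadratic coefficients and divisor=3.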
SteveDenham