SAS93
Quartz | Level 8

I'm working with complex survey data to get annual estimates for a particular dichotomous variable; I want to estimate if there's a trend in prevalence over time.

 

I'm planning to:

  1. merge each year's dataset together
  2. create a categorical variable for all the years (my x variable)
  3. run logistic regression using the dichotomous variable as y.
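
Steps 1 and 2 above could be sketched roughly as follows. This is only a sketch under assumptions: the dataset names survey2013 through survey2015 are hypothetical, the yearly files are assumed to share variable names, and "merge" is read here as stacking the files with SET rather than a BY-merge. If each file already carries a year variable, the derivation line is unnecessary.

```sas
/* Hypothetical input names: survey2013, survey2014, survey2015 */
data trends;
  /* Stack the yearly files; INDSNAME= records which file each row came from */
  set survey2013 survey2014 survey2015 indsname=src;
  /* src looks like WORK.SURVEY2013; take the part after the dot,
     then the four digits that follow the 6-character prefix "SURVEY" */
  year = input(substr(scan(src, 2, '.'), 7), 4.);
run;
```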

Right now, I don't plan on using any extra control variables like race or age--I'll add those in later if I need to. 

 

I'm still fairly new to certain parts of working with complex survey data. Is there anything in particular I need to be sure to add into my code? 

 

Currently, it looks like this:

 

PROC SURVEYREG Data=trends Total=TOTALS NOMCAR;          
 Weight weightvar;
 Strata stratavar;
   Class year (reference='2013');                    *2013 = beginning;
   Model y (ref='0') = year / vadjust=none;      *Modeling probability for 'yes' outcome;
   lsmeans year / diff;
 Format year year.;
 Run;

6 REPLIES
SteveDenham
Jade | Level 19

The code you have will let you know if any of the years differ.  To get at trends, you will need to implement either a CONTRAST statement or an LSMESTIMATE statement.  I don't know how many years are in the model, but assume that there are data for 2013, 2014, 2015, 2016, 2017, 2018 and 2019 (7 years).

 

Using LSMESTIMATE:

 

lsmestimate year 'Linear time' -3 -2 -1 0 1 2 3;

/* if you wanted to test both a linear and a quadratic trend, it would be:
lsmestimate year 'Linear time' -3 -2 -1 0 1 2 3,
                 'Quadratic time' 5 0 -3 -4 -3 0 5;
*/

Alternatively, you could fit time as a continuous variable by deleting the CLASS year statement.

 

SteveDenham

 

ChuksManuel
Pyrite | Level 9
Hi Steve,

This thread was helpful. I completely understand your explanation in this comment. I am implementing a similar trend analysis, but with 12 annual cycles from 2008 to 2019.
Would the contrast be something like this?
lsmestimate year 'Linear time' -5 -4 -3 -2 -1 0 1 2 3 4 5 6;
What would the quadratic time be? I would appreciate a response.
SAS_Rob
SAS Employee

The best way to come up with the coefficients for the orthogonal polynomials is to use the ORPOL function in IML.

SAS Help Center: ORPOL Function
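
As a minimal sketch of that suggestion, assuming 12 equally spaced years (2008 to 2019, as in the question above): ORPOL returns orthonormal columns, so the rescaling below is only there to make them easier to read as integer-like contrast coefficients. The variable names are illustrative, and the printed values should be read as proportional to the contrast coefficients (scale and sign may differ from hand-derived sets).

```sas
proc iml;
  x = T(1:12);                          /* 12 equally spaced time points, as a column */
  P = orpol(x, 2);                      /* columns: constant, linear, quadratic */
  lin  = P[, 2] / min(abs(P[, 2]));     /* rescale so the smallest magnitude is 1 */
  quad = P[, 3] / min(abs(P[, 3]));
  print (T(lin))[label='Linear'], (T(quad))[label='Quadratic'];
quit;
```

For unequally spaced years, passing the actual year values as x instead of 1:12 is the usual approach.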

 

 

 

 

SteveDenham
Jade | Level 19

Some key things to note for trend tests using orthogonal polynomials. The coefficients must sum to zero, so for a linear contrast across 12 levels the following would work: -11 -9 -7 -5 -3 -1 1 3 5 7 9 11. (The set proposed above, -5 through 6, sums to 6, so it is not a valid contrast.) If you are interested in the value of the trend, use the divisor=2 option with these coefficients.

The quadratic form is relatively easy to calculate by hand: square each linear coefficient, take the average of the squares, subtract that average from each squared value, and reduce to lowest terms. So, 121 + 81 + 49 + 25 + 9 + 1 + 1 + 9 + 25 + 49 + 81 + 121 = 572, and the average is 572/12 = 47.66667 (= 47 2/3), which gives:

73 1/3  33 1/3  1 1/3  -22 2/3  -38 2/3  -46 2/3  -46 2/3  -38 2/3  -22 2/3  1 1/3  33 1/3  73 1/3

Multiplying by 3 to clear the fractions gives 220 100 4 -68 -116 -140 -140 -116 -68 4 100 220, to be used with the divisor=3 option. These could all be divided by 4 to get to lowest terms (55 25 1 -17 -29 -35 -35 -29 -17 1 25 55), but if you are interested in the actual value, I suggest stopping here. You can check these against the results from the ORPOL function in IML that @SAS_Rob mentioned.
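
The hand calculation above can be reproduced in a few lines of IML. This is only an illustrative sketch (the matrix names are made up), and the commented-out LSMESTIMATE statements at the end assume the 12 CLASS levels are ordered 2008 through 2019.

```sas
proc iml;
  lin  = do(-11, 11, 2);        /* linear coefficients: -11 -9 ... 9 11 */
  sq   = lin##2;                /* square each entry */
  quad = 3 # (sq - sq[:]);      /* subtract the mean of the squares, then multiply by 3 */
  quad = round(quad);           /* remove floating-point residue from the thirds */
  checks = sum(lin) || sum(quad) || sum(lin # quad);  /* sums and cross-product, each should be 0 */
  print lin, quad, checks;
quit;

/* Inside the modeling procedure, the two contrasts could then be written as:
lsmestimate year 'Linear time'    -11 -9 -7 -5 -3 -1 1 3 5 7 9 11 / divisor=2;
lsmestimate year 'Quadratic time' 220 100 4 -68 -116 -140 -140 -116 -68 4 100 220 / divisor=3;
*/
```

The third check (the cross-product of the two coefficient sets) confirms that the linear and quadratic contrasts are orthogonal to each other.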

 

SteveDenham

StatDave
SAS Super FREQ
I assume, and hope, that you are really using PROC SURVEYLOGISTIC and not PROC SURVEYREG which assumes that the response is normally distributed.
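
For reference, a sketch of how the original code might look in PROC SURVEYLOGISTIC with the 7-year trend contrasts from earlier in the thread. The dataset and variable names are taken from the original post; event='1' assumes y is coded 0/1 with 1 meaning "yes", and the coefficient order assumes the levels of year sort from 2013 to 2019 (check the Class Level Information table). The estimates are on the logit scale.

```sas
proc surveylogistic data=trends total=totals nomcar;
  weight weightvar;
  strata stratavar;
  class year / param=glm;                 /* GLM coding is needed for LSMEANS-type statements */
  model y(event='1') = year / vadjust=none;
  lsmestimate year 'Linear time'    -3 -2 -1  0  1  2  3,
                   'Quadratic time'  5  0 -3 -4 -3  0  5;
  format year year.;
run;
```

To treat time as continuous instead, the CLASS and LSMESTIMATE statements would be dropped, so that the slope for year is the per-year change in the log odds.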
ChuksManuel
Pyrite | Level 9
Yes. I am using PROC SURVEYLOGISTIC for this.
