j4sanford
Calcite | Level 5

I am a novice at SAS, but I have managed to run most of my statistics without problems until now: I have encountered interaction effects in my analysis. I am using a weighted dataset and was told I need to create flag variables to stratify my data by sex and race. Can someone here show me what the code should look like? Besides the weight variable, the dataset also has a PSU variable and a stratum variable. I have no idea how to create flag variables. I am using logistic regression, so I need to know how to create the flag variables and then run a logistic regression that gives odds ratios for each stratifying variable (sex and each of the race categories). If it helps, I am using the CDC's YRBS, a free dataset available to all, in case anyone here feels inclined to try it on their own.

18 REPLIES
PaigeMiller
Diamond | Level 26

Many SAS procedures don't need "flag variables", by which I assume you mean categorical levels of a variable (not numeric levels of a variable). So if you need to perform analyses by sex and race, these are handled in SAS by categorical variables.

 

For example, if you are using PROC GLM or PROC GLIMMIX (or many other procedures), you simply put the categorical variables into a CLASS statement.

 

So, is that what you mean? And what do you mean by PSU? (Pennsylvania State University? Or something else?)

--
Paige Miller
j4sanford
Calcite | Level 5

The dataset I'm using has three survey design variables: a weight variable, a PSU (primary sampling unit), and a stratum or cluster variable. I understand what the code looks like for logistic regression, with a CLASS statement and a MODEL statement, but I can't simply stratify the dataset as one normally would because it's a weighted sample. Someone told me the only way to stratify a weighted sample is to use something called a flag variable, which I think is similar to creating dummy variables. I understand that in theory, but I am completely lost on how to actually do it in SAS.
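A flag (indicator) variable of the kind described here can be created in a DATA step. The sketch below is a minimal example under stated assumptions: the dataset name work.yrbs, the variable names sex and race, and the numeric codes 1, 2, and 3 are hypothetical and would need to be replaced with the actual YRBS names and codes.

```sas
/* Hypothetical names and codes: work.yrbs, sex, race, and the values 1/2/3
   are assumptions for illustration, not taken from the YRBS codebook. */
data work.yrbs_flags;
  set work.yrbs;

  /* A comparison evaluates to 1 (true) or 0 (false). The missing() guard
     keeps the flag missing when the source value is missing, instead of
     silently coding it as 0. */
  if not missing(sex) then female = (sex = 1);

  if not missing(race) then do;
    white    = (race = 1);
    black    = (race = 2);
    hispanic = (race = 3);
  end;
run;
```

The full dataset is kept, with one extra 0/1 column per group, so the survey design variables remain intact for every respondent.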

Reeza
Super User

Are you looking at PROC SURVEYLOGISTIC vs LOGISTIC?

SURVEYLOGISTIC is designed to deal with weights and in general, survey data.

j4sanford
Calcite | Level 5

I am doing proc surveylogistic, because of the weight

PaigeMiller
Diamond | Level 26

@j4sanford wrote:

I am doing proc surveylogistic, because of the weight


And PROC SURVEYLOGISTIC has a CLASS statement, and will handle weights from your survey properly.

 

Instead of being stingy with information, only letting us know you are using SURVEYLOGISTIC after several posts, please read my request from my previous message and describe the entire problem for us.

--
Paige Miller
PaigeMiller
Diamond | Level 26

@j4sanford wrote:

... but I can't simply stratify the dataset as one normally would because it's a weighted sample


 

And not knowing your problem in detail, I would say you can do this with a weighted sample. Many procedures have a WEIGHT statement which should take care of the weighting properly. It would seem to me that PROC LOGISTIC with a CLASS statement and a WEIGHT statement would allow you to properly analyze the data.

 

But ... we have reached the point where you need to be very very very specific about what the actual problem is, state the model and the design of the study, so we can actually understand what you are doing. We need DETAILS, DETAILS, DETAILS, emphasis on the word "DETAILS". We need a complete and thorough explanation, emphasis on "COMPLETE" and "THOROUGH".

--
Paige Miller
j4sanford
Calcite | Level 5

Alright, I guess it's best to begin by stating my variables. All of them are dichotomized categorical variables. I have 5 exposure variables plus two demographics (sex and race), so 7 predictors in total, and 1 outcome variable. I've run through the typical steps: PROC CORR, chi-square tests, and PROC SURVEYLOGISTIC for each exposure independently on the outcome. I then ran PROC SURVEYLOGISTIC with the entire model (all five exposures plus demographics on my outcome). My code for the full model looks like this:

 

proc surveylogistic;
  strata STRATUM;
  cluster PSU;
  weight WEIGHT;
  class (I list all of my exposure variables) / param=reference;
  model outcome = (exposure variables);
run;

 

But my problem isn't getting the odds ratios for the entire model. I then checked for interaction effects and found that they exist for two demographic variables: sex and race. Normally, to tease out the interaction odds ratios, one would use an IF-THEN statement, correct? But I was told that with a weighted dataset one cannot do this and must instead use a flag variable to create dummy variables, or use LSMESTIMATE, to get odds ratios for each exposure variable on the outcome stratified by sex, and then stratified by race. I am just barely getting the hang of coding in SAS for all the other tests, and I am not sure how to create dummy variables, let alone how to do it with a weighted dataset. So my question is: now that I have found an interaction, how do I re-run my logistic regression so that I get an odds ratio for each exposure variable on my outcome for females, for males, for whites, for African Americans, for Hispanics, and so on?

 

PaigeMiller
Diamond | Level 26

Interactions between sex and race? Or interactions between exposure and sex, and other interactions between exposure and race? Which?

 

In either case, the MODEL statement in PROC SURVEYLOGISTIC can handle interactions, for example, if you want interaction between exposure variable #1 and sex, you add the term EXPOSURE1*SEX into the model (where EXPOSURE1 should be replaced by the actual name of the exposure variable).
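The interaction term described above could look like the following. This is a sketch, not the poster's actual program: work.yrbs, exposure1, sex, race, outcome, and the event level '1' are placeholder names and codes, while STRATUM, PSU, and WEIGHT are taken from the code posted earlier in the thread.

```sas
/* Placeholder names: work.yrbs, exposure1, sex, race, outcome and the
   event level '1' are assumptions for illustration. */
proc surveylogistic data=work.yrbs;
  strata stratum;    /* design strata */
  cluster psu;       /* primary sampling units */
  weight weight;     /* sampling weight */
  class exposure1 sex race / param=reference;
  /* exposure1*sex adds the exposure-by-sex interaction to the main effects */
  model outcome(event='1') = exposure1 sex race exposure1*sex;
run;
```

With an interaction in the model, the main-effect coefficient for exposure1 no longer describes a single overall odds ratio; it applies only at the reference level of sex.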

--
Paige Miller
j4sanford
Calcite | Level 5

Apologies, one interaction was found between one of my exposure variables and sex, and the other was found between another exposure variable and race.

 

I already ran the interaction and it's identified, but then how do I get my odds ratios stratified by the interaction variables?

Reeza
Super User

Have you tried the OddsRatio Statement?

j4sanford
Calcite | Level 5

I don't think so. What does the code for that look like?
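One possible shape for this, offered as a hedged sketch: ODDSRATIO is documented as a PROC LOGISTIC statement and may not be accepted by PROC SURVEYLOGISTIC, so the example uses the SLICE statement instead, which can report the exposure odds ratio within each level of sex. It assumes a SAS/STAT release in which PROC SURVEYLOGISTIC supports SLICE, and it uses PARAM=GLM because the LS-means based statements require that parameterization. The names work.yrbs, exposure1, sex, race, outcome, and the event level '1' are hypothetical.

```sas
/* Sketch only: work.yrbs, exposure1, sex, race, outcome and event='1'
   are hypothetical placeholders. */
proc surveylogistic data=work.yrbs;
  strata stratum;
  cluster psu;
  weight weight;
  /* PARAM=GLM is needed for LS-means based statements such as SLICE */
  class exposure1 sex race / param=glm;
  model outcome(event='1') = exposure1 sex race exposure1*sex;
  /* Compare the levels of exposure1 separately within each level of sex
     and report the differences as odds ratios with confidence limits */
  slice exposure1*sex / sliceby=sex diff oddsratio cl;
run;
```

The same pattern would apply to the second interaction by adding the other exposure-by-race term to the MODEL statement and a second SLICE statement with SLICEBY=race.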

ballardw
Super User

@j4sanford wrote:

Apologies, one interaction was found between one of my exposure variables and sex, and the other was found between another exposure variable and race.

 

I already ran the interaction and it's identified, but then how do I get my odds ratios stratified by the interaction variables?


You are unlikely to find an "automagic" way if that is what you are looking for. The * is used to indicate/request interactions.

 

model a = var1 var2 var3 var1*var3 var2*var3 ;

would request the interactions of var1 and var2 with var3 for example.

Nested effect requests use ()

model a = var1 var2 var2(var1); would request var2 nested within var1.

Lots of combinations are possible, such as D*E(A*B*C).

j4sanford
Calcite | Level 5

Yeah, I don't think you are understanding my problem. I have identified where the interaction lies, so I don't need to test for interactions anymore. What is the next step once an interaction is found? The data must be stratified by the variable involved in the interaction to tease out the difference. The interaction indicates that a confound exists, or that the odds ratio differs depending on the level of the dichotomized variable.
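For stratified results from survey data, subsetting with WHERE or an IF-THEN filter removes part of the sample design and can distort the variance estimates. The DOMAIN statement in PROC SURVEYLOGISTIC is the usual alternative: it fits the model within each subgroup while still using the full design. The sketch below is an illustration under assumptions: work.yrbs, exposure1, sex, race, outcome, and the event level '1' are hypothetical names and codes.

```sas
/* Hypothetical names: work.yrbs, exposure1, sex, race, outcome, event='1'. */
proc surveylogistic data=work.yrbs;
  strata stratum;
  cluster psu;
  weight weight;
  /* Fit the model separately within each level of sex, keeping the
     full sample design for variance estimation */
  domain sex;
  class exposure1 race / param=reference;
  /* sex is the domain variable, so it is left out of the model */
  model outcome(event='1') = exposure1 race;
run;
```

Replacing `domain sex;` with `domain race;` (and moving race out of the model, sex into it) would give the race-specific odds ratios in the same way.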
