- how to create flag variables

04-25-2017 11:48 AM

I am a novice at SAS, but I have managed to run most of my statistics without problems until now: I have encountered interaction effects in my analysis. I am using a weighted dataset and was told I need to create flag variables to stratify my data by sex and race. Can someone show me what the code should look like? The dataset also has a PSU variable and a stratum variable in addition to the weight variable. I have no idea how to create flag variables. I am using logistic regression, so I need to know how to create the flag variables and then run a logistic regression that gives odds ratios for each stratifying variable (sex, and each of the races). If it helps, I am using the CDC's YRBS, a free dataset available to all, in case anyone here feels inclined to try this on their own.

04-25-2017 12:24 PM

Many SAS procedures don't need "flag variables", by which I assume you mean categorical levels of a variable (not numeric levels of a variable). So if you need to perform analyses by sex and race, these are handled in SAS by categorical variables.

For example, if you are using PROC GLM or PROC GLIMMIX (or many other procedures), you simply put the categorical variables into a CLASS statement.

So, is that what you mean? And what do you mean by PSU? (Pennsylvania State University? Or something else?)

04-25-2017 01:31 PM

The dataset I'm using has three survey design variables: one is a weight variable, one is a PSU (primary sampling unit), and the other is a stratum or cluster. I understand what the code for logistic regression looks like, with a CLASS statement and a MODEL statement, but I can't simply stratify the dataset as one normally would because it's a weighted sample. Someone told me the only way to stratify a weighted sample is to use something called a flag variable, which I think is similar to creating dummy variables. I understand it in theory, but I am completely lost on how to actually do it in SAS.
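
For readers wondering what creating a flag (dummy) variable looks like in practice, here is a minimal DATA step sketch. The dataset names (YRBS_RAW, YRBS_FLAGS), the variable names (SEX, RACE) and the numeric codes are all assumptions made for illustration, so check the codebook of your own file before reusing any of it.

```sas
/* Sketch only: dataset names, variable names, and codes are hypothetical. */
data yrbs_flags;
    set yrbs_raw;

    /* A comparison evaluates to 1 (true) or 0 (false).              */
    /* The IF guards leave the flag missing when the source value is */
    /* missing, instead of silently coding it as 0.                  */
    if not missing(sex) then female = (sex = 1);

    if not missing(race) then do;
        white    = (race = 1);
        black    = (race = 2);
        hispanic = (race = 3);
    end;
run;
```

Flags built this way keep every record in the dataset, so the strata, PSU and weight variables stay intact for whatever survey procedure is run afterwards.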

04-25-2017 01:39 PM

Are you looking at PROC SURVEYLOGISTIC vs LOGISTIC?

SURVEYLOGISTIC is designed to deal with weights and in general, survey data.

04-25-2017 01:41 PM

I am doing proc surveylogistic, because of the weight

04-25-2017 01:46 PM - edited 04-25-2017 01:56 PM

j4sanford wrote:

I am doing proc surveylogistic, because of the weight

And PROC SURVEYLOGISTIC has a CLASS statement, and will handle weights from your survey properly.

Instead of being stingy with information, only letting us know you are using SURVEYLOGISTIC after several posts, please read my request from my previous message and describe the entire problem for us.

04-25-2017 01:44 PM

j4sanford wrote:

... but I can't simply stratify the dataset as one normally would because it's a weighted sample

And not knowing your problem in detail, I would say you can do this with a weighted sample. Many procedures have a WEIGHT statement which should take care of the weighting properly. It would seem to me that PROC LOGISTIC with a CLASS statement and a WEIGHT statement would allow you to properly analyze the data.

But ... we have reached the point where you need to be very very very specific about what the actual problem is, state the model and the design of the study, so we can actually understand what you are doing. We need DETAILS, DETAILS, DETAILS, emphasis on the word "DETAILS". We need a complete and thorough explanation, emphasis on "COMPLETE" and "THOROUGH".

04-25-2017 02:28 PM

All right, I guess it's best to begin by stating my variables. All of them are dichotomized categorical variables. I have 5 exposure variables plus two demographics (sex and race), for a total of 7, and 1 outcome variable. I've run through the typical steps: PROC CORR, chi-square tests, PROC SURVEYLOGISTIC of each exposure independently on the outcome, and then PROC SURVEYLOGISTIC with the entire model (all five exposures plus demographics on my outcome). My code for the SURVEYLOGISTIC run with the entire model looks like:

proc surveylogistic;

strata STRATUM;

cluster PSU;

weight WEIGHT;

class (I list all of my exposure variables) / param=reference;

model outcome = exposure variables;

run;

But my problem isn't getting the odds ratios of the entire model. I checked for interaction effects and found that they exist, involving two demographic variables: sex and race. Normally, to tease out the interaction odds ratios, one would use an "if-then" statement, correct? But I was told that since I am using a weighted dataset, one cannot do this and must instead use a flag variable to create dummy variables, or use LSMESTIMATE, to get odds ratios of each exposure variable on the outcome variable stratified by sex, and then stratified by race. I am just barely getting the hang of coding in SAS for all the other tests and am not sure how one creates dummy variables, let alone how to do it with a weighted dataset. So my question is: now that I have discovered an interaction, how do I re-run my logistic regression so that I get an odds ratio of each of my exposure variables on my outcome variable for females, for males, for whites, for African Americans, for Hispanics, etc.?

04-25-2017 02:59 PM - edited 04-25-2017 02:59 PM

Interactions between sex and race? or interaction between exposure and sex and other interactions between exposure and race? Which?

In either case, the MODEL statement in PROC SURVEYLOGISTIC can handle interactions, for example, if you want interaction between exposure variable #1 and sex, you add the term EXPOSURE1*SEX into the model (where EXPOSURE1 should be replaced by the actual name of the exposure variable).
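
As a rough sketch of that suggestion, the run below adds an exposure-by-sex term to a survey-weighted model. The dataset name YRBS and the variables OUTCOME, EXPOSURE1, SEX and RACE are placeholders, and the EVENT='1' response option assumes the outcome is coded 1 for the event of interest.

```sas
/* Sketch only: all dataset and variable names are hypothetical. */
proc surveylogistic data=yrbs;
    strata  stratum;   /* design strata                    */
    cluster psu;       /* primary sampling units           */
    weight  weight;    /* sampling weight variable         */
    class exposure1 sex race / param=reference;
    /* EXPOSURE1*SEX requests the exposure-by-sex interaction */
    model outcome(event='1') = exposure1 sex race exposure1*sex;
run;
```

As far as I know, the default odds ratio table leaves out effects that take part in an interaction, which is why a single odds ratio for EXPOSURE1 would not be expected in the output of this model.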

04-25-2017 03:19 PM

Apologies, one interaction was found between one of my exposure variables and sex, and the other was found between another exposure variable and race.

I already ran the interaction and it's identified, but then how do I get my odds ratios stratified by the interaction variables?

04-25-2017 03:47 PM

Have you tried the OddsRatio Statement?

04-25-2017 03:59 PM

I don't think so. What does the code for that look like?

04-25-2017 04:52 PM

j4sanford wrote:

Apologies, one interaction was found between one of my exposure variables and sex, and the other was found between another exposure variable and race.

I already ran the interaction and it's identified, but then how do I get my odds ratios stratified by the interaction variables?

You are unlikely to find an "automagic" way if that is what you are looking for. The * is used to indicate/request interactions.

model a = var1 var2 var3 var1*var3 var2*var3 ;

would request the interactions of var1 and var2 with var3 for example.

Nested effect requests use ()

model a = var1 var2 var2(var1); would request var2 nested within var1.

Lots of combinations are possible, such as D*E(A*B*C).

04-25-2017 06:09 PM

Yeah, I don't think you are understanding my problem. I have identified where the interaction lies. I don't need to test for interactions anymore, but what is the next step once an interaction is found? The data must be stratified by the variable that shows the interaction to tease out the difference. It indicates that a confound exists, or that the odds ratio differs depending on the dichotomized variable.
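
One possible way to get stratum-specific odds ratios without subsetting a weighted sample is the DOMAIN statement of PROC SURVEYLOGISTIC, which fits the model within each level of the named variable while still using the full design for variance estimation. Nobody in the thread proposed this, so treat it as a sketch: the dataset name YRBS and the variables OUTCOME, EXPOSURE1, EXPOSURE2 and SEX are hypothetical, and EVENT='1' assumes the outcome is coded 1 for the event.

```sas
/* Sketch only: all dataset and variable names are hypothetical. */
proc surveylogistic data=yrbs;
    strata  stratum;
    cluster psu;
    weight  weight;
    /* Fit the model separately within each level of SEX,  */
    /* without dropping any records from the input dataset. */
    domain sex;
    class exposure1 exposure2 / param=reference;
    model outcome(event='1') = exposure1 exposure2;
run;
```

Replacing `domain sex;` with `domain race;` would give the corresponding analysis by race. The stratifying variable is left out of the MODEL statement here because it is constant within each domain.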

04-25-2017 01:29 PM