Creating indicator variables that are combinations of other indicators

New Contributor
Posts: 4

I have a set of seven dichotomous indicator variables:

Step_1

Step_2

Step_3

Step_4

Step_5

Step_6

Step_7

where each variable is set to 1 if the respondent is exposed to the Step and 0 if the respondent is not exposed to the Step.

Exposure is NOT exclusive so that a respondent can be exposed to both Step_1 and Step_2.

I want to create indicator variables for various combinations of the Steps:

(7, 2), e.g., Step_1 and Step_2; Step_1 and Step_3; Step_1 and Step_4; ...; Step_6 and Step_7

(7, 3), e.g., Step_1, Step_2, and Step_3; Step_1, Step_2, and Step_4; ...; Step_5, Step_6, and Step_7

(7, 4)

(7, 5)

(7, 6)

I am attempting to create 2 types of indicator variables:

1) the combination is exclusive, such that for each combination the data would look like:

Step_1=1  AND  Step_2=1  AND  Step_3=0  AND  Step_4=0  AND  Step_5=0  AND  Step_6=0  AND  Step_7=0

and

2) the combination is not exclusive, so that I just want to see whether the respondent is exposed to the 2 Steps of interest, regardless of whether the respondent is exposed to other Steps. For example, all that matters is that

Step_1=1 AND Step_2=1; the other Steps do not matter.

I imagine I would need to use the CALL LEXCOMB routine or something similar. However, I am unsure how to use it to generate the various combinations and create a new variable for each.

PROC Star
Posts: 7,492

Are you sure you have asked for what you really want to achieve?

It's doable, of course, whether it be with call lexcomb or, more likely, simply two nested do loops.  However, each of your seven variables can have a value of 1 or 0, thus you will need quite a number of variables.
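To put a rough number on "quite a number of variables": counting only the combination sizes listed in the question, (7, 2) through (7, 6), the arithmetic works out as follows. The doubling assumes one exclusive and one non-exclusive indicator per combination, as the question describes.

```latex
\binom{7}{2}+\binom{7}{3}+\binom{7}{4}+\binom{7}{5}+\binom{7}{6}
  = 21 + 35 + 35 + 21 + 7 = 119,
\qquad
2 \times 119 = 238 \text{ indicator variables in total.}
```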

What are you hoping to actually achieve?

New Contributor
Posts: 4

Thank you for responding art297. I appreciate your focusing questions "Are you sure you have asked for what you really want to achieve?" and "What are you hoping to actually achieve?"

Just a quick background:

Each of the Steps I've referenced above is a different hospital-based policy. Currently, hospitals only receive recognition for having all Steps in place (it is an all-or-nothing deal: if you have all Steps, you get credit; if you're missing even just 1 Step, you get 0 credit). Having all Steps in place is associated with improved health outcomes. However, having ALL Steps in place is a huge barrier. There is research suggesting that a larger number of Steps in place is associated with improved health (e.g., all Steps is better than 6 Steps, which is better than 5 Steps, which is better than 4 Steps, etc.).

Some States are starting programs where they will recognize hospitals for each additional 2 Steps they have in place. That is, a hospital will receive 1 star for having 2 Steps in place, 2 stars for having 4 Steps in place, and 3 stars for having 6 Steps in place. The problem, though, is that we don't know which combinations of 2 Steps to prioritize: are there certain combinations of 2 Steps that have a larger impact than other combinations of 2 Steps?

Our research question then is, "Which combination of 2 Steps is associated with the greatest improvement in health, which combination is associated with the 2nd greatest improvement, which with the 3rd greatest, and so forth?" This information is to be used by these State programs so they can tell hospitals which Steps to prioritize (meaning which combinations of 2 Steps give the biggest bang in terms of health improvement).

We're using a potential outcomes framework: so having two steps in place as compared with having zero steps in place. And which combination of 2 gives the greatest improvement over 0 Steps in place.

We want the other combinations so that we could then say ok, once you have these 2 Steps in place (say Step 4 and Step 7 have the biggest impact), then you should go for this next combination of 2 (say Step 1 and Step 3).

So, we think that what we want is the various combinations of 2 Steps, but we could be wrong.

PROC Star
Posts: 7,492

I would post your request on the stat forum.  I would think that, without any/much playing with the data, it can just be analyzed with one of the regression models (e.g., proc logistic).

New Contributor
Posts: 4

Unfortunately, a regression model (e.g., proc logistic) really doesn't model the potential outcomes from a causal inference framework (think of, say, a propensity score analysis versus a standard regression). We want to create a set of treatment variables:

exposed to combination1 vs. not exposed to combination1

exposed to combination2 vs. not exposed to combination2

exposed to combination3 vs. not exposed to combination3

to see which combination has the largest impact.

We can interact Step Exposure in a nested do loop to create combinations for situation (2) but we're still facing a challenge for situation (1).
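For situation (1), one possible approach is to combine the non-exclusive indicator with a check on the total number of Steps in place. The following is only a minimal sketch for a single pair (Steps 1 and 2). The data set names HAVE and WANT, the example rows, and the variable names n_steps, s12_any and s12_only are all made up for illustration.

```sas
/* Hypothetical example data: Step_1-Step_7 coded 0/1 */
data have;
   input Step_1-Step_7;
   datalines;
1 1 0 0 0 0 0
1 1 1 0 0 0 0
0 0 0 1 0 0 1
;
run;

data want;
   set have;
   /* Total number of Steps in place for this respondent */
   n_steps  = sum(of Step_1-Step_7);
   /* Situation (2): Steps 1 and 2 in place, other Steps ignored */
   s12_any  = (Step_1 = 1 and Step_2 = 1);
   /* Situation (1): Steps 1 and 2 in place and no other Step */
   s12_only = (s12_any = 1 and n_steps = 2);
run;

proc print data=want;
run;
```

With these example rows, the first respondent would be flagged by both indicators, the second only by s12_any, and the third by neither. Note that SUM ignores missing values, so rows with missing Steps may need separate handling.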

Respected Advisor
Posts: 3,799

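I'm a bit confused by all this, but this will give you indicator variables for the 7 choose 2, 3, 4, 5, and 6 combinations of the variables step1-step7 (plus the main effects). The @6 in the MODEL statement limits the crossed effects to at most six variables.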

options ls=132 ps=60 center=0;

proc plan seed=1702547292;
   factors id=30 ordered
      step1=1 of 2
      step2=1 of 2
      step3=1 of 2
      step4=1 of 2
      step5=1 of 2
      step6=1 of 2
      step7=1 of 2
      y    =1 of 1000 / noprint;
   output out=step
      step1 nvals=(0 1)
      step2 nvals=(0 1)
      step3 nvals=(0 1)
      step4 nvals=(0 1)
      step5 nvals=(0 1)
      step6 nvals=(0 1)
      step7 nvals=(0 1);
   run;

proc print;
   run;

proc glmmod namelen=200 outdesign=design outparm=parm;
   model y = step1|step2|step3|step4|step5|step6|step7@6;
   run;

proc print data=parm;
proc contents data=design varnum;
proc print data=design;
   run;

Valued Guide
Posts: 2,177

Creating a combination could be as easy as

 Combination = cats(of step_1-step_7);

You can then refer to a specific combination of exactly two steps like this:

where Combination = '1100000';

to select those successful on only steps 1 and 2.

To select those unsuccessful on exactly two steps (any two):

where compress(Combination, '1') = '00';

To select those successful on step 5 and at least one other:

where substr(Combination, 5, 1) = "1" & compress(Combination, "0") =: "11";
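As a rough illustration of how the pattern string could be put to work, the sketch below builds Combination for the seven variables from the question, counts the Steps in place, and tabulates the patterns for respondents with exactly two Steps. The data set names HAVE and PATTERNS, the example rows, and the variable n_steps are assumptions made for the example.

```sas
/* Hypothetical example data: Step_1-Step_7 coded 0/1 */
data have;
   input Step_1-Step_7;
   datalines;
1 1 0 0 0 0 0
1 1 1 0 0 0 0
0 0 0 1 0 0 1
;
run;

data patterns;
   set have;
   length Combination $7;
   /* Seven-character pattern such as 1100000 */
   Combination = cats(of Step_1-Step_7);
   /* Number of Steps in place = number of 1s in the pattern */
   n_steps = countc(Combination, '1');
run;

/* Each distinct pattern here is one exclusive pair of Steps */
proc freq data=patterns;
   where n_steps = 2;
   tables Combination;
run;
```

One design point to keep in mind: a missing numeric value is written as a period by CATS, so missing Steps would show up as '.' in the pattern rather than as 0 or 1.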

Super User
Posts: 10,044

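If I understand what you mean, you can use an array and loop over it to get the result. For example, for (7, 2) you can put something like this inside your DATA step:

array step{7}  Step_1-Step_7;
array pair{21} pair_1-pair_21;   /* one indicator per pair: 1-2, 1-3, ..., 6-7 */
k = 0;
do i = 1 to 6;
   do j = i + 1 to 7;            /* start at i+1 so a step is never paired with itself */
      k + 1;
      pair{k} = (step{i} = 1 and step{j} = 1);
   end;
end;
drop i j k;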

Ksharp
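The nested loops above handle pairs. For the larger cases, (7, 3) through (7, 6), the CALL LEXCOMB routine mentioned earlier in the thread might be used to step through the combinations instead of adding one loop per Step. The following is a sketch only, not a tested solution for your data: the macro variables k and ncomb, the data sets HAVE and WANT_K, and the variables incl_*, excl_*, idx1-idx7, n_steps and all_on are names invented for the example.

```sas
%let k     = 3;                       /* hypothetical: size of each combination */
%let ncomb = %sysfunc(comb(7, &k));   /* number of combinations of 7 taken k    */

/* Hypothetical example data: Step_1-Step_7 coded 0/1 */
data have;
   input Step_1-Step_7;
   datalines;
1 1 1 0 0 0 0
1 1 1 1 0 0 0
0 0 0 1 0 0 1
;
run;

data want_k;
   set have;
   array step{7}      Step_1-Step_7;
   array idx{7}       idx1-idx7;      /* Step numbers, reordered by CALL LEXCOMB */
   array incl{&ncomb} incl_1-incl_&ncomb;   /* non-exclusive indicators */
   array excl{&ncomb} excl_1-excl_&ncomb;   /* exclusive indicators     */

   n_steps = sum(of Step_1-Step_7);

   /* Reset the index array to 1..7 for every observation */
   do m = 1 to 7;
      idx{m} = m;
   end;

   do c = 1 to &ncomb;
      /* After the call, idx1-idx&k hold the Step numbers of combination c */
      call lexcomb(c, &k, of idx[*]);
      all_on = 1;
      do m = 1 to &k;
         if step{idx{m}} ne 1 then all_on = 0;
      end;
      incl{c} = all_on;                          /* other Steps ignored   */
      excl{c} = (all_on = 1 and n_steps = &k);   /* and no other Steps on */
   end;
   drop m c all_on idx1-idx7;
run;
```

Because the indicators are numbered rather than named after their Steps, it would probably be worth writing out a lookup of which Steps belong to each combination number before using them in an analysis.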
