Creating a binary variable from the presence of consecutive values in a set of other variables

Occasional Contributor
Posts: 5

Hello, I wonder if someone could help me with the following:


I have a longitudinal dataset with a binary variable that is equal to 1 when a patient has an infection at a visit and 0 otherwise. Sometimes infections persist for several visits; sometimes they last for only one visit and the patient is cured. I want to create 3 variables (X1, X2, X3) that would each be 1 or 0 according to the criteria below.


In other words, this is how it might look and what I wish to have my program do:

ID  visit1  visit2  visit3  visit4  X1  X2  X3
1   0       0       0       0       0   0   0
2   0       1       0       0       0   1   0
3   1       1       0       0       1   1   0
4   0       1       1       1       1   1   1
5   0       1       0       1       0   1   0


X1) Variable = 1 when the patient has an infection at two or more consecutive visits.

X2) Variable = 1 when the patient has an infection at one or more visits (any infection).

X3) Variable = 1 when the patient has an infection at three or more consecutive visits.

Can anyone help me? I do not know how to solve this.


Thank you

Message was edited by: Ana Maria rodriguez

Grand Advisor
Posts: 10,210

I think an example of your input and desired output might be in order.

Do you want the flag on the first record of a consecutive series, on the last, or on all records of the sequence? (This relates to goals 1 and 3.)

For goal 2 is the flag to be set against all records for the patient or a single one? If a single record which one? Or do you just need to know which patients have any infection ever?

Occasional Contributor
Posts: 5

Thank you ballardw. I have changed my question so it is clearer. Thank you

Grand Advisor
Posts: 10,210

I believe your original question said the data was in long form, with one record per visit date and a flag for the infection.

Do you really want a final transposed dataset? How many dates are in the original data? Transposing into a single record per patient has the potential to generate 100s if not 1000s of variables.

Occasional Contributor
Posts: 5

It is long form, I have just transformed it here in wide form to better illustrate what I want to do.

The number of visits per patient is 15.

Grand Advisor
Posts: 17,325

Based on your data, you can concatenate all the visit values into one string. Finding '11' in the string means that X1=1, and so forth. Simply replace the 4 in the code below with 15, or however many visit variables you have.



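data have;
    input ID visit1 visit2 visit3 visit4;
    cards;
1 0 0 0 0
2 0 1 0 0
3 1 1 0 0
4 0 1 1 1
5 0 1 0 1
;
run;

data want;
    set have;
    array visit(4) visit1-visit4;

    /* Concatenate the visit flags into one string, e.g. '0111' */
    string = cats(of visit1-visit4);

    x1 = 0; x2 = 0; x3 = 0;

    /* '11' anywhere in the string: two or more consecutive infections */
    if find(string, '11') > 0 then x1 = 1;

    /* any '1' in the string: at least one infection */
    if find(string, '1') > 0 then x2 = 1;

    /* '111' anywhere in the string: three or more consecutive infections */
    if find(string, '111') > 0 then x3 = 1;
run;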
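
Since the original data is reportedly in long form (one record per patient visit), the same three flags could also be derived without transposing, by tracking the longest run of consecutive infections within each patient. The following is only a sketch under assumptions: the dataset name have_long and the variables visit, infection, run_len, and max_run are made up for illustration, and it assumes every visit is present as a row, so that adjacent rows really are consecutive visits.

```sas
/* Hypothetical long-form input: one row per patient visit.       */
/* Dataset and variable names here are assumptions, not the OP's. */
data have_long;
    input ID visit infection @@;
    cards;
1 1 0  1 2 0  1 3 0  1 4 0
2 1 0  2 2 1  2 3 0  2 4 0
3 1 1  3 2 1  3 3 0  3 4 0
4 1 0  4 2 1  4 3 1  4 4 1
5 1 0  5 2 1  5 3 0  5 4 1
;
run;

proc sort data=have_long;
    by ID visit;
run;

data want_long(keep=ID x1 x2 x3);
    set have_long;
    by ID;
    retain run_len max_run;

    /* Reset the counters at the start of each patient */
    if first.ID then do;
        run_len = 0;
        max_run = 0;
    end;

    /* Extend the current run on an infection, otherwise start over */
    if infection = 1 then run_len = run_len + 1;
    else run_len = 0;

    /* Remember the longest run seen so far for this patient */
    max_run = max(max_run, run_len);

    /* Write one row per patient with the three flags */
    if last.ID then do;
        x1 = (max_run >= 2);
        x2 = (max_run >= 1);
        x3 = (max_run >= 3);
        output;
    end;
run;
```

With the five example patients this should give the same X1, X2, and X3 values as the table in the question, and it does not depend on how many visits each patient has.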

Grand Advisor
Posts: 9,571

Why not:

sum=sum(of visit:);

x1=ifn( sum=2,1,0 );

Respected Advisor
Posts: 3,775

Ksharp wrote:

Why not :

sum=sum(of visit:);

x1=ifn( sum=2,1,0 );

I think the '1's need to be contiguous visits. Right?
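
A small check on two of the example patients may make the difference concrete. This is only an illustrative sketch: the dataset name demo and the variables total, x1_sum, and x1_str are made up here.

```sas
/* Patients 4 and 5 from the example table */
data demo;
    input ID visit1 visit2 visit3 visit4;

    /* Sum-based flag, as suggested in the quoted post */
    total  = sum(of visit:);
    x1_sum = ifn(total = 2, 1, 0);

    /* String-based flag: requires two adjacent 1s */
    x1_str = (find(cats(of visit1-visit4), '11') > 0);
    cards;
4 0 1 1 1
5 0 1 0 1
;
run;

proc print data=demo noobs;
run;
```

Patient 5 has two infections that are not adjacent, so the sum-based flag is 1 while the string-based flag is 0. Patient 4 has three adjacent infections, so the sum is 3 and the test sum=2 gives 0 even though X1 should be 1.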

Grand Advisor
Posts: 9,571

Oh, you are right. I read it too fast!

Occasional Contributor
Posts: 5

Ksharp, does that mean that this suggestion will not work? Thanks for your input.

Respected Advisor
Posts: 3,124

I would stick with the solution offered by , since for a SAS solution it is about as slick as it gets.

Grand Advisor
Posts: 9,571

No. Actually, it is very good.

Trusted Advisor
Posts: 1,203

data want(drop=i);
    set have;

    /* visit: picks up every variable whose name starts with "visit" */
    array visit{*} visit:;

    variable1=0;
    variable2=0;
    variable3=0;

    /* At least 2 consecutive visits: compare each visit with the previous one */
    do i=2 to dim(visit);
        if visit[i-1] ne 0 and visit[i] ne 0 and
           visit[i]=visit[i-1] then variable1=1;
    end;

    /* At least 1 visit with an infection */
    do i=1 to dim(visit);
        if visit[i] > 0 then variable2=1;
    end;

    /* At least 3 consecutive visits: compare each visit with both neighbours */
    do i=2 to dim(visit)-1;
        if visit[i-1] ne 0 and visit[i] ne 0 and visit[i+1] ne 0 and
           visit[i]=visit[i-1] and visit[i]=visit[i+1] then variable3=1;
    end;

run;
