## How to compute dichotomous variables from condition

Solved
Occasional Contributor
Posts: 16

# How to compute dichotomous variables from condition

Hello everyone!

I've been struggling for a few days to solve the this task.

The setting/study

- Observational data. Patients with Crohns Disease. Data was collected annually during 2002–2013 in a large region in Sweden.

- Patients can be included any year and visits may be irregular on a annual basis (some patients might visit the clinic every year, other come only one time during the study period).

- I know the exact day of death for each patient. VARIABLE: DEATH_YEAR

- I know the exact day of relapse (one of the endpoints of interest). VARIABLE: RELAPSE_YEAR

I am interested in the incidence of relapse and I need to calculate the number of relapses each year divided by the number of individuals alive that year. Now the problem is that from inclusion, individuals come irregularly, but I do know if they are actually alive that year and if they have experienced a relapse.

I could solve this (if I only could...) if I could create 12 new variables for each patient. Each new variable should be the calendar year and this variable should be set to '1' if the patient is alive that year.

Thus the problem is that i need to create a 'year-variables' that are set to '1' for each year at inclusion and thereafter, given that the person is not dead, or has experienced the event.

An example:

Patient X was included 2005 and died 2009. For him I would need he following variables: '2005', '2006', '2007', '2008' and '2009' set to '1'.

Patient Y was included 2005 and experienced event 2007. For him I would need the following variables: '2005', '2006', 2007' set to '1'. (Yes, year of event/death need still be set to '1').

I would be extremely grateful for any advice on this!

Accepted Solutions
Solution
‎06-28-2014 02:28 PM
Super User
Posts: 23,773

## Re: How to compute dichotomous variables from condition

The code requires some minor modifications.

1. Change the start year to your start year variable, first_visit

2. Remove the initializing of the variable, so it isn't set to 0 by default.

3. Calculate the end of the loop/indicator by taking the minimum of 2013, the relapse year and death year.

data want;

set have;

array year_indicator(2002:2013) year2002-year2013; *create an array that uses the index 2002 to 2013 and is initialized to 0;

year_end=min(2012, relapse_year, death_year);

do year=first_visit to year_end;

year_indicator(year)=1;

end;

run;

All Replies
Super User
Posts: 23,773

## Re: How to compute dichotomous variables from condition

Arrays are your friend. I think this gets you closer but not 100% sure what you're after, its often easiest if you post a small sample of what your data looks like and what the output should look like.

data want;

set have;

array year_indicator(2002:2013) year2002-year2013 (0*12); *create an array that uses the index 2002 to 2013 and is initialized to 0;

do year=relapse_year to death_year;

year_indicator(year)=1;

end;

run;

Occasional Contributor
Posts: 16

## Re: How to compute dichotomous variables from condition

Thank You Reeza for Your effort.

Unfortunately it seems that I did not explain the task good enough.

I tried Your array, and also modify it in several ways, without any success but thats probably just me.

Here is an example data set with three patients.

data have;

input patient visit first_visit relapse_year death_year;

datalines;

1 2003 2003 . 2010

1 2004 2003 . 2010

1 2009 2003 . 2010

2 2002 2002 2006 .

2 2006 2002 2006 .

2 2006 2002 2006 .

2 2008 2002 2006 .

2 2012 2002 2006 .

3 2004 2004 . .

3 2008 2004 . .

3 2008 2004 . .

;

run;

The year variable, which You have called year2002-2012 should be set to '1' between the patients first visit year and the last year; the last year is either 2012 (if patient did not die, nor experienced event) or the year of death/event. The array should therefore lead to following results (computed variables are bold):

 patient visit first_visit relapse_year death_year YEAR2002 YEAR2003 YEAR2004 YEAR2005 YEAR2006 YEAR2007 YEAR2008 YEAR2009 YEAR2010 YEAR2011 YEAR2012 1 2003 2003 . 2010 . 1 1 1 1 1 1 1 1 . . 1 2004 2003 . 2010 . 1 1 1 1 1 1 1 1 . . 1 2009 2003 . 2010 . 1 1 1 1 1 1 1 1 . . 2 2002 2002 2006 . 1 1 1 1 1 . . . . . . 2 2006 2002 2006 . 1 1 1 1 1 . . . . . . 2 2006 2002 2006 . 1 1 1 1 1 . . . . . . 2 2008 2002 2006 . 1 1 1 1 1 . . . . . . 2 2012 2002 2006 . 1 1 1 1 1 . . . . . . 3 2004 2004 . . . . 1 1 1 1 1 1 1 1 1 3 2008 2004 . . . . 1 1 1 1 1 1 1 1 1 3 2008 2004 . . . . 1 1 1 1 1 1 1 1 1

Solution
‎06-28-2014 02:28 PM
Super User
Posts: 23,773

## Re: How to compute dichotomous variables from condition

The code requires some minor modifications.

1. Change the start year to your start year variable, first_visit

2. Remove the initializing of the variable, so it isn't set to 0 by default.

3. Calculate the end of the loop/indicator by taking the minimum of 2013, the relapse year and death year.

data want;

set have;

array year_indicator(2002:2013) year2002-year2013; *create an array that uses the index 2002 to 2013 and is initialized to 0;

year_end=min(2012, relapse_year, death_year);

do year=first_visit to year_end;

year_indicator(year)=1;

end;

run;

Occasional Contributor
Posts: 16

## Re: How to compute dichotomous variables from condition

That solved the problem nicely.

What a fantastic way to solve this problem; my data step (4 pages of code...) would not have impressed You.

Thanks again.

🔒 This topic is solved and locked.