Contributor
Posts: 71

# Calculating onset of an experience, or the last experience, or duration (coding question)

Hello.

Suppose I have a data set that looks something like

ID  age_0  age_1  age_2  age_3  age_4  age_5  age_6  age_7...... age_18

1    1          1         1        1           0         0       0         0         ........0

2    0          0         0        1           1         1       0         0         ........0

3    1          1         0        1           1         1       1         0         ........0

.

.

.

14,000    0          0         0        0           0         0       1         1         ........1

Where age_0 through age_18 represents binary variables which record whether or not a person had a particular disease recur at each age.

(1) Suppose we want to know the age onset of the disease. So for ID #1 it would be at age_0, for ID#2 it would be age_3, for ID#3 it would be age_0, for ID #14,000 it would be age_6.

(2) Suppose we want to know the last age at which they had the disease. For ID #1 it would be age_3, for ID #2 it would be age_5 etc.

(3) Suppose we want to know the duration of the disease. This could have two meanings:

(a) It could be the # of groups of consecutive sequence of 1's (so for ID#1 it is 1, for ID #2 it is 1, for ID # 3 it is 2, for ID #14,000 it is 2).

(b) If we are referring to the # of years they had the disease, it would just be a sum of the number of 1's for each person (for ID # 1 it is 4, for ID#2 it is 3, for ID#3 it is6, for ID#14,000 it is 12).

What would be the most efficient way to code this in SAS? I do not know how to use macros.. there has to be an easier way to do this I can find the total number of years they had the illness by creating a summary score of the 1's. But I am not sure how to do the other 3 tasks. Help? I wanted to google this but I'm not even sure what I would type to look this up.

Thanks

Posts: 4,736

## Re: Calculating onset of an experience, or the last experience, or duration (coding question)

The keywords for a Google search could be "SAS array processing". You're right: no macro coding is needed.

Super User
Posts: 10,770

## Re: Calculating onset of an experience, or the last experience, or duration (coding question)

You need so many new field .

```data x;
input ID  age_0  age_1  age_2  age_3  age_4  age_5  age_6  age_7 ;
cards;
1    1          1         1        1           0         0       0         0
2    0          0         0        1           1         1       0         0
3    1          1         0        1           1         1       1         0
;
run;
data want;
set x;
length onset  last_age \$ 40;
num_of_groups=0;
array _a{*} age_: ;
do i=1 to dim(_a);
if _a{i}=1 then last_age=vname(_a{i});
end;
do i=dim(_a) to 1 by -1;
if _a{i}=1 then onset=vname(_a{i});
end;

num_of_groups=countc(strip(compbl(translate(cats('*',of _a{*},'*'),' ','1'))),' ') ;

num_of_years=sum(of _a{*});
drop i;
run;
```

Ksharp

Regular Contributor
Posts: 151

## Re: Calculating onset of an experience, or the last experience, or duration (coding question)

Modifying @Ksharp's code slightly, you could calculate all the required variables within a single loop of the array.

data want;

set x;

length onset  last_age \$ 40;

num_of_groups=0;

num_of_years=0;

array _a{*} age_: ;

do i=1 to dim(_a);

num_of_years+_a{i};

if _a{i}=1 then do;

if num_of_years=1 then onset=vname(_a{i});

last_age=vname(_a{i});

if i=1 or _a{i-1}=0 then num_of_groups+1;

end;

end;

drop i;

run;

Posts: 3,167

## Re: Calculating onset of an experience, or the last experience, or duration (coding question)

data have;

input ID:\$  age_0  age_1  age_2  age_3  age_4  age_5  age_6  age_7;

cards;

1    1          1         1        1           0         0       0         0

2    0          0         0        1           1         1       0         0

3    1          1         0        1           1         1       1         0

;

data want;

set have;

array age age_:;

_si=dim(age);

_ei=1;

do _i=1 to  dim(age);

if age(_i)=1 then do;

_year=vname(age(_i));

if _i<=_si then do;onset=_year;_si=min(_si,_i);end;

if _i>=_ei then do; last=_year;_ei=max(_ei,_i);end;

end;

end;

_cat=cats(of age(*));

put _cat=;

Duration_group=countw(translate(_cat,' ','0'));

Duration_year=countc(_cat,'1');

drop _:;

run;

proc print;run;

Discussion stats
• 4 replies
• 195 views
• 5 likes
• 5 in conversation