BookmarkSubscribeRSS Feed
ginak
Quartz | Level 8

Hello.

Suppose I have a data set that looks something like

ID  age_0  age_1  age_2  age_3  age_4  age_5  age_6  age_7...... age_18

1    1          1         1        1           0         0       0         0         ........0

2    0          0         0        1           1         1       0         0         ........0

3    1          1         0        1           1         1       1         0         ........0

.

.

.

14,000    0          0         0        0           0         0       1         1         ........1

Where age_0 through age_18 represents binary variables which record whether or not a person had a particular disease recur at each age.

(1) Suppose we want to know the age onset of the disease. So for ID #1 it would be at age_0, for ID#2 it would be age_3, for ID#3 it would be age_0, for ID #14,000 it would be age_6.

(2) Suppose we want to know the last age at which they had the disease. For ID #1 it would be age_3, for ID #2 it would be age_5 etc.

(3) Suppose we want to know the duration of the disease. This could have two meanings:

          (a) It could be the # of groups of consecutive sequence of 1's (so for ID#1 it is 1, for ID #2 it is 1, for ID # 3 it is 2, for ID #14,000 it is 2).

          (b) If we are referring to the # of years they had the disease, it would just be a sum of the number of 1's for each person (for ID # 1 it is 4, for ID#2 it is 3, for ID#3 it is6, for ID#14,000 it is 12).

What would be the most efficient way to code this in SAS? I do not know how to use macros.. there has to be an easier way to do this Smiley Sad I can find the total number of years they had the illness by creating a summary score of the 1's. But I am not sure how to do the other 3 tasks. Help? I wanted to google this but I'm not even sure what I would type to look this up.

Thanks Smiley Happy

4 REPLIES 4
Patrick
Opal | Level 21

The keywords for a Google search could be "SAS array processing". You're right: no macro coding is needed.

Ksharp
Super User

You need so many new field .

data x;
input ID  age_0  age_1  age_2  age_3  age_4  age_5  age_6  age_7 ;
cards;
1    1          1         1        1           0         0       0         0  
2    0          0         0        1           1         1       0         0  
3    1          1         0        1           1         1       1         0  
;
run;
data want;
 set x;
 length onset  last_age $ 40;
 num_of_groups=0;
 array _a{*} age_: ;
 do i=1 to dim(_a);  
  if _a{i}=1 then last_age=vname(_a{i});
 end;
 do i=dim(_a) to 1 by -1;
  if _a{i}=1 then onset=vname(_a{i}); 
 end;

 num_of_groups=countc(strip(compbl(translate(cats('*',of _a{*},'*'),' ','1'))),' ') ;

 num_of_years=sum(of _a{*});
 drop i;
run;

Ksharp

Keith
Obsidian | Level 7

Modifying @Ksharp's code slightly, you could calculate all the required variables within a single loop of the array.

data want;

set x;

length onset  last_age $ 40;

num_of_groups=0;

num_of_years=0;

array _a{*} age_: ;

do i=1 to dim(_a);

  num_of_years+_a{i};

  if _a{i}=1 then do;

  if num_of_years=1 then onset=vname(_a{i});

  last_age=vname(_a{i});

  if i=1 or _a{i-1}=0 then num_of_groups+1;

  end;

end;

drop i;

run;

Haikuo
Onyx | Level 15

data have;

input ID:$  age_0  age_1  age_2  age_3  age_4  age_5  age_6  age_7;

cards;

1    1          1         1        1           0         0       0         0       

2    0          0         0        1           1         1       0         0       

3    1          1         0        1           1         1       1         0       

;

data want;

  set have;

    array age age_:;

    _si=dim(age);

    _ei=1;

        do _i=1 to  dim(age);

           if age(_i)=1 then do;

            _year=vname(age(_i));

                if _i<=_si then do;onset=_year;_si=min(_si,_i);end;

            if _i>=_ei then do; last=_year;_ei=max(_ei,_i);end;

             end;

        end;

       _cat=cats(of age(*));

       put _cat=;

       Duration_group=countw(translate(_cat,' ','0'));

       Duration_year=countc(_cat,'1');

       drop _:;

run;

proc print;run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 951 views
  • 5 likes
  • 5 in conversation