BookmarkSubscribeRSS Feed
ginak
Quartz | Level 8

Hello.

Suppose I have a data set that looks something like

ID  age_0  age_1  age_2  age_3  age_4  age_5  age_6  age_7...... age_18

1    1          1         1        1           0         0       0         0         ........0

2    0          0         0        1           1         1       0         0         ........0

3    1          1         0        1           1         1       1         0         ........0

.

.

.

14,000    0          0         0        0           0         0       1         1         ........1

Where age_0 through age_18 represents binary variables which record whether or not a person had a particular disease recur at each age.

(1) Suppose we want to know the age onset of the disease. So for ID #1 it would be at age_0, for ID#2 it would be age_3, for ID#3 it would be age_0, for ID #14,000 it would be age_6.

(2) Suppose we want to know the last age at which they had the disease. For ID #1 it would be age_3, for ID #2 it would be age_5 etc.

(3) Suppose we want to know the duration of the disease. This could have two meanings:

          (a) It could be the # of groups of consecutive sequence of 1's (so for ID#1 it is 1, for ID #2 it is 1, for ID # 3 it is 2, for ID #14,000 it is 2).

          (b) If we are referring to the # of years they had the disease, it would just be a sum of the number of 1's for each person (for ID # 1 it is 4, for ID#2 it is 3, for ID#3 it is6, for ID#14,000 it is 12).

What would be the most efficient way to code this in SAS? I do not know how to use macros.. there has to be an easier way to do this Smiley Sad I can find the total number of years they had the illness by creating a summary score of the 1's. But I am not sure how to do the other 3 tasks. Help? I wanted to google this but I'm not even sure what I would type to look this up.

Thanks Smiley Happy

4 REPLIES 4
Patrick
Opal | Level 21

The keywords for a Google search could be "SAS array processing". You're right: no macro coding is needed.

Ksharp
Super User

You need so many new field .

data x;
input ID  age_0  age_1  age_2  age_3  age_4  age_5  age_6  age_7 ;
cards;
1    1          1         1        1           0         0       0         0  
2    0          0         0        1           1         1       0         0  
3    1          1         0        1           1         1       1         0  
;
run;
data want;
 set x;
 length onset  last_age $ 40;
 num_of_groups=0;
 array _a{*} age_: ;
 do i=1 to dim(_a);  
  if _a{i}=1 then last_age=vname(_a{i});
 end;
 do i=dim(_a) to 1 by -1;
  if _a{i}=1 then onset=vname(_a{i}); 
 end;

 num_of_groups=countc(strip(compbl(translate(cats('*',of _a{*},'*'),' ','1'))),' ') ;

 num_of_years=sum(of _a{*});
 drop i;
run;

Ksharp

Keith
Obsidian | Level 7

Modifying @Ksharp's code slightly, you could calculate all the required variables within a single loop of the array.

data want;

set x;

length onset  last_age $ 40;

num_of_groups=0;

num_of_years=0;

array _a{*} age_: ;

do i=1 to dim(_a);

  num_of_years+_a{i};

  if _a{i}=1 then do;

  if num_of_years=1 then onset=vname(_a{i});

  last_age=vname(_a{i});

  if i=1 or _a{i-1}=0 then num_of_groups+1;

  end;

end;

drop i;

run;

Haikuo
Onyx | Level 15

data have;

input ID:$  age_0  age_1  age_2  age_3  age_4  age_5  age_6  age_7;

cards;

1    1          1         1        1           0         0       0         0       

2    0          0         0        1           1         1       0         0       

3    1          1         0        1           1         1       1         0       

;

data want;

  set have;

    array age age_:;

    _si=dim(age);

    _ei=1;

        do _i=1 to  dim(age);

           if age(_i)=1 then do;

            _year=vname(age(_i));

                if _i<=_si then do;onset=_year;_si=min(_si,_i);end;

            if _i>=_ei then do; last=_year;_ei=max(_ei,_i);end;

             end;

        end;

       _cat=cats(of age(*));

       put _cat=;

       Duration_group=countw(translate(_cat,' ','0'));

       Duration_year=countc(_cat,'1');

       drop _:;

run;

proc print;run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 924 views
  • 5 likes
  • 5 in conversation