Dinurik
Fluorite | Level 6

Hi SAS users,

 

Could you please help me simplify this code? I need to create a vector of 21 variables that take a value of 1 if a subject has dropped out of the study, and 0 otherwise. I currently use the code below, but it is silly and inefficient. The repetitive pattern suggests that I could use a loop.

 

data want;
   set have;
      cens_1 = 0;
      if min(of cd4_2-cd4_21) < 0 then cens_2 = 1; /* If all values of cd4 from visit 2 to 21 are missing, then cens_2 takes a value of 1 */
      else cens_2 = 0;
      if min(of cd4_3-cd4_21) < 0 then cens_3 = 1;
      else cens_3 = 0;
      if min(of cd4_4-cd4_21) < 0 then cens_4 = 1;
      else cens_4 = 0;
      if min(of cd4_5-cd4_21) < 0 then cens_5 = 1;
      else cens_5 = 0;
      /* etc., up to cens_21 */
run;

 

I tried a do loop:

 

data want;
   set have;
   array cd4(21) cd4_1-cd4_21;
   array cens(21) cens_1-cens_21;

   do i = 1 to 21;
      if min(of cd4(i) - cd4_21) < 0 then cens(i) = 1;
      else cens(i) = 0;
   end;
run;

 

The problem is in the expression (of cd4(i) - cd4_21). SAS says: ERROR: Missing numeric suffix on a numbered variable list (NAME-cd4_21).

 

I would appreciate any help!

 

Thanks!

   

Accepted Solutions
ballardw
Super User

See if this does what you are looking for. I only used 5 values to make smaller example that could actually be manually inspected.

 

data junk;
   array cd4(5) ;
   do i=1 to 5;
      cd4[i]= rand('uniform');
      output;
   end;
run;

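data want;
   set junk;
   array cd4(5);                    /* cd41-cd45 */
   array cens(5);                   /* cens1-cens5, created here */
   array t(5) _temporary_;          /* scratch copy, not written to the output */
   
   do i= 1 to 5;
      call missing(of t(*));        /* reset the scratch array for each i */
      do j= i to 5;
         t[j]= cd4[j];              /* copy only the values from position i onward */
      end;
      cens[i] = (min(of t(*)) < 0); /* 1 when cd4[i] onward are all missing */
   end;
   drop i j;
run;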

You have discovered one of the odd things about the SAS MIN, MAX and related functions when working with arrays: the OF array reference doesn't like to be mixed with anything else, much less with potential operators like -.

 

The above code copies the cd4 values into a temporary array, which allows using (of array(*)) to inspect all values of the array.

CALL MISSING is used to reset the array. The temporary array is NOT written out to the data set. Care is needed with temporary arrays, because by default their values persist, as if RETAINED, unless they are reset.
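To illustrate the point about temporary arrays keeping their values, here is a minimal sketch. It assumes the SASHELP.CLASS sample data set is available; the variable name carried is hypothetical and used only for this demonstration.

```sas
/* Sketch: a _TEMPORARY_ array element keeps its value from one
   iteration of the DATA step to the next unless it is reset. */
data _null_;
   set sashelp.class(obs=2);   /* two iterations are enough to see it */
   array t(1) _temporary_;
   carried = t[1];             /* whatever the previous iteration left behind */
   put _n_= carried=;
   t[1] = _n_;                 /* store something for the next iteration */
run;
```

On the first iteration carried should be missing, and on the second it should hold the 1 stored during the first, which is why the solution calls CALL MISSING before refilling t.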

 

If you are searching for missing values, though, I would suggest using

cens[i] = missing(min(of t(*))); instead of "< 0", just in case.

Note that

(min(of t(*)) < 0)

is a logical comparison that returns 1 when true and 0 when false, which means you can get rid of all the IF/THEN/ELSE statements.
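A small sketch of why the comparison works as a flag. The variable names here are hypothetical; the only behaviour it relies on is that SAS treats a numeric missing value as smaller than any number in a comparison.

```sas
/* Sketch: a comparison in an assignment yields 1 (true) or 0 (false). */
data _null_;
   x = .;
   y = 5;
   flag_x = (x < 0);          /* missing compares below 0, so this is 1 */
   flag_y = (y < 0);          /* 5 is not below 0, so this is 0 */
   flag_m = missing(x);       /* MISSING() tests for missing directly: 1 */
   put flag_x= flag_y= flag_m=;
run;
```

Note that "< 0" would also flag a genuinely negative value, while MISSING() flags only missing values, which is the reason for preferring it when the data could contain negatives.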


9 REPLIES
Reeza
Super User
It's not clear which values you're trying to set to 1: all of the ones before a time point, after a time point, or at a time point?
Dinurik
Fluorite | Level 6

I am trying to get the variable cens(t) to take a value of 1 at visit t if the value of cd4(t) is missing at this visit and at every later visit; once cens(t) = 1, all subsequent values of cens should also be 1. This is what my current code is doing. The problem is that I have to write the same two lines 20 times:

 

if min(of cd4_2-cd4_21) < 0 then cens_2 = 1;
else cens_2 = 0; .......

 

I thought that a do loop would be a better choice, but cannot get mine working. 

Reeza
Super User
Can you simplify this to 5 cd4 variables and 10 observations, and show what the input data and the output data look like?
Dinurik
Fluorite | Level 6

data example;
input cd4_1 cd4_2 cd4_3 cd4_4 cd4_5 ;
cards ;
340 . 440 320 278
1750 1600 1500 1600 1800
334 363 . 507 502
620 720 550 660 .
573 590 . . .
800 860 . . 900
390 . . . .
806 622 522 . .
1324 1060 1140 750 1500
700 730 830 880 750 220
;
run;

 

proc print data=example; run;

 

 
Astounding
PROC Star

A more efficient approach would minimize looping and computations.  Consider:

 

data want;
   set have;
   cens_1=0;
   if min(of cd4_5 - cd4_21) < 0 then cens_5=1;
   else cens_5=0;
   /* a visit is censored only if it is missing AND all later visits are */
   if cens_5 = 1 and cd4_4 < 0 then cens_4=1;
   else cens_4=0;
   if cens_4 = 1 and cd4_3 < 0 then cens_3=1;
   else cens_3=0;
   if cens_3 = 1 and cd4_2 < 0 then cens_2=1;
   else cens_2=0;
run;

I can't vouch for the accuracy of the logic, but these statements should replicate the values that your current program generates.  If the logic looks right, and if you really need 21 CENS_ values computed, we can look at using arrays to shorten the coding burden.
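One possible array form of this backward approach, offered as a sketch rather than a tested solution. It assumes the data set have contains cd4_1-cd4_21, that a visit counts as censored only when it and every later visit are missing, and that MISSING() is an acceptable stand-in for the "< 0" test.

```sas
/* Sketch: work backward from the last visit so that each cens_ value
   reuses the one after it instead of rescanning the remaining visits. */
data want;
   set have;
   array cd4_(21);             /* refers to cd4_1-cd4_21 */
   array cens_(21);            /* creates cens_1-cens_21 */

   cens_[21] = missing(cd4_[21]);
   do i = 20 to 2 by -1;
      /* censored at visit i only if visit i is missing and visit i+1 is censored */
      cens_[i] = (cens_[i+1] = 1 and missing(cd4_[i]));
   end;
   cens_[1] = 0;               /* the original program fixes cens_1 at 0 */
   drop i;
run;
```

Each subject needs a single pass over the 21 visits here, compared with the nested loop that refills a temporary array for every visit.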


Dinurik
Fluorite | Level 6

Thank you! 

I just tried your code, and it produces only zero values in cens_1-cens_21 variables. 

ballardw
Super User

@Dinurik wrote:

Thank you! 

I just tried your code, and it produces only zero values in cens_1-cens_21 variables. 


When I run my code on my data the result looks like:

Obs     cd41      cd42      cd43      cd44      cd45    cens1   cens2   cens3   cens4   cens5

 1    0.16056    .         .         .         .          0       1       1       1       1
 2    0.16056   0.38588    .         .         .          0       0       1       1       1
 3    0.16056   0.38588   0.24446    .         .          0       0       0       1       1
 4    0.16056   0.38588   0.24446   0.27629    .          0       0       0       0       1
 5    0.16056   0.38588   0.24446   0.27629   0.14308     0       0       0       0       0




This shows 1 when the corresponding values are all missing.

So perhaps your data is not as described, or you haven't completely described your problem.

You should also show the actual code that you ran, taken from the log. Copy the code and any messages from the log and paste them into a code box.

I changed some variable names because I don't like that many _ characters. Did you forget to change my cd4 to cd4_ or cens_?

Dinurik
Fluorite | Level 6

Yes, sorry. I made a mistake in your code before - that's why it didn't work. 

 

It works perfectly now. Thank you so much! This is a much better solution than writing conditions for each of the 21 variables!

 

 

data want;
   set have;
   array cd4_(21);
   array cens(21);
   array t(21) _temporary_;

   do i= 1 to 21;
      call missing(of t(*));
      do j= i to 21;
         t[j]= cd4_[j];
      end;
      cens[i] = (min(of t(*)) < 0);
   end;
   drop i j;
run;
