Solved: Replacing the last member of an array with a specific value

Sina · Posted 06-04-2017 11:53 AM

Hi everyone,

I am a beginner with SAS and have a question. I am trying to create an array (CHF1-CHF23) for each patient in a dataset, with as many members as the longest follow-up in years (23). I can create the array using the following code:

proc sort data = merged2;
   by descending CHF_FU;
run;

data _NULL_;
   set merged2;
   if _N_=1 then call symput('cols',CHF_FU);
run;

%put Additional Columns: &cols;

proc sort data = merged2;   
   by ID;
run;
data merged2;
   set merged2;

   array CHF[&cols];
   
   i = 1;
   do until (i > CHF_FU);
      CHF[i] = 0;
      i + 1;
   end;
run;

However, now for select patients (UNCHF=1) I need to replace the last member of that array with a 1 instead of a 0. I don't know how to address the last member of the array for that subsample and replace it with a specific value.

Thanks!

Tom · Posted 06-04-2017 06:49 PM

So first let's convert your sample data into a dataset. Let's use UNCHF for the second variable to match what you used before and avoid conflict with the array name that you used before.

data have ;
 input ID UNCHF CHF_FU;
cards;
1 0 23 
2 1 14
3 0  8
4 0  5
;

Then let's find the maximum value of CHF_FU. Here is another method.

data _null_;
  retain max 0;
  if eof then call symputx('cols',max);
  set have end=eof;
  max=max(max,chf_fu);
run;
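
As a side note on why CALL SYMPUTX is used above instead of the CALL SYMPUT from the original post: when given a numeric value, CALL SYMPUT converts it with the BEST12. format and keeps the padding, while CALL SYMPUTX strips leading and trailing blanks. A minimal sketch (the variable x and the macro variable names a and b are made up for illustration):

```sas
data _null_;
  x = 23;
  call symput('a', x);    /* numeric is converted with BEST12., padding is kept */
  call symputx('b', x);   /* numeric is converted and blanks are stripped */
run;

/* The brackets make any leading blanks visible in the log */
%put [&a] [&b];
```

Either value works as an array dimension, but the stripped one is easier to read in %PUT output and safer to embed in names.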

If you use a normal iterative DO loop the code will be easier to write and easier to understand.

data want ;
  set have ;
  array CHF (&cols);
  do i=1 to chf_fu ;
    chf(i)=0;
  end;
  if unchf then chf(chf_fu)=1;
  drop i;
run;

Here is the result:

data _null_;
  set want ;
  put (id unchf chf_fu) (3.) (chf1-chf23) (2.);
run;

  1  0 23 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  2  1 14 0 0 0 0 0 0 0 0 0 0 0 0 0 1 . . . . . . . . .
  3  0  8 0 0 0 0 0 0 0 0 . . . . . . . . . . . . . . .
  4  0  5 0 0 0 0 0 . . . . . . . . . . . . . . . . . .
NOTE: There were 4 observations read from the data set WORK.WANT

The only way that it could give you index out of range errors would be if the value of CHF_FU is less than 1 or is not an integer. So perhaps you could test for that in the code?

data want ;
  set have ;
  array CHF (&cols);
  if chf_fu < 1 or chf_fu ne int(chf_fu) then put 'ERROR: Invalid value. ' chf_fu=;
  else do;
    do i=1 to chf_fu ;
      chf(i)=0;
    end;
    if unchf then chf(chf_fu)=1;
  end;
  drop i;
run;
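
If the macro variable could ever be computed from a different dataset than the one being processed, CHF_FU might also exceed the array size. A slightly broader check, sketched here under that assumption, compares against DIM(CHF) as well. The %LET line is only a stand-in for the step that computes the maximum, and the WARNING text is made up for illustration:

```sas
%let cols = 23;   /* stand-in: in the thread this comes from the maximum of CHF_FU */

data have;
  input ID UNCHF CHF_FU;
  cards;
1 0 23
2 1 14
3 0 8
4 0 5
;

data want;
  set have;
  array CHF (&cols);
  /* A missing CHF_FU also fails the first test, because missing sorts below any number */
  if chf_fu < 1 or chf_fu > dim(CHF) or chf_fu ne int(chf_fu) then
    put 'WARNING: CHF_FU is not a usable subscript. ' id= chf_fu=;
  else do;
    do i = 1 to chf_fu;
      chf(i) = 0;
    end;
    if unchf then chf(chf_fu) = 1;
  end;
  drop i;
run;
```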


art297 · Posted 06-04-2017 12:26 PM

Can you provide examples of the data you have and the data you want (both in the form of data steps)? That would make it a lot easier to answer your question.

Art, CEO, AnalystFinder.com

Tom · Posted 06-04-2017 12:27 PM

First let's simplify the top of your program. If you want to put the maximum value of CHF_FU into a macro variable it is much easier to just use PROC SQL.

proc sql noprint ;
  select max(chf_fu) into :cols trimmed
  from merged2
  ;
quit;
%put Additional Columns: &cols;

Now in your main program, if you want to iterate over a series of integers, the DO loop can do that directly. There is no need to manually set and increment your own counter.

data new_merged2;
  set merged2;
  array CHF(&cols);
  do i=1 to CHF_FU-1 ;  
     CHF(i) = 0;
  end;
  CHF(CHF_FU)=1;
run;

Sina · Posted 06-04-2017 06:01 PM

Thank you for your answer. I tried the following code since I don't want every patient to have a 1 at the end. Only the subset that has the event of interest:

data merged2; set merged2;

   array CHF[&cols];
   
   i = 1;
   do until (i > CHF_FU);
      CHF[i] = 0;
      i + 1;
   end;
   if UNCHF=1 then do j= CHF_FU - 1;
   CHF[j]=1;
   end;
   drop i j;
run;

However, this time I get the error message 'array subscript out of range'. Do you know where my mistake is? Thanks

Reeza · Posted 06-04-2017 06:20 PM

@Sina wrote:

Thank you for your answer. I tried the following code since I don't want every patient to have a 1 at the end. Only the subset that has the event of interest:
data merged2; set merged2;

   array CHF[&cols];
   
   i = 1;
   do until (i > CHF_FU);
      CHF[i] = 0;
      i + 1;
   end;
   if UNCHF=1 then do j= CHF_FU - 1;
   CHF[j]=1;
   end;
   drop i j;
run;
However, this time I get the error message 'array subscript out of range'. Do you know where my mistake is? Thanks

I believe the error message will actually tell you where.

It helps to post your log in these cases.

I suspect this line causes the issue, but without data/log it's just a guess.

  do until (i > CHF_FU);

This line likely generates the error:

CHF[j]=1;

I suspect there's an easier way to do this and if you posted sample data it would help.
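
One possible explanation, without having seen the log: `do j = CHF_FU - 1;` runs once with j set to CHF_FU - 1, so it targets the second-to-last element rather than the last. If any patient with UNCHF=1 has CHF_FU equal to 1 (or missing), the subscript becomes 0 (or missing), which is outside the array. The sketch below uses made-up data and a three-element range just to show the subscript that would be used. The dataset names demo and check and the variable valid are hypothetical:

```sas
/* Hypothetical data: patient 2 has the event in the first year of follow-up */
data demo;
  input ID UNCHF CHF_FU;
  cards;
1 0 3
2 1 1
;

data check;
  set demo;
  j = CHF_FU - 1;            /* the subscript the posted code would use */
  valid = (1 <= j <= 3);     /* 1 if j is a legal subscript for a 3-element array */
  put ID= CHF_FU= j= valid=;
run;
```

For patient 2, j is 0, so `CHF[j]=1` would fail. Using CHF_FU itself as the subscript targets the last populated element.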

Sina · Posted 06-04-2017 06:29 PM

This is the sample data:

ID CHF CHF_FU
1  0   23
2  1   14
3  0    8
4  0    5

ID is patient ID. CHF is whether the patient has the disease, and CHF_FU indicates years of follow-up in the study. I want to create an array of 23 (Max of the years of follow-up) that has 23, 8, and 5 zeros for patients 1, 3, and 4, respectively. For the second patient, I want the array to have 13 zeros and in the fourteenth column a 1:

ID CHF1 CHF2 CHF3 CHF4 CHF5 CHF6 CHF7 CHF8 CHF9 CHF10 CHF11 CHF12 CHF13 CHF14 ... CHF22 CHF23
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 0 0 0 1 . .
.

Reeza · Posted 06-04-2017 06:43 PM

I suggest you start with that next time.

The solution is as @Tom proposed initially, with an added IF to assign 1 when CHF=1.

data have;
	input ID CHF CHF_FU;
	cards;
1        0               23 
2        1               14
3        0               8
4        0               5 
;
run;

data want;
	set have;
	array _chf_(*) chf1-chf23;

	do i=1 to chf_fu-1;
		_chf_(i)=0;
	end;

	if chf=1 then
		_chf_(chf_fu)=1;
	else
		_chf_(chf_fu)=0;
run;
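
The IF/ELSE at the end can also be collapsed into a single assignment, since a SAS comparison evaluates to 1 when true and 0 when false. This is only a stylistic variant of the step above, using the same sample data, with `drop i;` added so the loop counter is not kept:

```sas
data have;
  input ID CHF CHF_FU;
  cards;
1 0 23
2 1 14
3 0 8
4 0 5
;

data want;
  set have;
  array _chf_(*) chf1-chf23;

  do i = 1 to chf_fu - 1;
    _chf_(i) = 0;
  end;

  _chf_(chf_fu) = (chf = 1);   /* comparison yields 1 when true, 0 when false */
  drop i;
run;
```

Like the original, this assumes CHF_FU is an integer between 1 and 23 for every patient.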
