Hi everyone,
I am a beginner with SAS and have a question. I am trying to create an array (CHF1-CHF23) for each patient in a dataset, with as many members as the longest follow-up in years (23 years). I can create the array using the following code:
proc sort data = merged2; by descending CHF_FU; run;
data _NULL_;
set merged2;
if _N_=1 then call symput('cols',CHF_FU);
run;
%put Additional Columns: &cols;
proc sort data = merged2; by ID; run;
data merged2;
set merged2;
array CHF[&cols];
i = 1;
do until (i > CHF_FU);
CHF[i] = 0;
i + 1;
end;
However, now for select patients (UNCHF=1) I need to replace the last member of that array with a 1 instead of a 0. I don't know how to address the last member of the array for that subsample and replace it with a specific value.
Thanks!
So first let's convert your sample data into a dataset. Let's use UNCHF for the second variable to match what you used before and avoid conflict with the array name that you used before.
data have ;
input ID UNCHF CHF_FU;
cards;
1 0 23
2 1 14
3 0 8
4 0 5
;
Then let's find the maximum value of CHF_FU. Here is another method.
data _null_;
retain max 0;
if eof then call symputx('cols',max);
set have end=eof;
max=max(max,chf_fu);
run;
If you use a normal iterative DO loop the code will be easier to write and easier to understand.
data want ;
set have ;
array CHF (&cols);
do i=1 to chf_fu ;
chf(i)=0;
end;
if unchf then chf(chf_fu)=1;
drop i;
run;
Here is the result:
6249  data _null_;
6250  set want ;
6251  put (id unchf chf_fu) (3.) (chf1-chf23) (2.);
6252  run;
  1  0 23 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  2  1 14 0 0 0 0 0 0 0 0 0 0 0 0 0 1 . . . . . . . . .
  3  0  8 0 0 0 0 0 0 0 0 . . . . . . . . . . . . . . .
  4  0  5 0 0 0 0 0 . . . . . . . . . . . . . . . . . .
NOTE: There were 4 observations read from the data set WORK.WANT
The only way that it could give you index out of range errors would be if the value of CHF_FU is less than 1 or is not an integer. So perhaps you could test for that in the code?
data want ;
set have ;
array CHF (&cols);
if chf_fu < 1 or chf_fu ne int(chf_fu) then put 'ERROR: Invalid value. ' chf_fu=;
else do;
do i=1 to chf_fu ;
chf(i)=0;
end;
if unchf then chf(chf_fu)=1;
end;
drop i;
run;
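To see which records would fail that test before building the array, one option is to list them first. This is only a sketch, assuming the HAVE dataset and variable names from above; the WHERE condition mirrors the IF test, and because SAS treats a missing value as smaller than any number, missing CHF_FU values are listed too.

```sas
/* List records whose CHF_FU cannot be used as an array subscript. */
/* Assumes the HAVE dataset created earlier in this reply.        */
proc print data=have;
where chf_fu < 1 or chf_fu ne int(chf_fu);
var id unchf chf_fu;
run;
```

With the four sample rows nothing should be listed, since every CHF_FU there is a positive integer.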
Can you provide examples of the data you have and the data you want (both in the form of data steps)? That would make it a lot easier to answer your question.
Art, CEO, AnalystFinder.com
First let's simplify the top of your program. If you want to put the maximum value of CHF_FU into a macro variable it is much easier to just use PROC SQL.
proc sql noprint ;
select max(chf_fu) into :cols trimmed
from merged2
;
quit;
%put Additional Columns: &cols;
Now in your main program, if you want to iterate over a series of integers, the DO loop can do that directly. No need to manually set and increment your own counter.
data new_merged2;
set merged2;
array CHF(&cols);
do i=1 to CHF_FU-1 ;
CHF(i) = 0;
end;
CHF(CHF_FU)=1;
run;
Thank you for your answer. I tried the following code since I don't want every patient to have a 1 at the end. Only the subset that has the event of interest:
data merged2; set merged2;
array CHF[&cols];
i = 1;
do until (i > CHF_FU);
CHF[i] = 0;
i + 1;
end;
if UNCHF=1 then do j= CHF_FU - 1;
CHF[j]=1;
end;
drop i j;
run;
However, this time I get the error message 'array subscript out of range'. Do you know where my mistake is? Thanks
The error message in the log should actually tell you where the problem is, I believe.
It helps to post your log in these cases.
I suspect this line causes the issue, but without data/log it's just a guess.
do until (i > CHF_FU);
This line likely generates the error:
CHF[j]=1;
I suspect there's an easier way to do this and if you posted sample data it would help.
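One way to test the guess about CHF[j]=1 without touching the real data: in the posted code j is CHF_FU - 1, so any patient with UNCHF=1 and a CHF_FU of 1 (or less, or missing) would produce a subscript below 1. The sketch below uses a made-up two-row dataset named DEMO (not the poster's data) and only reports what j would be, so it does not raise the error itself. Whether merged2 actually contains such rows is an assumption.

```sas
/* Hypothetical data for illustration only: the second row has CHF_FU=1 */
data demo;
input ID UNCHF CHF_FU;
cards;
1 1 14
2 1 1
;

data _null_;
set demo;
/* Same subscript calculation as in the posted code */
j = CHF_FU - 1;
/* An array declared as CHF[n] accepts only subscripts 1 through n */
if j < 1 then put 'Subscript would be out of range: ' ID= CHF_FU= j=;
else put 'Subscript is valid: ' ID= CHF_FU= j=;
run;
```

For the second row j is 0, which is the kind of value that triggers 'array subscript out of range'. Note also that CHF_FU - 1 points at the second-to-last year, not the last one.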
This is the sample data:
ID CHF CHF_FU
1 0 23
2 1 14
3 0 8
4 0 5
ID is patient ID. CHF is whether the patient has the disease, and CHF_FU indicates years of follow-up in the study. I want to create an array of 23 (Max of the years of follow-up) that has 23, 8, and 5 zeros for patients 1, 3, and 4, respectively. For the second patient, I want the array to have 13 zeros and in the fourteenth column a 1:
ID CHF1 CHF2 CHF3 CHF4 CHF5 CHF6 CHF7 CHF8 CHF9 CHF10 CHF11 CHF12 CHF13 CHF14 ... CHF22 CHF23
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 0 0 0 1 . .
.
.
I suggest you start with that next time.
The solution is as @Tom proposed initially, with a second IF to assign 1 when CHF=1.
data have;
input ID CHF CHF_FU;
cards;
1 0 23
2 1 14
3 0 8
4 0 5
;
run;
data want;
set have;
array _chf_(*) chf1-chf23;
do i=1 to chf_fu-1;
_chf_(i)=0;
end;
if chf=1 then
_chf_(chf_fu)=1;
else
_chf_(chf_fu)=0;
run;
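A quick sanity check on the result, offered as a sketch rather than part of the solution: for each row, the number of filled-in members should equal CHF_FU and their sum should equal CHF. It assumes the WANT dataset from the step above, with variables chf1-chf23; the names n_filled and total are introduced here only for illustration.

```sas
/* Flag rows where the filled members do not match CHF_FU and CHF. */
data _null_;
set want;
array _chf_(*) chf1-chf23;
n_filled = n(of _chf_(*));   /* count of non-missing members */
total = sum(of _chf_(*));    /* 1 if the event is marked, else 0 */
if n_filled ne chf_fu or total ne chf then
put 'Unexpected row: ' ID= chf= chf_fu= n_filled= total=;
run;
```

If the log shows no 'Unexpected row' lines, every patient has the expected number of zeros and at most one 1.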