Hi,
The following code is a simplification of a project I’m currently working on. It will eventually be part of a macro. The hypothetical dataset ‘dat1’ will be different for each run of the macro.
In the step 3), I assign a list of codes to the macro variable ‘codes’. Then in the step 6) I try to use it in the array ‘codelist’, but it’s not working. Any recommendation on how I should better create the macro variable ‘codes’ so that when I pass it in the array it works? In other words, how should I define ‘codelist’ from 3) so that I can run the array in 6) with no issues?
Note: I'm using SAS EG 7.1.
I hope this is clear enough.
Thanks a lot,
A.G.
*1)Creating dataset for analysis ;
data dat1;
input personid code $ value;
datalines;
1 22c 0.2
1 22b 1.3
2 22c 1.2
3 22b 2.4
3 22c 8.2
;
*2) Creating a dataset with unique code values;
data codelist;
set dat1 (keep=code);
proc sort nodupkey;
by code;
run;
*3) Assigning the code list from previous dataset;
%let codes =;
data _null_;
set codelist;
call symputx('codes',trim(resolve('&codes'))||' '||trim(code));
run;
%put &codes;
*4) Assigning a macro variable the number of unique codes;
data _null_;
set codelist end=last;
N = _n_;
if last then call symputx('nocodes', N);
run;
%put &=nocodes;
*5) Transposing dat1 so that each individual has only one
record and each code becomes a variable;
proc transpose data=dat1 out=dat2 (drop=_name_);
by personid;
id code;
run;
*6) Assigning a zero value where the value for ‘code’ is missing;
data dat3;
set dat2;
array codelist {&nocodes} &codes;
do i=1 to &nocodes;
if codelist{i}= . then codelist{i}= 0;
end;
drop i;
run;
You haven't clearly shown what error you are getting, but I would think your main problem is the values of CODE are not valid names for variables. You should probably look at the whole approach to the problem and use a method to solve your actual problem that does not involve using the values of CODE as the names of variables.
You might be able to get your approach to work. One method would be to set option VALIDVARNAME=ANY and use name literals in your macro variable. Try these simplifications to your approach.
Steps 2,3 and 4 can be collapsed into a single SQL query.
proc sql noprint;
select distinct nliteral(code)
into codes separated by ' '
from dat1
order by code
;
%let nocodes=&sqlobs;
quit;
Note you don't actually need to know the number of codes, but you can just get it from the automatic variable SQLOBS.
options validvarname=any;
proc transpose data=dat1 out=dat2 (drop=_name_);
by personid;
id code;
var value;
run;
data dat3;
set dat2;
array codelist &codes;
do index=1 to dim(codelist);
if codelist[index]= . then codelist[index]= 0;
end;
drop index;
run;
Note if you move the ARRAY statement before the SET statement in the last step then the variables will be defined in alphabetical order instead of the order that they first appear in the DAT1 dataset.
You haven't clearly shown what error you are getting, but I would think your main problem is the values of CODE are not valid names for variables. You should probably look at the whole approach to the problem and use a method to solve your actual problem that does not involve using the values of CODE as the names of variables.
You might be able to get your approach to work. One method would be to set option VALIDVARNAME=ANY and use name literals in your macro variable. Try these simplifications to your approach.
Steps 2,3 and 4 can be collapsed into a single SQL query.
proc sql noprint;
select distinct nliteral(code)
into codes separated by ' '
from dat1
order by code
;
%let nocodes=&sqlobs;
quit;
Note you don't actually need to know the number of codes, but you can just get it from the automatic variable SQLOBS.
options validvarname=any;
proc transpose data=dat1 out=dat2 (drop=_name_);
by personid;
id code;
var value;
run;
data dat3;
set dat2;
array codelist &codes;
do index=1 to dim(codelist);
if codelist[index]= . then codelist[index]= 0;
end;
drop index;
run;
Note if you move the ARRAY statement before the SET statement in the last step then the variables will be defined in alphabetical order instead of the order that they first appear in the DAT1 dataset.
Lets start at the end:
WHAT is the output supposed to look like?
Second, when you get errors running your code you should copy from the log the data step or procedure code along with the errors. And I'm pretty sure your code has errors. Paste the copied text into a code box opened on the forum with </> icon to preserve formatting. This is important because the message windows on the forum WILL reformat text and displace diagnostic characters that SAS often places in the log with errors.
Your data step 5 has some issues:
data dat3;
set dat2;
array codelist {&nocodes} &codes; <= what is the purposed of &codes here?
do i=1 to &nocodes;
if codelist{i}= . then codelist{i}= 0;
end;
drop i;
run;
If you want to assign values to array elements then the values go in parentheses and if the values are charter have to be comma delimited and if character quoted AND tell SAS the Array is character and how long the elements are.
Array dummy {3} $ 5 ('abcde', 'fghij', 'klmno');
for example. Dummy1 would be 'abcde', dummy2 'fghij' and dummy3 'klmno'.
None of the variables would be missing.
Also at no time are you examining the values from the transposed data. From your Proc Transpose Code you would have variable
I don't think that your step 2, 3, or 4 is needed.
proc transpose data=dat1 out=dat2 (drop=_name_); by personid; id code; run; data dat3; set dat2; array codes _22:; do i=1 to dim(codes); if codes{i}= . then codes{i}= 0; end; drop i; run;
The _22: uses a list generator to reference all the variables in the data set whose names start with _22.
If you have lots more values with different bases you might be able to use _: as long as no other variables started with the underscore character. Using a list of variables means that you do not need to specify the number of variables in the data set. And the DIM(arrayname) returns the number of defined elements in the array so you do not need to pre-count them.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.