Hi folks,
This might seem nuanced but is it possible to create new variables using parts of or all of an existing variable name in an array kind of style? If I understand correctly you can't change the format of an existing variable from say numeric to character or vice versa. In summary, I had to transpose a character variable that had a lot of distinct values (100+) so I could flag the occurrence and eventually count/sum the frequency. Is there an efficient way of creating another version of the existing 100+ new variables to be flagged and as numeric? I have tried this approach which seems to create the flags and overwrite the existing variable (not creating the supposedly new variable) but the format remains unchanged and it's still a character variable (not surprised as stated earlier that you can't change the format of an existing variable).
data transp_trt_flags;
set test.transp_bc_trt;
/* Flag transposed treatment regimen */
array trt_reg[*] trt_:; /*select all variables beginning with trt_*/
array trt_fl[*] trt_:_fl; /*create new dummy variables to replace the original; gave this a shot without thinking it will run*/
do i = 1 to dim(trt_reg);
if not missing(trt_reg(i)) then
trt_fl(i) = input(1, best32.);
else
trt_fl(i) = input(0, best32.);
end;
output;
run;
Thanks for your help as always
Get the variable names from DICTIONARY.COLUMNS:
proc sql noprint;
select cats(name,"_fl") into :new_array separated by " "
from dictionary.columns
where
libname = "TEST" and memname = "TRANSP_BC_TRT"
and upcase(name) like 'TRT_%'
;
quit;
data transp_trt_flags;
set test.transp_bc_trt;
array trt_reg{*} trt_:;
array trt_fl{*} &new_arr.;
do i = 1 to dim(trt_reg);
trt_fl{i} = missing(trt_reg{i});
end;
run;
You can also see where your code can be greatly simplified.
Without giving it much thought - maybe you could create variable definitions as macro variables prior to your data step. Something like
proc sql;
select dictinct trt into: trt_vars separated by ' '
from have;
quit;
data want;
set have;
length &trt_vars 8.;
....
run;
That seems logical @LinusH I will try it and definitely let you know. I appreciate it
Get the variable names from DICTIONARY.COLUMNS:
proc sql noprint;
select cats(name,"_fl") into :new_array separated by " "
from dictionary.columns
where
libname = "TEST" and memname = "TRANSP_BC_TRT"
and upcase(name) like 'TRT_%'
;
quit;
data transp_trt_flags;
set test.transp_bc_trt;
array trt_reg{*} trt_:;
array trt_fl{*} &new_arr.;
do i = 1 to dim(trt_reg);
trt_fl{i} = missing(trt_reg{i});
end;
run;
You can also see where your code can be greatly simplified.
Thanks, @Kurt_Bremser I will give it a shot. I appreciate your help. Will keep you posted
@Kurt_Bremser thanks for your help. The first bit worked. Just a note for others to correct the typo in the macro variable plus I'm not quite sure of the logic for the second bit since it is returning a bunch of 1s for missing (need it to be the other way around for 1s if not missing) so I used my original code. I lost a few variables too. But all in all, considering the grand scheme of things, this will suffice. I appreciate it
&new_arr. should be &new_array.
others can try trt_fl{i} = not missing(trt_reg{i}); in your code as well. That should work
If your transposed variable (in the VAR statement) is character type, then you might think of changing its format into numeric before transposing;
like.
data want;
set have;
num_var= put(char_var, best.);
run;
proc transpose data= want out=want_wide;
by <byvar>;
var num_var;
id <idvar>;
run;
@A_Kh Yeah, I thought about that but did not try it mainly because I didn't think you could necessarily format a character variable say "boy" or "girl" into numeric. I may be wrong. Will try that as well. This is really piquing my interest now. Could have saved me a lot of time if I had just given it a shot when I thought about it. Thanks
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.