Hello,
I have a SAS dataset with two variables, STATE and FORMULA that looks like this. The values of FORMULA are if/then statements with semicolons.
Arizona if 2>1 then BrandNewVariable = "YES";
Texas if 2>3 then BrandNewVariable = "YES";
California if 3>0 then BrandNewVariable = "YES";
Is it possible to use the FORMULA variable to within a data step (or any other method) to get a data set with three variables, STATE, FORMULA, BRANDNEWVARIABLE?
Arizona if 2>1 then BrandNewVariable = "YES"; YES
Texas if 2>3 then BrandNewVariable = "YES";
California if 3>0 then BrandNewVariable = "YES"; YES
Thank you very much for your help.
Not clear what you expect as result.
A data null step could be used to create another data step, either by using call execute or by writing sas statements to a file that can be included/executed.
Hi,
There are several ways to do this.
Either way you choose, you'll need to dynamically generate the code you would like to run.
You can either choose to directly run the generated code or outsource it to a macro or even outsource it to a new sas file, it's up to you.
For my part I think I would dynamically generate a macro code that I would compile and run on my data.
It would look like this:
DATA have;
length STATE $20 FORMULA $200;
infile datalines TRUNCOVER;
input STATE $20. FORMULA $200.;
Datalines4;
Arizona if 2>1 then BrandNewVariable = "YES";
Texas if 2>3 then BrandNewVariable = "YES";
California if 3>0 then BrandNewVariable = "YES";
;;;;
RUN;
data _NULL_;
set have end=last;
length macroCode $2000;
retain macroCode '';
if _N_ eq 1 then macroCode=cats('%MACRO runSTATECode(state=);');
macroCode=cats(macroCode,'if strip(upcase(state)) eq "',upcase(STATE),'" then do;',formula,'end;');
if last then do;
macroCode=cats(macroCode,'%MEND runSTATECode;');
rc=dosubl(macroCode);*compile macro code;
put macroCode=;
end;
run;
options mprint nomprintnest ;
data want;
set have;/*or any other dataset with testdata*/
LENGTH BrandNewVariable $3;
%runSTATECode();
run;
- Cheers -
If the name of a newly created variable in formula is arbitrary, then you cannot solve this in one step, as the calling step must define all its variables before it starts processing.
You need to run a separate data step for each observation in the source dataset, and concatenate all resulting datasets in the end:
data have;
infile datalines dlm="|";
input state :$20. formula :$50.;
datalines4;
Arizona|if 2>1 then BrandNewVariable = "YES";
Texas|if 2>3 then BrandNewVariable = "YES";
California|if 3>0 then BrandNewVariable = "YES";
;;;;
data _null_;
set have end=done;
call execute(cats(
"data ds",
put(_n_,z10.),
";length state $20;state='",
state,
"';",
formula,
"run;"
));
if done
then do;
call execute("data want;set");
do i = 1 to _n_;
call execute(" ds"!!put(i,z10.));
end;
call execute(";run;");
end;
run;
data have;
infile cards truncover;
input state : $40. formula $100.;
cards4;
Arizona if 2>1 then BrandNewVariable = "YES";
Texas if 2>3 then BrandNewVariable = "YES";
California if 3>0 then BrandNewVariable = "YES";
;;;;
filename x temp;
data _null_;
set have end=last;
file x lrecl=200 ;
if _n_=1 then put 'data want;set have;';
put 'if _n_=' _n_ 'then do;' formula $100. 'end;';
if last then put 'run;';
run;
%include x /source source2 lrecl=200;
Hi guys @Kurt_Bremser @Ksharp
I think the have dataset should be considered as lookup table where the formula for each state is saved.
Therefore you're code will not produce the correct results when running on another dataset like sashelp.prdsal2 for example.
- Cheers -
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.