
why would that not work?

N/A
Posts: 0


Why can't the macro variable &lefty in this macro be resolved?



%macro info_gain_attrib(attrib);

proc sort data=project.merged_pd_data (keep=&attrib status) out=project.sort_by_&attrib;
by &attrib;
run;

proc sql;
select distinct trim(&attrib)||"'n" into :&attrib._1 - :&attrib._4
from project.sort_by_&attrib;
quit;

%let lefty=%str(%');

%global info_gain_&attrib;

data test;

set project.sort_by_&attrib end=last;
by &attrib;

if first.&attrib then do;
class_N+1; N=1; count_healthy=0; count_parkinson=0;if status=0 then count_healthy+1;
else if status=1 then count_parkinson+1;end;

else if last.&attrib then do;
N+1;
if status=0 then count_healthy+1;
else if status=1 then count_parkinson+1;
p_healthy=count_healthy/N;
p_parkinson=count_parkinson/N;
if class_N=1 then do;
&lefty.info_&&&attrib._1= -p_healthy*log(p_healthy)/log(2)-p_parkinson*log(p_parkinson)/log(2); &lefty.ratio_&&&attrib._1=N/195;
end;
else if class_N=2 then do;
&lefty.info_&&&attrib._2= -p_healthy*log(p_healthy)/log(2)-p_parkinson*log(p_parkinson)/log(2); &lefty.ratio_&&&attrib._2=N/195;
end;
else if class_N=3 then do;
&lefty.info_&&&attrib._3= -p_healthy*log(p_healthy)/log(2)-p_parkinson*log(p_parkinson)/log(2); &lefty.ratio_&&&attrib._3=N/195;
end;
else if class_N=4 then do;
&lefty.info_&&&attrib._4= -p_healthy*log(p_healthy)/log(2)-p_parkinson*log(p_parkinson)/log(2); &lefty.ratio_&&&attrib._4=N/195;
end;
end;

else do;
N+1;
if status=0 then count_healthy+1;
else if status=1 then count_parkinson+1;
end;

if last then do;
info_&attrib=&lefty.ratio_&&&attrib._1*&lefty.info_&&&attrib._1+&lefty.ratio_&&&attrib._2*&lefty.info_&&&attrib._2+&lefty.ratio_&&&attrib._3*&lefty.info_&&&attrib._3+&lefty.ratio_&&&attrib._4*&lefty.info_&&&attrib._4;
info_gain_&attrib=&info_D - info_&attrib; call symput ('info_&attrib', info_&attrib); call symput ('info_gain_&attrib', info_gain_&attrib);
end;
run;

%put The information gain for attribute &attrib is "&&info_gain_&attrib";
%mend;
PROC Star
Posts: 1,561

Re: why would that not work?

It would help if you posted code that anyone can run.

For example, replace your data with SASHELP sample data sets.
Super Contributor
Posts: 3,174

Re: why would that not work?

If you have an error, post the SAS log and the specific problem you are experiencing. Otherwise it's a guessing game, because others on the forum have no idea what your data or your SAS environment generate. Reply to your initial post and provide a COPY/PASTE from your SAS log with the expanded code and diagnostic messages.

Scott Barry
SBBWorks, Inc.
Frequent Contributor
Posts: 102

Re: why would that not work?

I agree with the others' comments,

but I did notice that the lefty macro variable contains only a single quote, yet you place it at the beginning of lines that look like they are supposed to generate variable assignment statements. What it actually generates is a line of code consisting of a quoted string all by itself, followed by a valid assignment statement, just not the one I think you want.

What are you trying to accomplish with that macro variable?

Have you tried OPTIONS MPRINT to see what code the macro is generating?
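
As a minimal sketch of that suggestion: the options below are standard SAS system options, while the macro name show_generated and its body are hypothetical and exist only to give the log something to print.

```sas
/* Turn on macro debugging output in the log.                    */
/* MPRINT    shows the SAS code the macro generates.             */
/* MLOGIC    traces macro execution (%LET, %IF, %DO, and so on). */
/* SYMBOLGEN shows what each macro variable resolves to.         */
options mprint mlogic symbolgen;

/* Hypothetical macro, used only for illustration */
%macro show_generated(attrib);
  data _null_;
    info_&attrib = 1;
    put info_&attrib=;
  run;
%mend show_generated;

%show_generated(D2_class)

/* Switch the extra log output off again when done */
options nomprint nomlogic nosymbolgen;
```

With MPRINT on, each generated statement appears in the log on a line starting with MPRINT(SHOW_GENERATED):, which makes it easier to see where the generated text stops looking like valid SAS.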

Curtis
N/A
Posts: 0

Re: why would that not work?

Sorry for not providing sample data to test with. It is a macro I designed for calculating the information gain of a discretized continuous variable. I set the option VALIDVARNAME=ANY so that I can use, for example, '1.423287 to 1.743867'n as a valid variable name. That's all. I just want to put the opening " ' " to the left of " 1.423287 to 1.743867'n ".

The following script may help build the test data set. Use it as the data set 'sort_by_&attrib'.

data test;
input col1 $ 1-20 status 25;
datalines;
1.423287 to 1.743867    0
1.423287 to 1.743867    1
1.423287 to 1.743867    0
1.423287 to 1.743867    0
1.765957 to 2.330716    1
1.765957 to 2.330716    0
1.765957 to 2.330716    1
1.765957 to 2.330716    0
2.33218 to 2.88245      1
2.33218 to 2.88245      1
2.33218 to 2.88245      1
2.33218 to 2.88245      0
2.8923 to 3.671155      0
2.8923 to 3.671155      1
2.8923 to 3.671155      1
2.8923 to 3.671155      1
;
Frequent Contributor
Posts: 102

Re: why would that not work?

It looks to me like you are opening the single quote around your variable names but forgetting to close it with the 'n.
N/A
Posts: 0

Re: why would that not work?

Here is the relevant part of the log with the error.


MLOGIC(INFO_GAIN_ATTRIB): %LET (variable name is LEFTY)
MLOGIC(INFO_GAIN_ATTRIB): %GLOBAL INFO_GAIN_&ATTRIB
MPRINT(INFO_GAIN_ATTRIB): data test;
MPRINT(INFO_GAIN_ATTRIB): set project.sort_by_D2_class end=last;
MPRINT(INFO_GAIN_ATTRIB): by D2_class;
MPRINT(INFO_GAIN_ATTRIB): if first.D2_class then do;
MPRINT(INFO_GAIN_ATTRIB): class_N+1;
MPRINT(INFO_GAIN_ATTRIB): N=1;
MPRINT(INFO_GAIN_ATTRIB): count_healthy=0;
MPRINT(INFO_GAIN_ATTRIB): count_parkinson=0;
MPRINT(INFO_GAIN_ATTRIB): if status=0 then count_healthy+1;
MPRINT(INFO_GAIN_ATTRIB): else if status=1 then count_parkinson+1;
MPRINT(INFO_GAIN_ATTRIB): end;
4 The SAS System 11:19 Wednesday, October 28, 2009

MPRINT(INFO_GAIN_ATTRIB): else if last.D2_class then do;
MPRINT(INFO_GAIN_ATTRIB): N+1;
MPRINT(INFO_GAIN_ATTRIB): if status=0 then count_healthy+1;
MPRINT(INFO_GAIN_ATTRIB): else if status=1 then count_parkinson+1;
MPRINT(INFO_GAIN_ATTRIB): p_healthy=count_healthy/N;
MPRINT(INFO_GAIN_ATTRIB): p_parkinson=count_parkinson/N;
MPRINT(INFO_GAIN_ATTRIB): if class_N=1 then do;

_
180
WARNING: The quoted string currently being processed has become more than 262 characters long.
You may have unbalanced quotation marks.
NOTE: Line generated by the macro variable "D2_CLASS_1".
88 info_1.423287 to 1.743867'n
__
49
ERROR 180-322: Statement is not valid or it is used out of proper order.

NOTE 49-169: The meaning of an identifier after a quoted string may change in a future SAS
release. Inserting white space between a quoted string and the succeeding
identifier is recommended.

NOTE: Line generated by the invoked macro "INFO_GAIN_ATTRIB".
88 ('info_&attrib', info_&attrib); call symput ('info_gain_&attrib', info_gain_&attrib)
________________________________
49
88 ! ; end; run;
MLOGIC(INFO_GAIN_ATTRIB): %PUT The information gain for attribute &attrib is
"&&info_gain_&attrib"
The information gain for attribute &attrib is "&&info_gain_&attrib"
MPRINT(INFO_GAIN_ATTRIB): 'info_1.423287 to 1.743867'n=
-p_healthy*log(p_healthy)/log(2)-p_parkinson*log(p_parkinson)/log(2);
&lefty.ratio_&&&attrib._1=N/195; end; else if class_N=2 then do;
&lefty.info_&&&attrib._2=
-p_healthy*log(p_healthy)/log(2)-p_parkinson*log(p_parkinson)/log(2);
&lefty.ratio_&&&attrib._2=N/195; end; else if class_N=3 then do;
&lefty.info_&&&attrib._3=
-p_healthy*log(p_healthy)/log(2)-p_parkinson*log(p_parkinson)/log(2);
&lefty.ratio_&&&attrib._3=N/195; end; else if class_N=4 then do;
&lefty.info_&&&attrib._4=
-p_healthy*log(p_healthy)/log(2)-p_parkinson*log(p_parkinson)/log(2);
&lefty.ratio_&&&attrib._4=N/195; end; end; else do; N+1;
if status=0 then count_healthy+1; else if status=1 then count_parkinson+1;
end; if last then do;
info_&attrib=&lefty.ratio_&&&attrib._1*&lefty.info_&&&attrib._1+&lefty.ratio_&&&attrib._2*&lefty
.info_&&&attrib._2+&lefty.ratio_&&&attrib._3*&lefty.info_&&&attrib._3+&lefty.ratio_&&&attrib._4*
&lefty.info_&&&attrib._4; info_gain_&attrib=&info_D - info_&attrib; call symput
('info_D2_class', info_&attrib); call symput ('info_gain_D2_class
MLOGIC(INFO_GAIN_ATTRIB): Ending execution.
N/A
Posts: 0

Re: why would that not work?

Mars:

You have run into a tokenization problem.
If you run a scanning electron microscope over the documentation you will discover that %str() quotes its argument at compile time. When the macro-generated text is fed to the SAS tokenizer, it is supposed to be unquoted.
The exact rules for when and whether this unquoting occurs are AFAIK undocumented and a trifle erratic.
The result is that instead of one token, a quoted string like 'Hello, world', the tokenizer sees four: ', Hello, world, and '. None of these are valid SAS statements. Hence the 180 errors.
The %unquote() function is the solution for this. It is possible that changing the initial assignment to:
%let lefty = %unquote(%str(%'));
will do it. If that does not work then use:

%unquote(&lefty.info_&&&attrib._1= -p_healthy*log(p_healthy)/log(2)-p_parkinson*log(p_parkinson)/log(2); &lefty.ratio_&&&attrib._1=N/195;)

and so on, or enclose the whole section of text in a single %unquote() function.
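
A minimal sketch of the idea, under stated assumptions: the macro name name_literal_demo, the parameter label, and the data set work.demo are hypothetical. The first assignment wraps the whole name literal, including the closing quote and the n, in a single %unquote() call. The second shows an alternative that avoids macro quoting altogether: SAS name literals may also be written with double quotes, and macro variables resolve inside double quotes. Because unquoting behavior can be erratic, check the MPRINT output in your own session rather than relying on this sketch.

```sas
options validvarname=any mprint;

/* Hypothetical demo macro: LABEL stands in for a class label */
/* such as "1.423287 to 1.743867".                            */
%macro name_literal_demo(label);
  %local lefty;
  %let lefty = %str(%');

  data work.demo;
    /* Option 1: build the complete name literal, opening quote, */
    /* name, closing quote and the trailing n, then unquote it   */
    /* so the tokenizer sees a single token.                     */
    %unquote(&lefty.ratio_&label.&lefty.n) = 0.25;

    /* Option 2: a double-quoted name literal needs no macro */
    /* quoting because &label resolves inside double quotes. */
    "info_&label"n = 1;
  run;
%mend name_literal_demo;

%name_literal_demo(1.4 to 1.7)
```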
PROC Star
Posts: 1,561

Re: why would that not work?

OldTimer is right. Unquoting is sometimes hit and miss.

The only way I could make this work is by using

proc sql noprint;
select distinct trim(&attrib) into :&attrib._1 - :&attrib._4 from test;
quit;

and then

%unquote(&lefty.ratio_&&&attrib._1&lefty.n)=N/195;

I am unsure why you'd want to carry these hideous names in the data step instead of calling the new variables INFO1-4 and RATIO1-4 though.

This would also allow you to write
RATIO[CLASS_N]=N/195;
instead of doing successive tests.
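
A rough sketch of what the array version could look like, not the original poster's macro. Everything here is an assumption made for illustration: the data set names work.sample and work.info, the variable names, the shortened class labels, and the use of NOBS= in place of the hard-coded 195. It assumes at most four classes and input that is already sorted by the class variable.

```sas
/* Hypothetical sample data: comma-delimited so the blanks */
/* inside each class label are kept.                       */
data work.sample;
  infile datalines dlm=',';
  input col1 :$20. status;
  datalines;
1.4 to 1.7,0
1.4 to 1.7,1
1.8 to 2.3,1
1.8 to 2.3,1
;

data work.info;
  set work.sample end=last nobs=total;
  by col1;

  /* Plain numbered variables replace the long name literals */
  array info[4]  info1-info4;
  array ratio[4] ratio1-ratio4;
  retain info1-info4 ratio1-ratio4;

  /* Start a new class: bump the class index, reset counters */
  if first.col1 then do;
    class_n + 1;
    n = 0;
    count_healthy = 0;
    count_parkinson = 0;
  end;

  /* Sum statements retain their variables across rows */
  n + 1;
  count_healthy   + (status = 0);
  count_parkinson + (status = 1);

  /* End of a class: entropy in bits, skipping p = 0 terms */
  if last.col1 then do;
    p_healthy   = count_healthy / n;
    p_parkinson = count_parkinson / n;
    info[class_n] = 0;
    if p_healthy > 0 then
      info[class_n] = info[class_n] - p_healthy * log2(p_healthy);
    if p_parkinson > 0 then
      info[class_n] = info[class_n] - p_parkinson * log2(p_parkinson);
    ratio[class_n] = n / total;
  end;

  /* End of data: weighted sum over the classes seen */
  if last then do;
    info_attrib = 0;
    do i = 1 to class_n;
      info_attrib + ratio[i] * info[i];
    end;
    call symputx('info_attrib', info_attrib);
    output;
  end;
run;

%put info_attrib=&info_attrib;
```

Indexing the arrays by class_n removes the chain of IF class_N=1 ... ELSE IF class_N=4 tests, and no quoted variable names are needed at all.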