I'm hoping someone can help me debug this code (in SAS 9.3).
I want to do a series of correlations between pairs of variables (ctfr1915-ctfr1974 and ratesemf_1992-ratesemf_2014) for multiple years using a third variable (mage1915-mage1974) to make the link. For example, mage1970=29 so I want to correlate ctfr1970 and ratesemf_1999 (1970+29=1999).
I've been using the code below and it is obviously getting hung up on the 5th line because ma is a character. How can I pass the numeric value so that it will put whatever the value of ma1965 is into the code and give me in the index year for the second variable (ratesemf_)? I've also pasted the log output below.
Thank you!
options SYMBOLGEN MPRINT;
%macro cohhous(fert, house,sy, ey);
%do yf=&sy. %to &ey.;
%let ma=mage&sy.;
%let yh=%eval(ma+&sy.);
proc corr data = temp1 outp=out&yf. ;
var &fert.&yf. &house.&yh.;
data outb&yf.;
set out&yf.;
corr=&fert.&yf.;
year=&yf.;
keep year corr ;
if _n_=5;
run;
%end;
%mend;
run;
%cohhous (ctfr,ratesemf_,1965,1974);
run;
671 options SYMBOLGEN MPRINT;
672 %macro cohhous(fert, house,sy, ey);
673 %do yf=&sy. %to &ey.;
674 %let ma=mage&sy.;
675 %let yh=%eval(ma+&sy.);
676 proc corr data = temp1 outp=out&yf. ;
677 var &fert.&yf. &house.&yh.;
678 data outb&yf.;
679 set out&yf.;
680 corr=&fert.&yf.;
681 year=&yf.;
682 keep year corr ;
683 if _n_=5;
684 run;
685 %end;
686 %mend;
687 run;
688
689 %cohhous (ctfr,ratesemf_,1965,1974);
SYMBOLGEN: Macro variable SY resolves to 1965
SYMBOLGEN: Macro variable EY resolves to 1974
SYMBOLGEN: Macro variable SY resolves to 1965
SYMBOLGEN: Macro variable SY resolves to 1965
ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric
operand is required. The condition was: ma+1965
ERROR: The macro COHHOUS will stop executing.
690 run;
It pains me to see folks generating macro loops when simple programming steps are easier to program and debug. And often faster. If I understand your task correctly, that's what I think is happening to you.
In a data step (data NEED), you can use array declarations, where the uppoer and lower bounds of the arrays are 1915:1974 and 1992:2014, as needed. This program reads dataset HAVE, takes your list of CTFR variables (ctrf1915-ctrf1974), uses the corresponding MAGE value to pick out the needed RATES variable, and makes two new variables: CTRF_VAL and RATESEMF_VAL, for each iteration from 1915 through 1974. So dataset NEED will have 60 records with two variables generated from 1 record in dataset HAVE with 120 variables (ctrf1914-ctrf1974 and ratesemf_1992-ratesemf_2014). NEED also has the years for CTRF and RATESEMF and also the MAGE value - identifying which variable pair is produced.
Then sort NEED by the CTRF year and RATESEMF year, and do a proc corr with a "by ctrf_year ratesemf_year".
The "trick" here is to define arrays such that the first element is element number 1915 and the last is number 1974 for CTRF and MAGE. And for the RATES array the lower and upper bounds are 1992 and 2014.
If I understand your problem correctly, this program should do what you need:
Correction: the statement
ratesemf_val=rat{mage_val};
should be
ratesemf_val=rat{ratesemf_year};
data need (keep=ctrf_val ctrf_year ratesemf_val ratesemf_year mage_val);
set have;
array ctf {1915:1974} ctrf1915-ctfr1974; /* CTFR vars to correlate */
array rat {1992:2014} ratesemf_1992-ratesemf_2014; /* RATE vars to search for the correlated value*/
array mag {1915:1974} mage1915-mage1974; /* Offset to find the needed RATE var */
do ctrfyear=lbound(ctf) to hbound(ctf);
ctrf_val=ctf{ctrfyear};
mage_val=mag{ctrfyear};
ratesemf_year=ctrfyear+mage_val;
ratesemf_val=rat{mage_val};
output;
end;
run;
proc sort data=need;
by ctrf_year ratesemf_year;
run;
proc corr data=need noprint out=ctf_rat_corrs_by_offset;
by ctrf_year ratesemf_year;
var ctrf_val;
with ratesemf_val;
run;
It's not clear where the values for &YH are supposed to come from. If they are in a SAS data set, you need to replace the %LET statement with a DATA step. Within a DATA step, CALL SYMPUTX allows you to create a macro variable by copying over the values found within a SAS data set.
If the value for &YH is not supposed to come from a SAS data set, then what would be the proper value?
I guess there are (at least) two problems with the code then.
&yh is what I'm hoping to generate by adding the value of mage to the suffix of ctfr. So in the case when I have the variables ctfr1970 and mage1970 which has the value 29 I want to call ratesemf_1999 I was hoping to create a variable, yh, that was equal to 1970+29. Does that make sense? (I hadn't even realized this error in the code because I couldn't get yh to calculate).
Where does the "value of mage" come from?
mage1915-mage1974 are all in the dataset.
%macro cohhous(fert, house,sy, ey);
%do yf=&sy. %to &ey.;
%let ma=mage&sy.;
%let yh=%eval(ma+&sy.); /* This statement happens outside of any data or proc step! */
proc corr data = temp1 outp=out&yf. ;
var &fert.&yf. &house.&yh.;
data outb&yf.;
set out&yf.;
corr=&fert.&yf.;
year=&yf.;
keep year corr ;
if _n_=5;
run;
%end;
%mend;
run;
As I told you before, you CANNOT access data step variables in macro language.
I suspect that you may be better off looking at listing all of variable as VAR and all the others as WITH variables and sending the output to a single data set. Then you will have one set in the values of _name_ and the others in colum headings for the correlations. Then pull out the specific pairs you are concerned with.
With this approach I still need to find a way to get the correct index variable for the second group of variables (e.g. the WITH variables). Any advice or sample code to help with that?
It pains me to see folks generating macro loops when simple programming steps are easier to program and debug. And often faster. If I understand your task correctly, that's what I think is happening to you.
In a data step (data NEED), you can use array declarations, where the uppoer and lower bounds of the arrays are 1915:1974 and 1992:2014, as needed. This program reads dataset HAVE, takes your list of CTFR variables (ctrf1915-ctrf1974), uses the corresponding MAGE value to pick out the needed RATES variable, and makes two new variables: CTRF_VAL and RATESEMF_VAL, for each iteration from 1915 through 1974. So dataset NEED will have 60 records with two variables generated from 1 record in dataset HAVE with 120 variables (ctrf1914-ctrf1974 and ratesemf_1992-ratesemf_2014). NEED also has the years for CTRF and RATESEMF and also the MAGE value - identifying which variable pair is produced.
Then sort NEED by the CTRF year and RATESEMF year, and do a proc corr with a "by ctrf_year ratesemf_year".
The "trick" here is to define arrays such that the first element is element number 1915 and the last is number 1974 for CTRF and MAGE. And for the RATES array the lower and upper bounds are 1992 and 2014.
If I understand your problem correctly, this program should do what you need:
Correction: the statement
ratesemf_val=rat{mage_val};
should be
ratesemf_val=rat{ratesemf_year};
data need (keep=ctrf_val ctrf_year ratesemf_val ratesemf_year mage_val);
set have;
array ctf {1915:1974} ctrf1915-ctfr1974; /* CTFR vars to correlate */
array rat {1992:2014} ratesemf_1992-ratesemf_2014; /* RATE vars to search for the correlated value*/
array mag {1915:1974} mage1915-mage1974; /* Offset to find the needed RATE var */
do ctrfyear=lbound(ctf) to hbound(ctf);
ctrf_val=ctf{ctrfyear};
mage_val=mag{ctrfyear};
ratesemf_year=ctrfyear+mage_val;
ratesemf_val=rat{mage_val};
output;
end;
run;
proc sort data=need;
by ctrf_year ratesemf_year;
run;
proc corr data=need noprint out=ctf_rat_corrs_by_offset;
by ctrf_year ratesemf_year;
var ctrf_val;
with ratesemf_val;
run;
%let ma=mage&sy.;
%let yh=%eval(ma+&sy.);
In the %eval, you use the text ma. I guess you rather wanted to use the macro variable reference &ma instead.
But that doesn't make it any better.
In the %let, you assign the text string (assuming that &sy=1965) mage1965 to ma, which once again is not a numerical value that the %eval can deal with.
Make sure that you reference macro variables correctly, and keep in mind that data step variables are never present when macro triggers are resolved.
Thank you everyone. I had been hoping macro language would have made the task possible without brute-forcing it in a datastep, but it seems the datastep solution offered is the fastest way for me to get this code working.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.