Hi,
I'm having real problems correctly creating a value using Lag through a macro.Below is the code and some test data.
I want to lag the response_sum field by the value in the field "for_m3". Using line 14 as an example, I want to resolve to a value of lag2 (which should equal 3426). But instead it puts the value 4001. The log shows this:
MPRINT(LAGIT): response_lag3=lag2(response_sum);
So I would reasonably expect the value in response_lag3 to be 3426 - but it's 4001
data test_data;
length month 3. response_sum 8. for_m3 8. ;
infile datalines delimiter=',';
input month response_sum for_m3 ;
datalines;
1,4877,0
2,8388,0
3,8766,0
4,8491,3
5,6953,3
6,4823,3
7,5761,3
8,7638,3
9,5613,3
10,4258,3
11,4001,3
12,3426,3
13,1414,2
14,1347,2
15,1117,0
16,663,3
;
run;
proc sort test_data;
by month;
run;
%macro lagit(lag_num);
%do b=1 %to &obs_for_macro.;
response_lag&lag_num.=lag&&for_m&lag_num._&b.(response_sum);
%end;
if for_m&lag_num.=0 then
do;
response_lag&lag_num.=0;
end;
%mend ;
data _null_;
set test_data;
by month;
call symput("for_m3_"||left(_n_),compress(for_m3));
if last.month then call symput("obs_for_macro",left(_n_));
run;
%macro L3m();
data test_data_lag;
set test_data;
%lagit(3);
run;
%mend l3m;
%L3m();
Thanks in advance for your help.
Macro is for generating code that is then compiled and run. You cannot use it to reference values of dataset variables during run time. Conditionally selecting a value from a list using the value of another variable as an index is why SAS has the ARRAY construct.
data want ;
set ;
array back (3) _temporary_;
back(1) = lag(response_sum);
back(2) = lag2(response_sum);
back(3) = lag3(response_sum);
if 1 <= for_m3 <=3 then lag_response_sum = back(for_m3);
else lag_response_sum = 0;
run;
response_ lag_response_
Obs month sum for_m3 sum
1 1 4877 0 0
2 2 8388 0 0
3 3 8766 0 0
4 4 8491 3 4877
5 5 6953 3 8388
6 6 4823 3 8766
7 7 5761 3 8491
8 8 7638 3 6953
9 9 5613 3 4823
10 10 4258 3 5761
11 11 4001 3 7638
12 12 3426 3 5613
13 13 1414 2 4001
14 14 1347 2 3426
15 15 1117 0 0
16 16 663 3 1414
Hi,
Running your code (after coding the "data=" in the sort) showed:
MPRINT(L3M): data test_data_lag;
MPRINT(L3M): set test_data;
MPRINT(LAGIT): response_lag3=lag0(response_sum);
MPRINT(LAGIT): response_lag3=lag0(response_sum);
MPRINT(LAGIT): response_lag3=lag0(response_sum);
MPRINT(LAGIT): response_lag3=lag3(response_sum);
MPRINT(LAGIT): response_lag3=lag3(response_sum);
MPRINT(LAGIT): response_lag3=lag3(response_sum);
MPRINT(LAGIT): response_lag3=lag3(response_sum);
MPRINT(LAGIT): response_lag3=lag3(response_sum);
MPRINT(LAGIT): response_lag3=lag3(response_sum);
MPRINT(LAGIT): response_lag3=lag3(response_sum);
MPRINT(LAGIT): response_lag3=lag3(response_sum);
MPRINT(LAGIT): response_lag3=lag3(response_sum);
MPRINT(LAGIT): response_lag3=lag2(response_sum);
MPRINT(LAGIT): response_lag3=lag2(response_sum);
MPRINT(LAGIT): response_lag3=lag0(response_sum);
MPRINT(LAGIT): response_lag3=lag3(response_sum);
MPRINT(LAGIT): if for_m3=0 then do;
MPRINT(LAGIT): response_lag3=0;
MPRINT(LAGIT): end;
MPRINT(L3M): ;
MPRINT(L3M): run;
I assume you just wanted one assignment to response_lag3 for each observation.
Each observation on your input data set passes through all of your data step statements, then the data step outputs the result and moves on to the next observation.
From what is shown, "response_lag3=lag3(response_sum)" is the last assignment used to assign a value to response_lag3, unless the if condition at the end is satisfied. That is why you are seeing 4001.
Regards,
Amir.
Message was edited by: Amir Malik - rephrased last sentence.
Just reading the problem briefly, it looks like you are making this much too complicated. Macro langauge shouldn't be needed. LAG is more complicated than it looks, so just compute the lags and use IF/THEN to select what you need. For example:
back1 = lag(response_sum);
back2 = lag2(response_sum);
back3 = lag3(response_sum);
drop back1 back2 back3;
Then follow up with something like:
if for_m3=3 then lagged_value = back3;
Is your objective really any more complex than that?
Doesn't feel like something you should have to use macro language for. If I've understood correctly what you're after then below code should do.
data test_data;
length month 3. response_sum 8. for_m3 8.;
infile datalines delimiter=',';
input month response_sum for_m3;
datalines;
1,4877,0
2,8388,0
3,8766,0
4,8491,3
5,6953,3
6,4823,3
7,5761,3
8,7638,3
9,5613,3
10,4258,3
11,4001,3
12,3426,3
13,1414,2
14,1347,2
15,1117,0
16,663,3
;
run;
data want;
set test_data;
by month;
_select_row = _n_ - for_m3;
if 1 <= _select_row <= _nobs then
do;
set test_data(keep=response_sum rename=(response_sum=response_lag)) point=_select_row nobs=_nobs;
end;
else call missing(response_lag);
run;
Macro is for generating code that is then compiled and run. You cannot use it to reference values of dataset variables during run time. Conditionally selecting a value from a list using the value of another variable as an index is why SAS has the ARRAY construct.
data want ;
set ;
array back (3) _temporary_;
back(1) = lag(response_sum);
back(2) = lag2(response_sum);
back(3) = lag3(response_sum);
if 1 <= for_m3 <=3 then lag_response_sum = back(for_m3);
else lag_response_sum = 0;
run;
response_ lag_response_
Obs month sum for_m3 sum
1 1 4877 0 0
2 2 8388 0 0
3 3 8766 0 0
4 4 8491 3 4877
5 5 6953 3 8388
6 6 4823 3 8766
7 7 5761 3 8491
8 8 7638 3 6953
9 9 5613 3 4823
10 10 4258 3 5761
11 11 4001 3 7638
12 12 3426 3 5613
13 13 1414 2 4001
14 14 1347 2 3426
15 15 1117 0 0
16 16 663 3 1414
Hi all,
Thanks for your quick responses.
I think i have simplified my problem a little too much. In my original post i was useing these 2 variables as an exmaple: "response_sum" and "for_m3". In reality, "response_sum" is 1 of 17 variables i need to create. and "form_m3" is 1 of 4 lag values (lag1, lag2, lag3 and lag12).
The conditional If statement works well for 1 lag for 1 variable but i was looking for something a little more succinct and compact. Here is the extended / complete macro code (apologies for not having the code toe create the test data).
%macro lagit(lag_num);
%do b=1 %to &obs_for_macro.;
mailed_lag&lag_num.=lag&&for_m&lag_num._&b.(mailed_sum);
response_lag&lag_num.=lag&&for_m&lag_num._&b.(response_sum);
conversion_lag&lag_num.=lag&&for_m&lag_num._&b.(conversion_sum);
medium_na_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_NA_sum);
medium_pc_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_PC_sum);
medium_letter_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_Letter_sum);
medium_email_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_Email_sum);
medium_SMS_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_SMS_sum);
medium_Call_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_Call_sum);
medium_Other_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_Other_sum);
gender_male_lag&lag_num.=lag&&for_m&lag_num._&b.(gender_Male_sum);
gender_female_lag&lag_num.=lag&&for_m&lag_num._&b.(gender_Female_sum);
gender_unknown_lag&lag_num.=lag&&for_m&lag_num._&b.(gender_Unknown_sum);
Total_Pack_Cost_lag&lag_num.=lag&&for_m&lag_num._&b.(Total_Pack_Cost);
Conversion_lag&lag_num.=lag&&for_m&lag_num._&b.(Conversion_sum);
%end;
if for_m&lag_num.=0 then
do;
mailed_lag&lag_num.=0;
response_lag&lag_num.=0;
conversion_lag&lag_num.=0;
medium_na_lag&lag_num.=0;
medium_pc_lag&lag_num.=0;
medium_letter_lag&lag_num.=0;
medium_email_lag&lag_num.=0;
medium_SMS_lag&lag_num.=0;
medium_Call_lag&lag_num.=0;
medium_Other_lag&lag_num.=0;
gender_male_lag&lag_num.=0;
gender_female_lag&lag_num.=0;
gender_unknown_lag&lag_num.=0;
Total_Pack_Cost_lag&lag_num.=0;
Conversion_lag&lag_num.=0;
end;
%mend;
%macro L3m();
data test_data_lag;
set test_data;
%lagit(1);
%lagit(2);
%lagit(3);
%lagit(12);
run;
%mend l3m;
%L3m();
Thanks again...
I believe you could use the code I've previously posted and simply add the additional variables to the rename statement.
As I understand you also in your real data there is still only one variable used to store how many rows backwards relative to the current row you have to read all these "lag" data from. So why not just use this variable and actually go and read the values from this row. In doing so you don't have to create all these lag() variables and you don't need to know in advance all the possible "lags".
Thanks for all the help. Finally settled on the following code:
%macro lag_months(time,var,var_new);
array back_&time._ (12) _temporary_ ;
back_&time._(1)=lag1(&var.);
back_&time._(2)=lag2(&var.);
back_&time._(3)=lag3(&var.);
back_&time._(4)=lag4(&var.);
back_&time._(5)=lag5(&var.);
back_&time._(6)=lag6(&var.);
back_&time._(7)=lag7(&var.);
back_&time._(8)=lag8(&var.);
back_&time._(9)=lag9(&var.);
back_&time._(10)=lag10(&var.);
back_&time._(11)=lag11(&var.);
back_&time._(12)=lag12(&var.);
if 1 <= for_m1 <= 1 then
do;
&var_new._lag1 = back_&time._(for_m1);
end;
else &var_new._lag1 = 0;
if 1 <= for_m2 <= 2 then
do;
&var_new._lag2 = back_&time._(for_m2);
end;
else &var_new._lag2 = 0;
if 1 <= for_m3 <= 3 then
do;
&var_new._lag3 = back_&time._(for_m3);
end;
else &var_new._lag3 = 0;
if 1 <= for_m12 <= 12 then
do;
&var_new._lag12 = back_&time._(for_m12);
end;
else &var_new._lag12 = 0;
%mend;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.