BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Sharpshooter
Calcite | Level 5


Hi,

I'm having real problems correctly creating a value using Lag through a macro.Below is the code and some test data.

I want to lag the response_sum field by the value in the field "for_m3". Using line 14 as an example, I want to resolve to a value of lag2 (which should equal 3426). But instead it puts the value 4001. The log shows this:

MPRINT(LAGIT):   response_lag3=lag2(response_sum);

So I would reasonably expect the value in response_lag3 to be 3426 - but it's 4001


data test_data;
   length month 3. response_sum 8. for_m3 8. ;
  infile datalines delimiter=',';
   input month response_sum for_m3 ;
   datalines;
1,4877,0
2,8388,0
3,8766,0
4,8491,3
5,6953,3
6,4823,3
7,5761,3
8,7638,3
9,5613,3
10,4258,3
11,4001,3
12,3426,3
13,1414,2
14,1347,2
15,1117,0
16,663,3
;
run;

proc sort test_data;
by month;
run;

%macro lagit(lag_num);
%do b=1 %to &obs_for_macro.;
response_lag&lag_num.=lag&&for_m&lag_num._&b.(response_sum);
%end;

if for_m&lag_num.=0 then
  do;
   response_lag&lag_num.=0;
  end;

%mend ;

data _null_;
set test_data;
by month;
call symput("for_m3_"||left(_n_),compress(for_m3));
if last.month then call symput("obs_for_macro",left(_n_));
run;

%macro L3m();
data test_data_lag;
set test_data;
%lagit(3);

run;
%mend l3m;

%L3m();

Thanks in advance for your help.

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Macro is for generating code that is then compiled and run.  You cannot use it to reference values of dataset variables during run time.  Conditionally selecting a value from a list using the value of another variable as an index is why SAS has the ARRAY construct.

data want ;

  set ;

  array back (3) _temporary_;

  back(1) = lag(response_sum);

  back(2) = lag2(response_sum);

  back(3) = lag3(response_sum);

  if 1 <= for_m3 <=3 then lag_response_sum = back(for_m3);

  else lag_response_sum = 0;

run;

                response_              lag_response_

Obs    month       sum       for_m3         sum

  1       1        4877         0              0

  2       2        8388         0              0

  3       3        8766         0              0

  4       4        8491         3           4877

  5       5        6953         3           8388

  6       6        4823         3           8766

  7       7        5761         3           8491

  8       8        7638         3           6953

  9       9        5613         3           4823

10      10        4258         3           5761

11      11        4001         3           7638

12      12        3426         3           5613

13      13        1414         2           4001

14      14        1347         2           3426

15      15        1117         0              0

16      16         663         3           1414

View solution in original post

7 REPLIES 7
Amir
PROC Star

Hi,

Running your code (after coding the "data=" in the sort) showed:

MPRINT(L3M):   data test_data_lag;

MPRINT(L3M):   set test_data;

MPRINT(LAGIT):   response_lag3=lag0(response_sum);

MPRINT(LAGIT):   response_lag3=lag0(response_sum);

MPRINT(LAGIT):   response_lag3=lag0(response_sum);

MPRINT(LAGIT):   response_lag3=lag3(response_sum);

MPRINT(LAGIT):   response_lag3=lag3(response_sum);

MPRINT(LAGIT):   response_lag3=lag3(response_sum);

MPRINT(LAGIT):   response_lag3=lag3(response_sum);

MPRINT(LAGIT):   response_lag3=lag3(response_sum);

MPRINT(LAGIT):   response_lag3=lag3(response_sum);

MPRINT(LAGIT):   response_lag3=lag3(response_sum);

MPRINT(LAGIT):   response_lag3=lag3(response_sum);

MPRINT(LAGIT):   response_lag3=lag3(response_sum);

MPRINT(LAGIT):   response_lag3=lag2(response_sum);

MPRINT(LAGIT):   response_lag3=lag2(response_sum);

MPRINT(LAGIT):   response_lag3=lag0(response_sum);

MPRINT(LAGIT):   response_lag3=lag3(response_sum);

MPRINT(LAGIT):   if for_m3=0 then do;

MPRINT(LAGIT):   response_lag3=0;

MPRINT(LAGIT):   end;

MPRINT(L3M):  ;

MPRINT(L3M):   run;

I assume you just wanted one assignment to response_lag3 for each observation.

Each observation on your input data set passes through all of your data step statements, then the data step outputs the result and moves on to the next observation.

From what is shown, "response_lag3=lag3(response_sum)" is the last assignment used to assign a value to response_lag3, unless the if condition at the end is satisfied. That is why you are seeing 4001.

Regards,

Amir.

Message was edited by: Amir Malik - rephrased last sentence.

Astounding
PROC Star

Just reading the problem briefly, it looks like you are making this much too complicated.  Macro langauge shouldn't be needed.  LAG is more complicated than it looks, so just compute the lags and use IF/THEN to select what you need.  For example:

back1 = lag(response_sum);

back2 = lag2(response_sum);

back3 = lag3(response_sum);

drop back1 back2 back3;

Then follow up with something like:

if for_m3=3 then lagged_value = back3;

Is your objective really any more complex than that?

Patrick
Opal | Level 21

Doesn't feel like something you should have to use macro language for. If I've understood correctly what you're after then below code should do.

data test_data;
  length month 3. response_sum 8. for_m3 8.;
  infile datalines delimiter=',';
  input month response_sum for_m3;
  datalines;
1,4877,0
2,8388,0
3,8766,0
4,8491,3
5,6953,3
6,4823,3
7,5761,3
8,7638,3
9,5613,3
10,4258,3
11,4001,3
12,3426,3
13,1414,2
14,1347,2
15,1117,0
16,663,3
;
run;

data want;
  set test_data;
  by month;

  _select_row = _n_ - for_m3;
  if 1 <= _select_row <= _nobs then
    do;
      set test_data(keep=response_sum rename=(response_sum=response_lag)) point=_select_row nobs=_nobs;
    end;
  else call missing(response_lag);
run;

Tom
Super User Tom
Super User

Macro is for generating code that is then compiled and run.  You cannot use it to reference values of dataset variables during run time.  Conditionally selecting a value from a list using the value of another variable as an index is why SAS has the ARRAY construct.

data want ;

  set ;

  array back (3) _temporary_;

  back(1) = lag(response_sum);

  back(2) = lag2(response_sum);

  back(3) = lag3(response_sum);

  if 1 <= for_m3 <=3 then lag_response_sum = back(for_m3);

  else lag_response_sum = 0;

run;

                response_              lag_response_

Obs    month       sum       for_m3         sum

  1       1        4877         0              0

  2       2        8388         0              0

  3       3        8766         0              0

  4       4        8491         3           4877

  5       5        6953         3           8388

  6       6        4823         3           8766

  7       7        5761         3           8491

  8       8        7638         3           6953

  9       9        5613         3           4823

10      10        4258         3           5761

11      11        4001         3           7638

12      12        3426         3           5613

13      13        1414         2           4001

14      14        1347         2           3426

15      15        1117         0              0

16      16         663         3           1414

Sharpshooter
Calcite | Level 5

Hi all,

Thanks for your quick responses.

I think i have simplified my problem a little too much. In my original post i was useing these 2 variables as an exmaple: "response_sum" and  "for_m3". In reality, "response_sum" is 1 of 17 variables i need to create. and "form_m3" is 1 of 4 lag values (lag1, lag2, lag3 and lag12).

The conditional If statement works well for 1 lag for 1 variable but i was looking for something a little more succinct and compact. Here is the extended / complete macro code (apologies for not having the code toe create the test data).


%macro lagit(lag_num);
%do b=1 %to &obs_for_macro.;
mailed_lag&lag_num.=lag&&for_m&lag_num._&b.(mailed_sum);
response_lag&lag_num.=lag&&for_m&lag_num._&b.(response_sum);
conversion_lag&lag_num.=lag&&for_m&lag_num._&b.(conversion_sum);

medium_na_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_NA_sum);
medium_pc_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_PC_sum);
medium_letter_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_Letter_sum);
medium_email_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_Email_sum);
medium_SMS_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_SMS_sum);
medium_Call_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_Call_sum);
medium_Other_lag&lag_num.=lag&&for_m&lag_num._&b.(medium_Other_sum);

gender_male_lag&lag_num.=lag&&for_m&lag_num._&b.(gender_Male_sum);
gender_female_lag&lag_num.=lag&&for_m&lag_num._&b.(gender_Female_sum);
gender_unknown_lag&lag_num.=lag&&for_m&lag_num._&b.(gender_Unknown_sum);

Total_Pack_Cost_lag&lag_num.=lag&&for_m&lag_num._&b.(Total_Pack_Cost);
Conversion_lag&lag_num.=lag&&for_m&lag_num._&b.(Conversion_sum);
%end;
if for_m&lag_num.=0 then
  do;
   mailed_lag&lag_num.=0;
   response_lag&lag_num.=0;
   conversion_lag&lag_num.=0;

   medium_na_lag&lag_num.=0;
   medium_pc_lag&lag_num.=0;
   medium_letter_lag&lag_num.=0;
   medium_email_lag&lag_num.=0;
   medium_SMS_lag&lag_num.=0;
   medium_Call_lag&lag_num.=0;
   medium_Other_lag&lag_num.=0;
 
   gender_male_lag&lag_num.=0;
   gender_female_lag&lag_num.=0;
   gender_unknown_lag&lag_num.=0;

   Total_Pack_Cost_lag&lag_num.=0;
   Conversion_lag&lag_num.=0;
  end;
%mend;

%macro L3m();
data test_data_lag;
set test_data;
%lagit(1);
%lagit(2);
%lagit(3);
%lagit(12);
run;
%mend l3m;

%L3m();

Thanks again...

Patrick
Opal | Level 21

I believe you could use the code I've previously posted and simply add the additional variables to the rename statement.

As I understand you also in your real data there is still only one variable used to store how many rows backwards relative to the current row you have to read all these "lag" data from. So why not just use this variable and actually go and read the values from this row. In doing so you don't have to create all these lag() variables and you don't need to know in advance all the possible "lags".

Sharpshooter
Calcite | Level 5


Thanks for all the help. Finally settled on the following code:

%macro lag_months(time,var,var_new);
array back_&time._ (12) _temporary_ ;
back_&time._(1)=lag1(&var.);
back_&time._(2)=lag2(&var.);
back_&time._(3)=lag3(&var.);
back_&time._(4)=lag4(&var.);
back_&time._(5)=lag5(&var.);
back_&time._(6)=lag6(&var.);
back_&time._(7)=lag7(&var.);
back_&time._(8)=lag8(&var.);
back_&time._(9)=lag9(&var.);
back_&time._(10)=lag10(&var.);
back_&time._(11)=lag11(&var.);
back_&time._(12)=lag12(&var.);

if 1 <= for_m1 <= 1 then
do;
  &var_new._lag1 = back_&time._(for_m1);
end;
else &var_new._lag1 = 0;
if 1 <= for_m2 <= 2 then
do;
  &var_new._lag2 = back_&time._(for_m2);
end;
else &var_new._lag2 = 0;
if 1 <= for_m3 <= 3 then
do;
   &var_new._lag3 = back_&time._(for_m3);
end;
else &var_new._lag3 = 0;
if 1 <= for_m12 <= 12 then
do;
   &var_new._lag12 = back_&time._(for_m12);
end;
else &var_new._lag12 = 0;
%mend;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1751 views
  • 3 likes
  • 5 in conversation