BookmarkSubscribeRSS Feed
Marilyn2015
Calcite | Level 5

I have a problem with lag function in do loop. I want to get lag1, lag2, lag3,...,lagn but without using macro so I tried to do it in this way but it doesn't work.

retain S 0;

do i=1 to 10;

S=S+lag(x);

end;

Has anybody any solutions?

10 REPLIES 10
Steelers_In_DC
Barite | Level 11

can you provide a small example of data that you have and data that you are trying to get?

Marilyn2015
Calcite | Level 5

data example;

input id price;

datalines;

1 437.83

2 458.5

3 457

4 456.98

5 459.11

6 453.91

7 444.05

8 456.47

9 463.25

10 474

;

run;

and I want to count WMA(Weigthed Moving Average) of prices. It looks like this for period n:

WMA=(n*Price+(n-1)*lag(Price)+...+1*lag(n-1)(Price))/(1+...+n).

ballardw
Super User

Do not use + in these calculations if you want the partial moving average for the first records as lag(n), in your notation, will not have a value until record n+1. Any calculation using + or *with missing will yield missing. So use SUM.

Astounding
PROC Star

Just to clarify, then, please confirm ...

The WMA calculation should use a different number of terms on each observation with no limit on the maximum number of terms?

The calculations should never start over mid-stream when some other variable (not shown in the example) changes?

Thanks!

Astounding
PROC Star

Well, it's not clear what you are trying to accomplish here.  But it is likely that macro language would not be involved.  If you want separate values for lag1, ..., lagn, just code it:

data want;

   set have;

   lag1 = lag1(x);

   lag2 = lag2(x);

   ...

   lagn = lagn(x);

run;

If you want the sum of the prior 10 values of x, you only need to use two lag functions:

data want;

   set have;

   back1 = lag(x);

   back11 = lag11(x);

   S + back1;

   if back11 > . then S = S - back11;

run;

You might have to tweak the logic, depending on the result you want for the first 10 observations.

if it's something else entirely that you are seeking, you'll have to elaborate.

Good luck.

ballardw
Super User

If you need to do something with a parameter to pick which lag you might need to look into SAS/ IML where the data is a matrix.

The other solution is to create all of the lagged variables you need:

lagx1= lag(x);

lagx2= lag2(x);

etc.

array lags lagx1 - lagx10;

do I= 1 to 10;

S= sum(S,lags);

end;

drop lagx: ;

I use sum as for the first record Lagx1 to lagx10 will be missing, for the second record lag2 to lag10 will be missing. So using + you wouldn't have gotten any values accumulated for S until the 11th record because + missing yields missing.

user24feb
Barite | Level 11

It seems that you would like to calculate some kind if simple moving average (as Astounding described). However, if you simply would like to get the "i" attached to the lag function, you could try call execute:

Data A (Drop=i);
  Do i=1 To 100;
    x=1;
    Output;
  End;
Run;

Data _NULL_; 
  Call Execute ('Data B;');
  Call Execute ('  Set A;');
  Do i=1 To 10;
    Call Execute ('  s=Sum(s,lag'!!Compress(Put(i,2.))!!'(x));');
  End;
  Call Execute ('  Output;');
  Call Execute ('Run;');
Run;

Haikuo
Onyx | Level 15

If you have SAS/ETS, PROC EXPAND would be the first place you want to take a look, it may not get what you want at one shot though.

The core of your issue, I suppose, is that you want to attain a decent amount of flexibility/ scalability while not to use Macro Code-Gen to jam your SAS code Smiley Happy? LAG() is nothing but a queue function, there are other ways to realize queues without using LAG(), one of which is Hash. Although one can construct a dynamic Hash or Array at the size of your lagging range, it is heck lot easier to dump everything into the Hash, with a key that maps data step sequenced iterations.

data example;

     input id price;

     n=_n_;

     datalines;

1 437.83

2 458.5

3 457

4 456.98

5 459.11

6 453.91

7 444.05

8 456.47

9 463.25

10 474

;

data want;

     retain _lg 4;/*the lagging range*/

     if _n_=1 then

           do;

                dcl hash h(dataset:'example (keep=n price )', ordered:'a');

                h.definekey('n');

                h.definedata('price');

                h.definedone();

                  call missing (n,price);

           end;

     set example(keep=id);

     if _n_>=_lg then

           do;

                do _i=0 to _lg-1;

                     _rc=h.find(key:_n_-_i);

                     _nom+(_lg-_i)*price;

                     _denom+(_i+1);

                end;

                wma=_nom/_denom;

                output;

                call missing (_nom, _denom);

           end;

     drop n price _:;

run;

Haikuo
Onyx | Level 15

And here is a dynamic Hash approach, in case your computer memory is not large enough:

data example;

     input id price;

     datalines;

1 437.83

2 458.5

3 457

4 456.98

5 459.11

6 453.91

7 444.05

8 456.47

9 463.25

10 474

;

data want_dym;

     retain _lg 4;/*the lagging range, this maps to the 'n-1' in LAGn()*/

     if _n_=1 then

           do;

                dcl hash h(ordered:'a');

                h.definekey('_n_');

                h.definedata('price');

                h.definedone();

                dcl hiter hi('h');

           end;

     set example;

     _rc=h.add();

     if _n_>=_lg then

          do;

                do _i=0 to _lg-1;

                     _rc=h.find(key:_n_-_i);

                     _nom+(_lg-_i)*price;

                     _denom+(_i+1);

                end;

                wma=_nom/_denom;

                output;

                call missing (_nom, _denom);

                _rc=h.remove(key:_n_-_lg+1);

           end;

     drop price _:;

run;

Ksharp
Super User

SQL can do this:

data example;
input id price;
datalines;
1 437.83
2 458.5
3 457
4 456.98
5 459.11
6 453.91
7 444.05
8 456.47
9 463.25
10 474
;
run;
%let lag=10 ;
proc sql;
 create table want as
  select *,(select sum(id*price)/sum(id) from example where id between a.id-&lag and a.id) as wma
   from example as a;
quit;

Xia Keshan

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 2700 views
  • 0 likes
  • 7 in conversation