Re: how to program like this? - SAS Support Communities

Fluorite | Level 6

Fluorite | Level 6

how to program like this?

Posted 08-29-2012 07:07 AM (3485 views)

Is there one function which can return a value stored in the next observation on the opposite of LAG function ?For example,

data have;

input x;

datalines;

1

2

3

4

5

6

7

8

9

10

;

run;

I want to assign the average value of the next 5 observations to one new variable. How to achieve this? Thanks in advance.

21 REPLIES 21

Jade | Level 19

Jade | Level 19

Re: how to program like this?

Posted 08-29-2012 08:00 AM (3081 views) | In reply to TomiKong

How big is your dataset? If it is not horribly long, try PROC SORT with a descending option and an out=newdata, and then apply the mean function to the lagged variables in newdata. I know there was something about "look ahead" functions in a thread recently, but I have made this method work before.

Steve Denham

Opal | Level 21

Opal | Level 21

Re: how to program like this?

Posted 08-29-2012 08:29 AM (3081 views) | In reply to TomiKong

What do you want to assign to records 6 thru 10?

The look forward method that Steve mentioned may have been the one that uses multiple set statements. e.g.:

data want;

set have;

set have ( firstobs = 2 keep = x rename = (x = _x1) )

have ( obs = 1 drop = _all_ );

set have ( firstobs = 3 keep = x rename = (x = _x2) )

have ( obs = 1 drop = _all_ )

have ( obs = 1 drop = _all_ );

set have ( firstobs = 4 keep = x rename = (x = _x3) )

have ( obs = 1 drop = _all_ )

have ( obs = 1 drop = _all_ )

have ( obs = 1 drop = _all_ );

set have ( firstobs = 5 keep = x rename = (x = _x4) )

have ( obs = 1 drop = _all_ )

have ( obs = 1 drop = _all_ )

have ( obs = 1 drop = _all_ )

have ( obs = 1 drop = _all_ );

set have ( firstobs = 6 keep = x rename = (x = _x5) )

have ( obs = 1 drop = _all_ )

have ( obs = 1 drop = _all_ )

have ( obs = 1 drop = _all_ )

have ( obs = 1 drop = _all_ )

have ( obs = 1 drop = _all_ );

mean_x=mean(of _x:);

run;

I think that would be the most efficient method. Conversely, if you have proc expand, you could use it to expand the dataset, and then use another datastep to calculate the mean. e.g.:

proc expand data=have out=want2 (drop=time);

convert x=x1 / method=none transformout=(lead 1);

convert x=x2 / method=none transformout=(lead 2);

convert x=x3 / method=none transformout=(lead 3);

convert x=x4 / method=none transformout=(lead 4);

convert x=x5 / method=none transformout=(lead 5);

run;

PROC Star

Re: how to program like this?

Posted 08-29-2012 09:24 AM (3081 views) | In reply to TomiKong

OK, here's my shot at it. I assumed that when we get to observation 6, and there are only 4 foward-looking observations, that we should take the mean of those 4:

data test;

do i=1 to 10;

output;

end;

run;

data test2;

set test nobs=_nobs_;

array forward {0:4} forward1-forward5;

retain forward1-forward5;

if _n_=1 then do _j_=2 to 5;

set test (rename=(i=forward_value)) point=_j_;

forward{mod(_j_,5)}=forward_value;

end;

if _n_+5 <= _nobs_ then do;

_j_=_n_+5;

set test (rename=(i=forward_value)) point=_j_;

forward{mod(_n_,5)}=forward_value;

end;

else forward{mod(_n_,5)}=.;

forward_5_mean = mean(of forward{*});

put _all_;

run;

Lapis Lazuli | Level 10

Lapis Lazuli | Level 10

Re: how to program like this?

Posted 08-29-2012 10:28 AM (3081 views) | In reply to TomiKong

data have;
do var=1 to 10;
output;
end;
run;
data temp;
array v(0:4) _temporary_;
set have;
v(mod(_n_,5))=var;
if _n_>=6 then do; mean_5=mean( of v(*)); output;end;
keep mean_5;
run;
data want;
merge have temp;
proc print;run;
obs var mean_5

                                         1      1       4
                                         2      2       5
                                         3      3       6
                                         4      4       7
                                         5      5       8
                                         6      6       .
                                         7      7       .
                                         8      8       .
                                         9      9       .
                                        10     10       .

Onyx | Level 15

Onyx | Level 15

Re: how to program like this?

Posted 08-29-2012 01:03 PM (3081 views) | In reply to TomiKong

Shamelessly stealing Linlin' idea, but modified to finish in one data step:

data have;

do x=1 to 10;

output;

end;

run;

data want (drop=x rename=y=x) ;

array nx5(0:5) _temporary_;

set have end=last nobs=nobs;

nx5(mod(_n_,6))=x;

if _n_>5 then do;

y=nx5(mod(_n_-5,6));

avg=(sum(of nx5(*))-y)/(n(of nx5(*))-1);

output;

end;

if last then do ;

do _n_=nobs+1 to nobs+5;

y=nx5(mod(_n_-5,6));

call missing(nx5(mod(_n_-5,6)),nx5(mod(_n_-6,6)));

avg=mean(of nx5(*));

output;

end;

end;

run;

proc print;run;

Haikuo

Rhodochrosite | Level 12

Rhodochrosite | Level 12

Re: how to program like this?

Posted 08-29-2012 06:36 PM (3081 views) | In reply to TomiKong

hi ... another idea

though it's grossly inefficient to keep reading five new values from the data set over-and-over again,

it took < 1.5 seconds to process a million observations (on my "not so fast" PC)

data want (keep=x mean);

set have nobs=obs;

if (obs - _n_) ge 5 then do;

do nrec=_n_ + 1 to _n_ + 5;

set have (rename=(x=y)) point=nrec;

five = sum(five, y);

end;

mean = five / 5;

end;

else call missing (mean);

run;

Opal | Level 21

Opal | Level 21

Re: how to program like this?

Posted 08-29-2012 06:59 PM (3081 views) | In reply to MikeZdeb

Mike,

Since this is a more automated version of the code that I originally proposed (okay .. I stole it from Howard's sasCommunity.or Look Ahead Look Back article), I think that you really want to include a keep option in the additional set statements.

Rhodochrosite | Level 12

Rhodochrosite | Level 12

Re: how to program like this?

Posted 08-29-2012 07:05 PM (3081 views) | In reply to art297

hi Art ... I was assuming the one variable data set that was in the original question

Lapis Lazuli | Level 10

Lapis Lazuli | Level 10

Re: how to program like this?

Posted 08-29-2012 09:31 PM (3081 views) | In reply to MikeZdeb

Hi Mike,

If you are going to try Astounding's code you have to try my code(I learned from Ksharp) as well:smileysilly:.

PROC Star

Re: how to program like this?

Posted 08-29-2012 08:20 PM (3083 views) | In reply to MikeZdeb

Mike,

Did you try my version as well? It looks longer, but it reads even less data. It recognizes that 4 of the 5 records that you are reading with point= don't have to be read. They were just read on the previous observation, and can be retained instead.

Just remember to add keep= , and remove the put statement at the end.

Astounding

PROC Star

Re: how to program like this?

Posted 08-30-2012 02:11 PM (3083 views) | In reply to Astounding

OK, I found some time. I tested my code vs. Mike's and found that mine took less than 50% of the time.

With apologies, I didn't test other approaches but I know from past experience that MERGE takes much longer than SET.

Rhodochrosite | Level 12

Rhodochrosite | Level 12

Re: how to program like this?

Posted 08-30-2012 04:20 PM (3083 views) | In reply to Astounding

hi ... I did say "though it's grossly inefficient ..."

so I modified my posting to do the same thing as Astounding's (his better idea of just remove one value and add one value to the total of X across five observations) ...

data want (keep=x mean);

* add first five values;

do nrec=1 to 5;

set have nobs=obs;

five = sum(five,x);

end;

* drop one value, add one value;

do nrec=6 by 1 until(done);

set have end=done;

set have (keep=x rename=(x=y)) point=nrec;

five = five - x + y;

mean = five / 5;

if nrec gt obs then call missing(mean);

output;

end;

stop;

run;

I tried the above versus Astounding's code with various size data sets (up to 1 million obs) and they take the same amount of time (with 1 million observations, each takes about 0.5 seconds of CPU time)

I also tried Linlin's and Haikuo's with the 1 million obs ... both were about the same amount of time as above, 0.5 seconds

last, I tried KSharp's really elegant looking SQL code (that really is clever), stopping at 10,000 obs since even that takes about 11 seconds of CPU time

PROC Star

Re: how to program like this?

Posted 08-30-2012 04:48 PM (3083 views) | In reply to MikeZdeb

Mike,

I like your new structure ... much easier to read than mine.

A quick question about testing Linlin's code ... it looks like you were able to use this syntax on a temporary array:

mean (of v(*))

I got an error trying that, but then I'm using 9.1. I guess a later release must have added functions across temporary arrays.

Thanks for testing and posting.

Rhodochrosite | Level 12

Rhodochrosite | Level 12

Re: how to program like this?

Posted 08-30-2012 05:18 PM (3083 views) | In reply to Astounding

hi ... correct, I just tried that in V9.1 and got ...

3 data _null_;

4 array v(5) _temporary_ (1 2 3 4 5);

5 mean = mean(of v(*));

ERROR: The ARRAYNAME

specification requires a variable based array.

6 put mean=;

7 run;

one reason to use that temporary array in Linlin's code is that such arrays are automatically retained

in V9.1, you can change that temporary array to a variable based array and add a RETAIN statement

then Linlin's code will work

data temp;

array v(0:4);

retain v;

set have;

v(mod(_n_,5))=var;

if _n_>=6 then do; mean_5=mean( of v(*)); output;end;

keep mean_5;

run;

Discussion stats

21 replies
‎08-29-2012 07:07 AM
3486 views
8 likes
9 in conversation