I'm trying to understand how PROC EXPAND works and why one calculation isn't working the way I expect.
I don't understand why the third one doesn't work the way as expected.
The first two calculations are correct, but LEAD doesn't generate the data expected.
From this original discussion:
http://stackoverflow.com/q/42395126/1919583
*create sample data to work with;
data random;
call streaminit(25);
do date='01Jan2016'd to '31Dec2016'd;
x=round(Rand('normal', 100, 15), 0.01);
output;
format date date9.;
end;
run;
*Modified calculation;
proc expand data=random out=want;
id date;
convert x= x_movSum /transformout = (movsum 10 trimleft 9); /*This is the correct calculation expected*/
convert x = x_movSumFIRST / transformout = (reverse movsum 10 trimleft 9 reverse);
convert x = x_movSumWRONG /transformout = (lead 10 reverse movesum 10 reverse); /*suggest via Tom*/
run;
Hello -
If I understand your question correctly you are mostly wondering about the effect of "lead 10", correct?
proc expand data=random out=want;
id date;
convert x = x_movSumWRONG /transformout = (lead 10 reverse movesum 10 reverse); /*suggest via Tom*/
run;
Your data (random) looks like this - leaving out some data in the middle:
| date | x |
| 1-Jan-16 | 114.68 |
| 2-Jan-16 | 83.68 |
| 3-Jan-16 | 111.5 |
| 4-Jan-16 | 86.82 |
| 5-Jan-16 | 103.46 |
| 6-Jan-16 | 92.85 |
| 7-Jan-16 | 88.22 |
| 8-Jan-16 | 98.48 |
| 9-Jan-16 | 99.5 |
| 10-Jan-16 | 132.27 |
| 11-Jan-16 | 87.05 |
| 12-Jan-16 | 99.98 |
| 13-Jan-16 | 75.69 |
| … | … |
| 20-Dec-16 | 93.67 |
| 21-Dec-16 | 114.5 |
| 22-Dec-16 | 94.1 |
| 23-Dec-16 | 96.01 |
| 24-Dec-16 | 104.5 |
| 25-Dec-16 | 104.69 |
| 26-Dec-16 | 116.58 |
| 27-Dec-16 | 98.41 |
| 28-Dec-16 | 103.41 |
| 29-Dec-16 | 110.42 |
| 30-Dec-16 | 90.44 |
| 31-Dec-16 | 88.62 |
If you decompose your "wrong" statement into:
proc expand data=random out=want;
id date;
convert x = x_lead10 /transformout = (lead 10);
convert x = x_reverse /transformout = (lead 10 reverse);
convert x = x_movsum10 /transformout = (lead 10 reverse movsum 10);
convert x = x_reverseII /transformout = (lead 10 reverse movsum 10 reverse);
run;
you will see these columns in WANT.
In my mind this is the expected behaviour - it is just not the same what you'd like to archive with:
convert x= x_movSum /transformout = (movsum 10 trimleft 9);
Hope this makes sense,
Udo
Hello -
If I understand your question correctly you are mostly wondering about the effect of "lead 10", correct?
proc expand data=random out=want;
id date;
convert x = x_movSumWRONG /transformout = (lead 10 reverse movesum 10 reverse); /*suggest via Tom*/
run;
Your data (random) looks like this - leaving out some data in the middle:
| date | x |
| 1-Jan-16 | 114.68 |
| 2-Jan-16 | 83.68 |
| 3-Jan-16 | 111.5 |
| 4-Jan-16 | 86.82 |
| 5-Jan-16 | 103.46 |
| 6-Jan-16 | 92.85 |
| 7-Jan-16 | 88.22 |
| 8-Jan-16 | 98.48 |
| 9-Jan-16 | 99.5 |
| 10-Jan-16 | 132.27 |
| 11-Jan-16 | 87.05 |
| 12-Jan-16 | 99.98 |
| 13-Jan-16 | 75.69 |
| … | … |
| 20-Dec-16 | 93.67 |
| 21-Dec-16 | 114.5 |
| 22-Dec-16 | 94.1 |
| 23-Dec-16 | 96.01 |
| 24-Dec-16 | 104.5 |
| 25-Dec-16 | 104.69 |
| 26-Dec-16 | 116.58 |
| 27-Dec-16 | 98.41 |
| 28-Dec-16 | 103.41 |
| 29-Dec-16 | 110.42 |
| 30-Dec-16 | 90.44 |
| 31-Dec-16 | 88.62 |
If you decompose your "wrong" statement into:
proc expand data=random out=want;
id date;
convert x = x_lead10 /transformout = (lead 10);
convert x = x_reverse /transformout = (lead 10 reverse);
convert x = x_movsum10 /transformout = (lead 10 reverse movsum 10);
convert x = x_reverseII /transformout = (lead 10 reverse movsum 10 reverse);
run;
you will see these columns in WANT.
In my mind this is the expected behaviour - it is just not the same what you'd like to archive with:
convert x= x_movSum /transformout = (movsum 10 trimleft 9);
Hope this makes sense,
Udo
Makes sense, thanks Udo 🙂
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.