Hello SAS community,
I want to calculate a rolling sum, which includes a variable that needs to be updated for each row.
The data I have is as follows:
id x z
1 -0.1 1000
1 0.2 .
1 0.3 .
2 -0.2 1000
2 0.1 .
2 -0.3 .
I need to fill in the missing values of Z as follows:
For the first row of each id, form a new variable y equal to z - 10 + 2*x (that is, 1000 - 10 + 2*x). Then fill the next missing value of z with the previous row's value of y, and compute y again from that z. The process starts afresh for id=2, and so on. The result I want is:
id x z y
1 -0.1 1000 1000-10+2*(-0.1)=989.8
1 0.2 989.8 989.8-10+2*0.2=980.2
1 0.3 980.2 970.6
2 -0.2 1000 1000-10+2*(-0.2)=989.6
2 0.1 989.6 979.8
2 -0.3 979.8 969.2
Thank you very much in advance.
Costas
data want;
set have;
retain y;
if missing(z) then z = y;
y = z - 10 + 2 * x;
run;
Result:
id x z y
1 -0.1 1000 989.8
1 0.2 989.8 980.2
1 0.3 980.2 970.8
2 -0.2 1000 989.6
2 0.1 989.6 979.8
2 -0.3 979.8 969.2
Thank you, @FloydNevseta this works perfectly.
The code provided by @ballardw, @Reeza, and @Kurt_Bremser also does what I want. It seems that the correct use of RETAIN is the key here.
As advised, in the future, I will post a SAS data set to make things easier.
Thank you all very much!
Use a RETAINed variable:
data want;
set have;
by id;
retain y;
if first.id
then y = z;
else z = y;
y = y - 10 + 2 * x;
run;
Untested, posted from my tablet.
This gives slightly different answers than yours, but I think this is the right idea:
data have;
input id $ x z;
cards;
1 -0.1 1000
1 0.2 .
1 0.3 .
2 -0.2 1000
2 0.1 .
2 -0.3 .
;;;;
data want;
set have;
by id;
retain y;
if first.id then y = z;
y = y-10+2*x;
run;
Results:
Obs id x z y
1 1 -0.1 1000 989.8
2 1 0.2 . 980.2
3 1 0.3 . 970.8
4 2 -0.2 1000 989.6
5 2 0.1 . 979.8
6 2 -0.3 . 969.2
If you use 980.2 - 10 + 2*x for the third row, you do not get 970.6; you get 970.8.
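The third-row arithmetic can be checked directly. This is a minimal sketch that hard-codes the values from the question (z = 980.2, x = 0.3) in a throwaway step; it should write y=970.8 to the log.

```sas
/* Check the third row for id=1 using the values from the question */
data _null_;
   z = 980.2;          /* previous row's y, carried into z */
   x = 0.3;
   y = z - 10 + 2*x;   /* same formula as in the original post */
   put y=;             /* writes the value of y to the log */
run;
```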
data have;
input id x z;
datalines;
1 -0.1 1000
1 0.2 .
1 0.3 .
2 -0.2 1000
2 0.1 .
2 -0.3 .
;
data want;
set have;
by id;
retain y;
if first.id then y = z - 10 + 2*x;
else do;
z = y;
y = y - 10 + 2*x;
end;
run;
Please try to provide data in the form of data step code. When you only show values, we have to make assumptions about your data that may not hold.
RETAIN keeps the value of a variable from one iteration of the data step to the next, instead of resetting it to missing at the start of each iteration.
The BY statement creates automatic variables First.variable and Last.variable, numeric 1 or 0 (1 = true, 0 = false), that indicate whether the current observation is the first or last of that variable's BY group. They can be used to set or reset values conditionally at the start and end of each group.
After that, it is just a matter of timing in the code: when to assign z and when to calculate the new y.
If you don't actually need the value of y in the output data set, you can DROP it.
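To make the points above concrete, here is a rough sketch, assuming the HAVE data set posted in this thread is already sorted by id. The data set names flags and want_no_y and the variables first_id and last_id are made up for illustration. The first step copies the automatic First./Last. flags into ordinary variables so they can be inspected, since the automatic variables themselves are not written to the output. The second step is one possible way to fill z and then drop the helper variable y.

```sas
/* Step 1: make the BY-group flags visible in the output */
data flags;
   set have;
   by id;
   first_id = first.id;   /* 1 on the first row of each id, else 0 */
   last_id  = last.id;    /* 1 on the last row of each id, else 0 */
run;

/* Step 2: fill z from the retained y, then drop y from the output */
data want_no_y;
   set have;
   by id;
   retain y;              /* keep y from one iteration to the next */
   if not first.id then z = y;   /* fill z with the previous row's y */
   y = z - 10 + 2*x;      /* value that the next row of this id will use */
   drop y;                /* y is only a helper here */
run;
```

If the data are not sorted by id, run PROC SORT first; otherwise the BY statement will stop with an error.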