Hi,
I have data:
month state ts mt g
6 1 120 130 .
5 1 115 120 -0.04167
4 1 110 115 -0.04348
3 1 100 . -0.09091
2 1 95 . -0.05000
1 1 90 . -0.05263
6 2 1200 1300 .
5 2 1125 1200 -0.06250
4 2 1050 1150 -0.06667
3 2 950 . -0.09524
2 2 1100 . 0.15789
1 2 1000 . -0.09091
I need to do the following:
For observations where mt is not missing, calculate x=lag(mt)*(1+g),
but for observations where mt is missing, use x=lag(x)*(1+g).
And start again for each state.
So I think I need to retain the values for x.
Note that in practice I will have many observations per state.
Thanks,
Below is code based on your description and on what I've seen in the code you've posted. Does that give you the result you're after? If not, what's missing or incorrect?
/* Sample dataset */
data temp1;
  input month state ts mt;
  datalines;
1 1 90 .
2 1 95 .
3 1 100 .
4 1 110 115
5 1 115 120
6 1 120 130
1 2 1000 .
2 2 1100 .
3 2 950 .
4 2 1050 1150
5 2 1125 1200
6 2 1200 1300
;
run;
proc sort data=temp1 out=inter;
  by state DESCENDING month;
run;
/*For observations where mt not missing calculate x=lag(mt)*(1+g)*/
/*but for observations where mt is missing use x=lag(x)*(1+g)*/
/*g=ts/lag(ts)-1;*/
/*And start again for each state.*/
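data want;
  set inter;
  by state DESCENDING month;
  /* keep x across observations so it can stand in for lag(x) */
  retain x 0;
  /* use the previous mt when it exists, otherwise the retained x; */
  /* both LAG calls run on every row, so their queues stay in step */
  x=coalesce(lag(mt),x)*(1+(ts/lag(ts)-1));
  /* the first row of each state has no predecessor within the state */
  if first.state then call missing(x);
run;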
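As a quick way to check the step above, the sketch below prints want for state 1. The values in the comments are hand-computed from the sample data, not copied from an actual run, so treat them as approximate.

```sas
/* Print the result for state 1 to compare against a hand calculation. */
proc print data=want noobs;
  where state = 1;
  var month state ts mt x;
run;

/* Hand-computed x for state 1 (rows in descending month order):        */
/*   month 6: .                            first row of the state       */
/*   month 5: 130     * (115/120) = 124.583                             */
/*   month 4: 120     * (110/115) = 114.783                             */
/*   month 3: 115     * (100/110) = 104.545                             */
/*   month 2: 104.545 * (95/100)  =  99.318  previous mt missing: use x */
/*   month 1:  99.318 * (90/95)   =  94.091                             */
```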
It might help to share the entire data step you are currently attempting, and to describe what is not working.
I'll go out on a limb and guess that you have some code like
if not missing(mt) then x=lag(mt)*(1+g);
else x=lag(x)*(1+g);
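and the results look "funny". The issue would be the nature of the LAG function and the queue it maintains: each call to LAG returns the value stored the last time that call executed, not the value from the previous observation. So when LAG sits inside an IF, it returns the value from the last observation where the IF was true.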
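A minimal sketch of that behavior, using a made-up dataset (demo, flag, v, prev_v and cond_v are hypothetical names chosen for illustration):

```sas
/* Hypothetical demo data: flag marks the rows where the IF is true. */
data demo;
  input id flag v;
  datalines;
1 1 10
2 0 20
3 1 30
4 0 40
5 1 50
;
run;

data lag_demo;
  set demo;
  /* Unconditional call: the queue is updated on every row, */
  /* so prev_v is the value of v from the previous row.     */
  prev_v = lag(v);
  /* Conditional call: this queue is updated only when flag=1, */
  /* so cond_v is the value of v from the last row with flag=1. */
  if flag = 1 then cond_v = lag(v);
run;

proc print data=lag_demo noobs;
run;
```

On row 3, for example, prev_v is 20 (the previous row) while cond_v is 10 (the last row where flag was 1).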
It is better to call LAG unconditionally into a temporary variable, so that it always holds the previous observation's value, and then use that variable as needed.
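retain x; /* keeps the previous observation's x available */
Lmt=Lag(mt);
if not missing(mt) then x=Lmt*(1+g);
else x=x*(1+g); /* x still holds the previous observation's value */
drop lmt;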
Since it appears that you should be using BY state or similar, you could reset the carried-over values at the start of each group using FIRST. processing:
retain x;
Lmt=Lag(mt);
if first.state then call missing(Lmt, x); /* prevents the first record of a state from using mt or x from the previous state */
if not missing(mt) then x=Lmt*(1+g);
else x=x*(1+g); /* previous x within the same state */
drop lmt;
If you need a special value of x for the first record of a state then add a do block:
if first.state then do;
x= <something>;
end;
Else if not missing(mt) then x=Lmt*(1+g);
else x=x*(1+g); /* with RETAIN x, this is the previous record's x */
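Putting the fragments above into one complete step might look like the sketch below. It assumes the sorted dataset inter from earlier in the thread and computes g as ts/lag(ts)-1, as in the comment in that code. The names want_sketch and Lts are hypothetical, and setting x to missing on the first record of each state is only one possible choice for <something>.

```sas
/* Sketch only: assumes inter is sorted by state and descending month. */
data want_sketch;            /* hypothetical output name */
  set inter;
  by state descending month;
  retain x;                  /* x keeps the previous record's value */
  Lmt = lag(mt);             /* unconditional: queue advances on every row */
  Lts = lag(ts);
  if first.state then do;
    x = .;                   /* no previous record within this state */
  end;
  else do;
    g = ts / Lts - 1;
    if not missing(mt) then x = Lmt * (1 + g);
    else x = x * (1 + g);    /* previous x, carried by RETAIN */
  end;
  drop Lmt Lts;
run;
```

This version tests the current record's mt, as the question is worded. A version that tests whether the previous record's mt is missing (for example with COALESCE) can give a different x on the first record where mt becomes missing, so compare the output against what you expect.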
Yes! That is what I was looking for. Thanks very much!
