Hey SAS'ers,
I'm trying to use a lag function within by groups. please see my desired output for an idea of what i'm going for.
Thank you!
/*Begin Code*/
%let intercept = 0.236917;
%let unemployment = 0.010734;
%let permits = -0.03398;
Data Have;
input State $ i urate bp date Initial_Vrate;
informat State $10. i urate bp date Initial_Vrate Best12.;
format date monyy. urate Initial_Vrate percent8.2;
cards;
Alabama 1 0.11766 6.78106 18263 0.13061
Alabama 2 0.11367 6.77765 18294 0.13061
Alabama 3 0.10847 7.20266 18322 0.13061
Alabama 4 0.09303 7.04054 18353 0.13061
Alabama 5 0.09025 6.96319 18383 0.13061
Alaska 1 0.09307 3.17805 18263 0.14659
Alaska 2 0.09553 3.80666 18294 0.14659
Alaska 3 0.09407 3.82864 18322 0.14659
Alaska 4 0.08254 4.60517 18353 0.14659
Alaska 5 0.07608 5.05625 18383 0.14659
Arizona 1 0.09728 7.13728 18263 0.1616
Arizona 2 0.09774 7.02198 18294 0.1616
Arizona 3 0.09384 7.42536 18322 0.1616
Arizona 4 0.09727 7.04054 18353 0.1616
Arizona 5 0.09691 6.88244 18383 0.1616
;
run;
data want;
set have;
by state;
if first.state then Modeled_Vacancy = Initial_Vrate;
else Modeled_Vacancy = lag(Modeled_Vacancy) + &intercept. + (&unemployment. *lag(urate)) + (&permits. *lag(bp));
format Modeled_Vacancy percent8.2;
run;
/* End of Code - Output should look like the following: */
State | i | urate | bp | date | initial_vrate | Modeled_Vrate |
Alabama | 1 | 11.8% | 6.78106 | Jan-10 | 13.1% | 13.1% |
Alabama | 2 | 11.4% | 6.77765 | Feb-10 | 13.1% | 13.8% |
Alabama | 3 | 10.8% | 7.20266 | Mar-10 | 13.1% | 14.6% |
Alabama | 4 | 9.3% | 7.04054 | Apr-10 | 13.1% | 14.0% |
Alabama | 5 | 9.0% | 6.96319 | May-10 | 13.1% | 13.8% |
Alaska | 1 | 9.3% | 3.17805 | Jan-10 | 14.7% | 14.7% |
Alaska | 2 | 9.6% | 3.80666 | Feb-10 | 14.7% | 27.7% |
Alaska | 3 | 9.4% | 3.82864 | Mar-10 | 14.7% | 38.5% |
Alaska | 4 | 8.3% | 4.60517 | Apr-10 | 14.7% | 49.3% |
Alaska | 5 | 7.6% | 5.05625 | May-10 | 14.7% | 57.4% |
Arizona | 1 | 9.7% | 7.13728 | Jan-10 | 16.2% | 16.2% |
Arizona | 2 | 9.8% | 7.02198 | Feb-10 | 16.2% | 15.7% |
Arizona | 3 | 9.4% | 7.42536 | Mar-10 | 16.2% | 15.6% |
Arizona | 4 | 9.7% | 7.04054 | Apr-10 | 16.2% | 14.2% |
Arizona | 5 | 9.7% | 6.88244 | May-10 | 16.2% | 14.1% |
Do NOT call LAG() function conditionally. Search for the thousands of threads that explain how LAG() works.
For your example you might be able to solve this by putting the IF FIRST..... statement AFTER the one that uses the lagged values.
Modeled_Vacancy = lag(Modeled_Vacancy) + &intercept. + (&unemployment. *lag(urate)) + (&permits. *lag(bp));
if first.state then Modeled_Vacancy = Initial_Vrate;
Not sure about how LAG() works for a calculated variable. You might want to just RETAIN the value instead.
retain modeled_vacancy ;
Modeled_Vacancy = Modeled_Vacancy + &intercept. + (&unemployment. *lag(urate)) + (&permits. *lag(bp));
if first.state then Modeled_Vacancy = Initial_Vrate;
Following Tom's idea, while using ifn(), which unconditionally lagging:
data want;
set have;
by state;
retain Modeled_Vacancy;
Modeled_Vacancy=ifn(first.state, Initial_Vrate, Modeled_Vacancy + &intercept. + (&unemployment. *lag(urate)) + (&permits. *lag(bp)));
format Modeled_Vacancy percent8.1;
run;
Haikuo
Hi,
Mike explained using Lag Function with condition in the link below:
to answer the original posting, reverse the order of these two lines
if first.state then Modeled_Vacancy = Initial_Vrate;
else Modeled_Vacancy = lag(Modeled_Vacancy) + &intercept. + (&unemployment. *lag(urate)) + (&permits. *lag(bp));
replace with
Modeled_Vacancy = lag(Modeled_Vacancy) + &intercept. + (&unemployment. *lag(urate)) + (&permits. *lag(bp));
if first.state then Modeled_Vacancy = Initial_Vrate;
that seems the simplest change that overcomes the LAG() problem
(notice there is now no ELSE so the LAG()s become "unconditional")
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.