Hello,
I have an UNBALANCED PANEL of clients (i.e. not all clients are present in every year) covering four years (2010-2013). I would like to calculate the change in costs from year to year. The problem is that when I lag the variable "cost", SAS lags across client numbers (i.e. the 2013 value of client1 becomes the lagged value for the first year of client2). How do I prevent this from happening?
I can't just take out year=2010, because for some clients 2011 was the first year.
This is what I have so far:
data costchange;
set sample;
by client;
lcost=lag(cost);
cost_score = (cost - lcost)/lcost;
where treated=1;
run;
When using LAG, it's important to make sure it executes on every observation. You're doing that ... so far so good.
The trick is to wipe out the value for observations that begin a new CLIENT. So:
data costchange;
set sample;
by client;
lcost=lag(cost);
if first.client then lcost = . ;
cost_score = (cost - lcost)/lcost;
where treated=1;
run;
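To make this easier to check, here is a minimal sketch of the same approach run against made-up data. The values in the DATALINES block are hypothetical, and the variable names (client, year, cost, treated) are assumed from the question. It also adds two things that are not part of the answer above: a check that the previous row really is the preceding year (since an unbalanced panel may have gaps in the middle, not only at the start), and a guard against dividing by a missing or zero lagged cost.

```sas
/* Hypothetical test data: client 2 first appears in 2011, client 3 has no 2012 row */
data sample;
  input client year cost treated;
  datalines;
1 2010 100 1
1 2011 110 1
1 2012 121 1
1 2013 121 1
2 2011 200 1
2 2012 150 1
2 2013 150 1
3 2010 50 1
3 2011 60 1
3 2013 90 1
;
run;

/* BY-group processing and LAG both rely on this sort order */
proc sort data=sample;
  by client year;
run;

data costchange;
  set sample;
  by client;
  where treated=1;

  /* LAG is called unconditionally, so it executes on every observation */
  lcost = lag(cost);
  lyear = lag(year);

  /* Wipe the lagged cost on the first row of each client and,
     as an added assumption, when the previous row is not year-1 */
  if first.client or lyear ne year - 1 then lcost = .;

  /* Compute the relative change only when a usable lagged cost exists */
  if not missing(lcost) and lcost ne 0 then
    cost_score = (cost - lcost) / lcost;

  drop lyear;
run;
```

With this data, the first row of each client and the 2013 row of client 3 should end up with a missing cost_score. If comparing across a gap is acceptable for your analysis, drop the lyear condition and keep only first.client.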
As always, some test data in the form of a data step would help clarify what you mean. At a guess, maybe use retained variables: set them once per group, then use the retained value when the current one is missing.
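One possible reading of the RETAIN suggestion, sketched under assumptions: instead of LAG, carry the previous row's cost forward in a retained variable and reset it at the start of each client. The name prev_cost and the small test data set are hypothetical, and the data must be sorted by client and year.

```sas
/* Hypothetical test data, already in client/year order */
data sample;
  input client year cost treated;
  datalines;
1 2010 100 1
1 2011 110 1
2 2011 200 1
2 2012 150 1
;
run;

data costchange;
  set sample;
  by client;
  where treated=1;

  /* prev_cost (hypothetical name) keeps its value from one row to the next */
  retain prev_cost;

  /* Reset at the start of each client so nothing leaks across clients */
  if first.client then prev_cost = .;

  /* Relative change against the previous row of the same client */
  if not missing(prev_cost) and prev_cost ne 0 then
    cost_score = (cost - prev_cost) / prev_cost;

  /* Remember this row's cost for the next row */
  prev_cost = cost;

  drop prev_cost;
run;
```

Because the retained variable is updated with a plain assignment rather than a function with its own queue, this version keeps working if you later wrap parts of the step in conditions, which is where LAG tends to surprise people.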
Perfect! Just what I was looking for.