Hello,
I have an UNBALANCED PANEL of clients (i.e. not all clients are present in every year) covering four years (2010-2013). I would like to calculate the change in costs from year to year. The problem is that when I lag the variable "cost", LAG runs across client boundaries (i.e. the 2013 cost of client1 becomes the lagged value for the first year of client2). How do I prevent this from happening?
I can't just take out year=2010, because for some clients 2011 was the first year.
This is what I have so far:
data costchange;
set sample;
by client;
lcost=lag(cost);
cost_score = (cost - lcost)/lcost;
where treated=1;
run;
When using LAG, it's important to make sure it executes on every observation. You're doing that ... so far so good.
The trick is to wipe out the value for observations that begin a new CLIENT. So:
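data costchange;
set sample;
by client; /* requires SAMPLE to be sorted by client (and by year within client) */
lcost=lag(cost); /* unconditional, so the LAG queue advances on every observation */
if first.client then lcost = . ; /* discard the value carried over from the previous client */
cost_score = (cost - lcost)/lcost;
where treated=1;
run;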
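Since the panel is unbalanced, a client may also skip a year in the middle (say, present in 2011 and 2013 but not 2012), in which case LAG would hand back the cost from two years earlier. Below is a rough sketch of one way to guard against that as well. It assumes the year variable is named YEAR and that only changes between consecutive years are wanted; the test data and the LYEAR helper variable are made up for illustration.

```sas
/* Hypothetical test data: client 2 first appears in 2011 and skips 2012 */
data sample;
  input client year cost treated;
  datalines;
1 2010 100 1
1 2011 110 1
1 2012 121 1
2 2011 200 1
2 2013 260 1
;
run;

/* BY-group processing needs the data sorted by client, then year */
proc sort data=sample;
  by client year;
run;

data costchange;
  set sample;
  by client;
  where treated=1;
  /* Both LAG calls run on every observation so their queues stay in step */
  lcost = lag(cost);
  lyear = lag(year);
  /* Keep the lagged cost only if it belongs to the same client
     and to the immediately preceding year */
  if first.client then lcost = .;
  else if year ne lyear + 1 then lcost = .;
  /* Skip the ratio when there is no usable prior cost */
  if not missing(lcost) and lcost ne 0 then
    cost_score = (cost - lcost) / lcost;
  drop lyear;
run;
```

With this test data, COST_SCORE should come out as 0.10 for client 1 in 2011 and 2012, and missing for the first year of each client and for client 2 in 2013, where the prior year is absent. If a change across a gap is acceptable for your analysis, drop the ELSE IF line.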
As always, some test data in the form of a data step would help clarify what you mean. At a guess, you could use RETAINed variables: set them once per group, then fall back on the retained value when the current one is missing.
Perfect! Just what I was looking for.