BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi All,

I have a dataset in the following format:
year bank_ID cash_balance

2000 1 0
2000 2 0
2000 3 0
2000 4 0
2001 1 10
2001 2 20
2001 3 0
2001 4 5
2002 1 15
2002 2 22
2002 3 5
2002 4 7
.
.
.
2009.......

Now, I would like to create three variables, time_1, time_2, time_3. These three time variables represent presence or absense (0/1) of a bank balance of >$0 for a particular bankID in the previous year, in two years previous, and three year previous.

Example 1. When SAS is evaluating year 2000, then it should create variables temp_1, temp_2, and temp_3 based on bank balances for the unique bankIDs during years 1999, 1998, and 1997. Since data for years 1997-1999 are missing, temp_1, temp_2, and temp_3 values for year 2000 should have missing values. In case of year 2001, data for years 1998 and 1999 are missing, so temp_2 and temp_3 values for year 2001 should be set to missing.

Example 2. For year 2003 and bankID 1, bank balance for year 2002 (temp_1) is 15, so temp_1 would be 1, however for 2000 (temp_3) the bank_balance is 0, so value of temp_3 would be 0.

Any suggestions regarding an appropriate SAS program to process this data will be highly appreciated. Thank you.
5 REPLIES 5
GertNissen
Barite | Level 11
[pre]data input;
input year bank_ID cash_balance;
datalines;
2000 1 0
2000 2 0
2000 3 0
2000 4 0
2001 1 10
2001 2 20
2001 3 0
2001 4 5
2002 1 15
2002 2 22
2002 3 5
2002 4 7
2003 1 0
2003 2 0
2003 3 0
2003 4 0
2004 1 10
2004 2 20
2004 3 0
2004 4 5
2005 1 15
2005 2 22
2005 3 5
2005 4 7
;
run;

proc sort data=input;
by bank_id year;
run;

data lag;
set input;
time_1 = max(lag1(cash_balance),0);
time_2 = max(lag2(cash_balance),0);
time_3 = max(lag3(cash_balance),0);
run;[/pre]
Patrick
Opal | Level 21
Looking at your data I think Geniz solution comes only close to what you asked for.

May be this might help (not tested):

proc format;
value bool
>0 =1
;
run;

data lag;
set input;
time_1 = input(put(lag1(cash_balance),bool.),8.);
time_1 = input(put(lag2(cash_balance),bool.),8.);
time_1 = input(put(lag3(cash_balance),bool.),8.);
run;
deleted_user
Not applicable
Thanks Geniz and Patrick. I will work on your suggestions and get back to the forum very soon. Any alternate suggestions from other forum members are most welcome.
Patrick
Opal | Level 21
Tss, tss... and what both solutions where missing so far is what I included now in the end of the code.
Not very elegant - but it should work.

data input;
input year bank_ID cash_balance;
datalines;
2000 1 0
2000 2 0
2000 3 0
2000 4 0
2001 1 10
2001 2 20
2001 3 0
2001 4 5
2002 1 15
2002 2 22
2002 3 5
2002 4 7
2003 1 0
2003 2 0
2003 3 0
2003 4 0
2004 1 10
2004 2 20
2004 3 0
2004 4 5
2005 1 15
2005 2 22
2005 3 5
2005 4 7
;
run;

proc sort data=input;
by bank_id year;
run;

proc format;
value bool
>0 =1
;
run;

data lag;
set input;
by bank_id year;
time_1 = input(put(lag1(cash_balance),bool.),8.);
time_2 = input(put(lag2(cash_balance),bool.),8.);
time_3 = input(put(lag3(cash_balance),bool.),8.);

if bank_id ne lag1(bank_id) then time_1=.;
if bank_id ne lag2(bank_id) then time_2=.;
if bank_id ne lag3(bank_id) then time_3=.;
run; Message was edited by: Patrick
data_null__
Jade | Level 19
Normalizing Boolean and initializing LAGS at BY group breaks.

Using the technique of this sample program http://support.sas.com/kb/24/694.html and using a double negative to normalize the flags I suggest this small modification to your program.

[pre]
data lag;
set input;
by bank_id year;
array time_[3];
time_1 = not not lag1(cash_balance);
time_2 = not not lag2(cash_balance);
time_3 = not not lag3(cash_balance);
if first.bank_id then count = 0;
count + 1;
do _n_ = count to dim(time_);
time_[_n_] = .;
end;
drop count;
run;
proc print;
by bank_id;
id bank_id;
run;

cash_
bank_ID year balance time_1 time_2 time_3

1 2000 0 . . .
2001 10 0 . .
2002 15 1 0 .
2003 0 1 1 0
2004 10 0 1 1
2005 15 1 0 1


2 2000 0 . . .
2001 20 0 . .
2002 22 1 0 .
2003 0 1 1 0
2004 20 0 1 1
2005 22 1 0 1


3 2000 0 . . .
2001 0 0 . .
2002 5 0 0 .
2003 0 1 0 0
2004 0 0 1 0
2005 5 0 0 1


4 2000 0 . . .
2001 5 0 . .
2002 7 1 0 .
2003 0 1 1 0
2004 5 0 1 1
2005 7 1 0 1
[/pre]

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 764 views
  • 0 likes
  • 4 in conversation