I would like to count the number of observation (VOL) by looking backward (window 3-year) and by FIRMID. For example, SAS starts counting the number of observation in 2002 (first 3 years) and move 3-year backward until the end of period (2005). Please see a sample below:
Thank you very much
HAVE
FIRMID YEAR VOL
1 2000 100
1 2001 200
1 2002 250
1 2003 .
1 2004 .
1 2005 300
2 2000 500
2 2001 850
2 2002 350
2 2003 600
2 2004 650
2 2005 .
WANT
FIRMID YEAR VOL VOL_N
1 2000 100 .
1 2001 200 .
1 2002 250 3
1 2003 . 2
1 2004 . 1
1 2005 300 1
2 2000 500 .
2 2001 850 .
2 2002 350 3
2 2003 600 3
2 2004 650 3
2 2005 . 2
An easy way is to put the older values into an array of length 3 and count how many are non-missing.
data want ;
set have ;
by firmid;
array lags [0:2] _temporary_;
if first.firmid then call missing(of lags[*] row);
row+1;
lags[mod(row,3)]=vol;
if row >= 3 then vol_n=n(of lags[*]);
run;
Result:
Obs FIRMID YEAR VOL row vol_n 1 1 2000 100 1 . 2 1 2001 200 2 . 3 1 2002 250 3 3 4 1 2003 . 4 2 5 1 2004 . 5 1 6 1 2005 300 6 1 7 2 2000 500 1 . 8 2 2001 850 2 . 9 2 2002 350 3 3 10 2 2003 600 4 3 11 2 2004 650 5 3 12 2 2005 . 6 2
Is this your rule:
?
I think you want to do something like this
data have;
input FIRMID YEAR VOL;
datalines;
1 2000 100
1 2001 200
1 2002 250
1 2003 .
1 2004 .
1 2005 300
2 2000 500
2 2001 850
2 2002 350
2 2003 600
2 2004 650
2 2005 .
;
proc sql;
create table want as
select *,
(select n(VOL) from have
where FIRMID=a.FIRMID
and a.YEAR-2 le YEAR le a.YEAR)
as VOL_N
from have as a;
quit;
Thanks for your help draycut
An easy way is to put the older values into an array of length 3 and count how many are non-missing.
data want ;
set have ;
by firmid;
array lags [0:2] _temporary_;
if first.firmid then call missing(of lags[*] row);
row+1;
lags[mod(row,3)]=vol;
if row >= 3 then vol_n=n(of lags[*]);
run;
Result:
Obs FIRMID YEAR VOL row vol_n 1 1 2000 100 1 . 2 1 2001 200 2 . 3 1 2002 250 3 3 4 1 2003 . 4 2 5 1 2004 . 5 1 6 1 2005 300 6 1 7 2 2000 500 1 . 8 2 2001 850 2 . 9 2 2002 350 3 3 10 2 2003 600 4 3 11 2 2004 650 5 3 12 2 2005 . 6 2
Thank you very much Tom. Your code is very efficient
Tom,
I also consider this way before. But it there is a gap in data,your code would not work.
1 2000 100
1 2001 200
1 2002 250
1 2003 .
1 2004 . /*<- if this obs was missing */
1 2005 300
2 2000 500
2 2001 850
2 2002 350
2 2003 600
2 2004 650
2 2005 .
@Ksharp wrote:
Tom,
I also consider this way before. But it there is a gap in data,your code would not work.
1 2000 100 1 2001 200 1 2002 250 1 2003 . 1 2004 . /*<- if this obs was missing */ 1 2005 300 2 2000 500 2 2001 850 2 2002 350 2 2003 600 2 2004 650 2 2005 .
If by gap in the data you mean there are some years that are not even present in the data (instead of just having a missing value of the variable of interest) and that the criteria is to count by YEARS instead of by observations then yes.
This code:
data have;
input FIRMID $ YEAR VOL;
datalines;
1 2000 100
1 2001 200
1 2002 250
1 2003 .
1 2004 .
1 2005 300
2 2000 500
2 2001 850
2 2002 350
2 2003 600
2 2004 650
2 2005 .
;
data want;
set have;
by firmid;
retain counter vol_n;
if first.firmid
then do;
counter = 1;
vol_n = .;
end;
else counter + 1;
if counter = 3
then vol_n = 3;
if vol_n > 1 and vol = . then vol_n + (-1);
drop counter;
run;
proc print data=want noobs;
run;
produces your expected output:
FIRMID YEAR VOL vol_n 1 2000 100 . 1 2001 200 . 1 2002 250 3 1 2003 . 2 1 2004 . 1 1 2005 300 1 2 2000 500 . 2 2001 850 . 2 2002 350 3 2 2003 600 3 2 2004 650 3 2 2005 . 2
data have;
input FIRMID YEAR VOL;
datalines;
1 2000 100
1 2001 200
1 2002 250
1 2003 .
1 2004 .
1 2005 300
2 2000 500
2 2001 850
2 2002 350
2 2003 600
2 2004 650
2 2005 .
;
proc sql;
create table want as
select *,case when year<min(year)+2 then .
else
(select n(vol) from have
where FIRMID=a.FIRMID
and a.YEAR-2 le YEAR le a.YEAR)
end as VOL_N
from have as a
group by FIRMID
order by FIRMID,year;
quit;
Thank a lot Ksharp.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.