I would like to count the number of observation (VOL) by looking backward (window 3-year) and by FIRMID. For example, SAS starts counting the number of observation in 2002 (first 3 years) and move 3-year backward until the end of period (2005). Please see a sample below:
Thank you very much
HAVE
FIRMID YEAR VOL
1 2000 100
1 2001 200
1 2002 250
1 2003 .
1 2004 .
1 2005 300
2 2000 500
2 2001 850
2 2002 350
2 2003 600
2 2004 650
2 2005 .
WANT
FIRMID YEAR VOL VOL_N
1 2000 100 .
1 2001 200 .
1 2002 250 3
1 2003 . 2
1 2004 . 1
1 2005 300 1
2 2000 500 .
2 2001 850 .
2 2002 350 3
2 2003 600 3
2 2004 650 3
2 2005 . 2
An easy way is to put the older values into an array of length 3 and count how many are non-missing.
data want ;
set have ;
by firmid;
array lags [0:2] _temporary_;
if first.firmid then call missing(of lags[*] row);
row+1;
lags[mod(row,3)]=vol;
if row >= 3 then vol_n=n(of lags[*]);
run;
Result:
Obs FIRMID YEAR VOL row vol_n 1 1 2000 100 1 . 2 1 2001 200 2 . 3 1 2002 250 3 3 4 1 2003 . 4 2 5 1 2004 . 5 1 6 1 2005 300 6 1 7 2 2000 500 1 . 8 2 2001 850 2 . 9 2 2002 350 3 3 10 2 2003 600 4 3 11 2 2004 650 5 3 12 2 2005 . 6 2
Is this your rule:
?
I think you want to do something like this
data have;
input FIRMID YEAR VOL;
datalines;
1 2000 100
1 2001 200
1 2002 250
1 2003 .
1 2004 .
1 2005 300
2 2000 500
2 2001 850
2 2002 350
2 2003 600
2 2004 650
2 2005 .
;
proc sql;
create table want as
select *,
(select n(VOL) from have
where FIRMID=a.FIRMID
and a.YEAR-2 le YEAR le a.YEAR)
as VOL_N
from have as a;
quit;
Thanks for your help draycut
An easy way is to put the older values into an array of length 3 and count how many are non-missing.
data want ;
set have ;
by firmid;
array lags [0:2] _temporary_;
if first.firmid then call missing(of lags[*] row);
row+1;
lags[mod(row,3)]=vol;
if row >= 3 then vol_n=n(of lags[*]);
run;
Result:
Obs FIRMID YEAR VOL row vol_n 1 1 2000 100 1 . 2 1 2001 200 2 . 3 1 2002 250 3 3 4 1 2003 . 4 2 5 1 2004 . 5 1 6 1 2005 300 6 1 7 2 2000 500 1 . 8 2 2001 850 2 . 9 2 2002 350 3 3 10 2 2003 600 4 3 11 2 2004 650 5 3 12 2 2005 . 6 2
Thank you very much Tom. Your code is very efficient
Tom,
I also consider this way before. But it there is a gap in data,your code would not work.
1 2000 100
1 2001 200
1 2002 250
1 2003 .
1 2004 . /*<- if this obs was missing */
1 2005 300
2 2000 500
2 2001 850
2 2002 350
2 2003 600
2 2004 650
2 2005 .
@Ksharp wrote:
Tom,
I also consider this way before. But it there is a gap in data,your code would not work.
1 2000 100 1 2001 200 1 2002 250 1 2003 . 1 2004 . /*<- if this obs was missing */ 1 2005 300 2 2000 500 2 2001 850 2 2002 350 2 2003 600 2 2004 650 2 2005 .
If by gap in the data you mean there are some years that are not even present in the data (instead of just having a missing value of the variable of interest) and that the criteria is to count by YEARS instead of by observations then yes.
This code:
data have;
input FIRMID $ YEAR VOL;
datalines;
1 2000 100
1 2001 200
1 2002 250
1 2003 .
1 2004 .
1 2005 300
2 2000 500
2 2001 850
2 2002 350
2 2003 600
2 2004 650
2 2005 .
;
data want;
set have;
by firmid;
retain counter vol_n;
if first.firmid
then do;
counter = 1;
vol_n = .;
end;
else counter + 1;
if counter = 3
then vol_n = 3;
if vol_n > 1 and vol = . then vol_n + (-1);
drop counter;
run;
proc print data=want noobs;
run;
produces your expected output:
FIRMID YEAR VOL vol_n 1 2000 100 . 1 2001 200 . 1 2002 250 3 1 2003 . 2 1 2004 . 1 1 2005 300 1 2 2000 500 . 2 2001 850 . 2 2002 350 3 2 2003 600 3 2 2004 650 3 2 2005 . 2
data have;
input FIRMID YEAR VOL;
datalines;
1 2000 100
1 2001 200
1 2002 250
1 2003 .
1 2004 .
1 2005 300
2 2000 500
2 2001 850
2 2002 350
2 2003 600
2 2004 650
2 2005 .
;
proc sql;
create table want as
select *,case when year<min(year)+2 then .
else
(select n(vol) from have
where FIRMID=a.FIRMID
and a.YEAR-2 le YEAR le a.YEAR)
end as VOL_N
from have as a
group by FIRMID
order by FIRMID,year;
quit;
Thank a lot Ksharp.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.