Dear altruists,
I am a new SAS user.
I have the following dataset. I have also attached the SAS file for your kind consideration.
Company_ID | Announcement |
1001 | 2020-10-14 |
1001 | 2020-11-15 |
1001 | 2020-12-18 |
1001 | 2021-01-01 |
1001 | 2021-02-04 |
1001 | 2021-03-05 |
1001 | 2021-04-06 |
1001 | 2021-05-07 |
1001 | 2021-06-08 |
1001 | 2021-07-11 |
1001 | 2021-08-12 |
1001 | 2021-09-13 |
1001 | 2021-10-14 |
1001 | 2021-11-15 |
1001 | 2021-12-18 |
1002 | 2020-10-14 |
1002 | 2020-11-15 |
1002 | 2020-12-18 |
1002 | 2021-03-05 |
1002 | 2021-04-06 |
1002 | 2021-05-07 |
1002 | 2021-06-08 |
1002 | 2021-07-11 |
1002 | 2021-08-12 |
1002 | 2021-11-15 |
1002 | 2021-12-18 |
I want to count the number of observations by group in the preceding 12 calendar months. Please note that I am interested in counting not the observations in the previous 12 rows, but in the preceding 12 calendar months.
I am looking for the following output. As you will notice, the last row for Company_ID 1002 generates a value of 9.
Company_ID | Announcement | Prior_12_Month_Ann |
1001 | 2020-10-14 | 0 |
1001 | 2020-11-15 | 1 |
1001 | 2020-12-18 | 2 |
1001 | 2021-01-01 | 3 |
1001 | 2021-02-04 | 4 |
1001 | 2021-03-05 | 5 |
1001 | 2021-04-06 | 6 |
1001 | 2021-05-07 | 7 |
1001 | 2021-06-08 | 8 |
1001 | 2021-07-11 | 9 |
1001 | 2021-08-12 | 10 |
1001 | 2021-09-13 | 11 |
1001 | 2021-10-14 | 12 |
1001 | 2021-11-15 | 12 |
1001 | 2021-12-18 | 12 |
1002 | 2020-10-14 | 0 |
1002 | 2020-11-15 | 1 |
1002 | 2020-12-18 | 2 |
1002 | 2021-03-05 | 3 |
1002 | 2021-04-06 | 4 |
1002 | 2021-05-07 | 5 |
1002 | 2021-06-08 | 6 |
1002 | 2021-07-11 | 7 |
1002 | 2021-08-12 | 8 |
1002 | 2021-11-15 | 9 |
1002 | 2021-12-18 | 9 |
Thank you in advance for your kind support!
data have;
input Company_ID Announcement : yymmdd10.;
format Announcement yymmdd10.;
cards;
1001 2020-10-14
1001 2020-11-15
1001 2020-12-18
1001 2021-01-01
1001 2021-02-04
1001 2021-03-05
1001 2021-04-06
1001 2021-05-07
1001 2021-06-08
1001 2021-07-11
1001 2021-08-12
1001 2021-09-13
1001 2021-10-14
1001 2021-11-15
1001 2021-12-18
1002 2020-10-14
1002 2020-11-15
1002 2020-12-18
1002 2021-03-05
1002 2021-04-06
1002 2021-05-07
1002 2021-06-08
1002 2021-07-11
1002 2021-08-12
1002 2021-11-15
1002 2021-12-18
;
proc sql;
create table want as
select *,(select count(*) from have where Company_ID=a.Company_ID and
Announcement between intnx('month',a.Announcement,-13)
and intnx('month',a.Announcement,-1,'e')) as count
from have as a;
quit;
Will there always be one and only one observation per month/company_id, or do you need to condense or fill that first?
A different take, using a data step and a hash object:
data want;
set have;
by company_id;
if _n_ = 1
then do;
length month number 8;
declare hash sum ();
sum.definekey("month");
sum.definedata("number");
sum.definedone();
end;
if first.company_id then rc = sum.clear();
month = intnx('month',announcement,0,"b");
rc = sum.find();
number = sum(number,1);
if rc
then rc = sum.add();
else rc = sum.replace();
month = intnx('month',announcement,-12,"b");
do until (month ge intnx('month',announcement,0,"b"));
if sum.find() = 0
then Prior_12_Month_Ann = sum(Prior_12_Month_Ann,number);
month = intnx("month",month,1,"b");
end;
drop rc number month;
run;
Since the code counts per calendar month, not per "month counted backwards from the current date", it creates slightly different numbers.
OTOH, it will be faster than SQL with larger datasets, as it avoids the sub-select.
data have;
input Company_ID Announcement : yymmdd10.;
format Announcement yymmdd10.;
cards;
1001 2020-10-14
1001 2020-11-15
1001 2020-12-18
1001 2021-01-01
1001 2021-02-04
1001 2021-03-05
1001 2021-04-06
1001 2021-05-07
1001 2021-06-08
1001 2021-07-11
1001 2021-08-12
1001 2021-09-13
1001 2021-10-14
1001 2021-11-15
1001 2021-12-18
1002 2020-10-14
1002 2020-11-15
1002 2020-12-18
1002 2021-03-05
1002 2021-04-06
1002 2021-05-07
1002 2021-06-08
1002 2021-07-11
1002 2021-08-12
1002 2021-11-15
1002 2021-12-18
;
proc sql;
create table want as
select *,(select count(*) from have where Company_ID=a.Company_ID and
Announcement between intnx('month',a.Announcement,-13)
and intnx('month',a.Announcement,-1,'e')) as count
from have as a;
quit;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.