New SAS User

mmh · Posted 11-28-2021 12:57 AM

Dear altruists,
I am a new SAS user.
I have the following dataset. I have also attached the SAS file for your kind consideration.

Company_ID	Announcement
1001	2020-10-14
1001	2020-11-15
1001	2020-12-18
1001	2021-01-01
1001	2021-02-04
1001	2021-03-05
1001	2021-04-06
1001	2021-05-07
1001	2021-06-08
1001	2021-07-11
1001	2021-08-12
1001	2021-09-13
1001	2021-10-14
1001	2021-11-15
1001	2021-12-18
1002	2020-10-14
1002	2020-11-15
1002	2020-12-18
1002	2021-03-05
1002	2021-04-06
1002	2021-05-07
1002	2021-06-08
1002	2021-07-11
1002	2021-08-12
1002	2021-11-15
1002	2021-12-18

I want to count the number of observations by group in the preceding 12 calendar months. Please note that I am interested in counting not the observations in the previous 12 rows, but in the preceding 12 calendar months.
I am looking for the following output. As you will notice, the last row for Company_ID 1002 generates a value of 9.

Company_ID	Announcement	Prior_12_Month_Ann
1001	2020-10-14	0
1001	2020-11-15	1
1001	2020-12-18	2
1001	2021-01-01	3
1001	2021-02-04	4
1001	2021-03-05	5
1001	2021-04-06	6
1001	2021-05-07	7
1001	2021-06-08	8
1001	2021-07-11	9
1001	2021-08-12	10
1001	2021-09-13	11
1001	2021-10-14	12
1001	2021-11-15	12
1001	2021-12-18	12
1002	2020-10-14	0
1002	2020-11-15	1
1002	2020-12-18	2
1002	2021-03-05	3
1002	2021-04-06	4
1002	2021-05-07	5
1002	2021-06-08	6
1002	2021-07-11	7
1002	2021-08-12	8
1002	2021-11-15	9
1002	2021-12-18	9

Thank you in advance for your kind support!

Ksharp · Posted 11-28-2021 05:13 AM

data have;
input Company_ID	Announcement : yymmdd10.;
format Announcement  yymmdd10.;
cards;
1001	2020-10-14
1001	2020-11-15
1001	2020-12-18
1001	2021-01-01
1001	2021-02-04
1001	2021-03-05
1001	2021-04-06
1001	2021-05-07
1001	2021-06-08
1001	2021-07-11
1001	2021-08-12
1001	2021-09-13
1001	2021-10-14
1001	2021-11-15
1001	2021-12-18
1002	2020-10-14
1002	2020-11-15
1002	2020-12-18
1002	2021-03-05
1002	2021-04-06
1002	2021-05-07
1002	2021-06-08
1002	2021-07-11
1002	2021-08-12
1002	2021-11-15
1002	2021-12-18
;
proc sql;
create table want as
select *,(select count(*) from have where Company_ID=a.Company_ID and
 Announcement between intnx('month',a.Announcement,-13) 
 and  intnx('month',a.Announcement,-1,'e')) as count
 from have as a;
quit;

View solution in original post

Kurt_Bremser · Posted 11-28-2021 04:15 AM

Will there always be one and only one observation per month/company_id, or do you need to condense or fill that first?

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

mmh · Posted 11-28-2021 04:40 AM

Hello Kurt,
No, there might be instances where there are more than one observation per month/company_id.
I should have clarified that previously in my sample dataset.
Thanks a lot for asking!

mmh · Posted 11-28-2021 06:14 AM

Hello Kurt,
I got the solution, thanks to Ksharp.
Thank you so much for looking in to my problem!

Kurt_Bremser · Posted 11-28-2021 09:29 AM

A different take, using a data step and a hash object:

data want;
set have;
by company_id;
if _n_ = 1
then do;
  length month number 8;
  declare hash sum ();
  sum.definekey("month");
  sum.definedata("number");
  sum.definedone();
end;
if first.company_id then rc = sum.clear();
month = intnx('month',announcement,0,"b");
rc = sum.find();
number = sum(number,1);
if rc
then rc = sum.add();
else rc = sum.replace();
month = intnx('month',announcement,-12,"b");
do until (month ge intnx('month',announcement,0,"b"));
  if sum.find() = 0
  then Prior_12_Month_Ann = sum(Prior_12_Month_Ann,number);
  month = intnx("month",month,1,"b");
end;
drop rc number month;
run;

Since the code counts per calendar month, not per "month counted backwards from the current date", it creates slightly different numbers.

OTOH, it will be faster than SQL with larger datasets, as it avoids the sub-select.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Ksharp · Posted 11-28-2021 05:13 AM

data have;
input Company_ID	Announcement : yymmdd10.;
format Announcement  yymmdd10.;
cards;
1001	2020-10-14
1001	2020-11-15
1001	2020-12-18
1001	2021-01-01
1001	2021-02-04
1001	2021-03-05
1001	2021-04-06
1001	2021-05-07
1001	2021-06-08
1001	2021-07-11
1001	2021-08-12
1001	2021-09-13
1001	2021-10-14
1001	2021-11-15
1001	2021-12-18
1002	2020-10-14
1002	2020-11-15
1002	2020-12-18
1002	2021-03-05
1002	2021-04-06
1002	2021-05-07
1002	2021-06-08
1002	2021-07-11
1002	2021-08-12
1002	2021-11-15
1002	2021-12-18
;
proc sql;
create table want as
select *,(select count(*) from have where Company_ID=a.Company_ID and
 Announcement between intnx('month',a.Announcement,-13) 
 and  intnx('month',a.Announcement,-1,'e')) as count
 from have as a;
quit;

mmh · Posted 11-28-2021 06:09 AM

Hello KSharp,
Thank you so much for your solution!
I just needed to make a small change in the following line and it perfectly worked!

Announcement between intnx('month',a.Announcement,-12)

Thanks again for your kind support!

New SAS User

Counting the number of observations by group in the preceding 12 calendar months

Re: Counting the number of observations by group in the preceding 12 calendar months

Re: Counting the number of observations by group in the preceding 12 calendar months

Re: Counting the number of observations by group in the preceding 12 calendar months

Re: Counting the number of observations by group in the preceding 12 calendar months

Re: Counting the number of observations by group in the preceding 12 calendar months

Re: Counting the number of observations by group in the preceding 12 calendar months

Re: Counting the number of observations by group in the preceding 12 calendar months

Follow Us

What is...

New SAS User

Our biggest data and AI event of the year.

Follow Us

What is...