BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mmh
Obsidian | Level 7 mmh
Obsidian | Level 7

Dear altruists,
I am a new SAS user.
I have the following dataset. I have also attached the SAS file for your kind consideration.

Company_IDAnnouncement
10012020-10-14
10012020-11-15
10012020-12-18
10012021-01-01
10012021-02-04
10012021-03-05
10012021-04-06
10012021-05-07
10012021-06-08
10012021-07-11
10012021-08-12
10012021-09-13
10012021-10-14
10012021-11-15
10012021-12-18
10022020-10-14
10022020-11-15
10022020-12-18
10022021-03-05
10022021-04-06
10022021-05-07
10022021-06-08
10022021-07-11
10022021-08-12
10022021-11-15
10022021-12-18

 

I want to count the number of observations by group in the preceding 12 calendar months. Please note that I am interested in counting not the observations in the previous 12 rows, but in the preceding 12 calendar months.
I am looking for the following output. As you will notice, the last row for Company_ID 1002 generates a value of 9.

Company_IDAnnouncementPrior_12_Month_Ann
10012020-10-140
10012020-11-151
10012020-12-182
10012021-01-013
10012021-02-044
10012021-03-055
10012021-04-066
10012021-05-077
10012021-06-088
10012021-07-119
10012021-08-1210
10012021-09-1311
10012021-10-1412
10012021-11-1512
10012021-12-1812
10022020-10-140
10022020-11-151
10022020-12-182
10022021-03-053
10022021-04-064
10022021-05-075
10022021-06-086
10022021-07-117
10022021-08-128
10022021-11-159
10022021-12-189

 

Thank you in advance for your kind support!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
input Company_ID	Announcement : yymmdd10.;
format Announcement  yymmdd10.;
cards;
1001	2020-10-14
1001	2020-11-15
1001	2020-12-18
1001	2021-01-01
1001	2021-02-04
1001	2021-03-05
1001	2021-04-06
1001	2021-05-07
1001	2021-06-08
1001	2021-07-11
1001	2021-08-12
1001	2021-09-13
1001	2021-10-14
1001	2021-11-15
1001	2021-12-18
1002	2020-10-14
1002	2020-11-15
1002	2020-12-18
1002	2021-03-05
1002	2021-04-06
1002	2021-05-07
1002	2021-06-08
1002	2021-07-11
1002	2021-08-12
1002	2021-11-15
1002	2021-12-18
;
proc sql;
create table want as
select *,(select count(*) from have where Company_ID=a.Company_ID and
 Announcement between intnx('month',a.Announcement,-13) 
 and  intnx('month',a.Announcement,-1,'e')) as count
 from have as a;
quit;

View solution in original post

6 REPLIES 6
mmh
Obsidian | Level 7 mmh
Obsidian | Level 7
Hello Kurt,
No, there might be instances where there are more than one observation per month/company_id.
I should have clarified that previously in my sample dataset.
Thanks a lot for asking!
mmh
Obsidian | Level 7 mmh
Obsidian | Level 7
Hello Kurt,
I got the solution, thanks to Ksharp.
Thank you so much for looking in to my problem!
Kurt_Bremser
Super User

A different take, using a data step and a hash object:

data want;
set have;
by company_id;
if _n_ = 1
then do;
  length month number 8;
  declare hash sum ();
  sum.definekey("month");
  sum.definedata("number");
  sum.definedone();
end;
if first.company_id then rc = sum.clear();
month = intnx('month',announcement,0,"b");
rc = sum.find();
number = sum(number,1);
if rc
then rc = sum.add();
else rc = sum.replace();
month = intnx('month',announcement,-12,"b");
do until (month ge intnx('month',announcement,0,"b"));
  if sum.find() = 0
  then Prior_12_Month_Ann = sum(Prior_12_Month_Ann,number);
  month = intnx("month",month,1,"b");
end;
drop rc number month;
run;

Since the code counts per calendar month, not per "month counted backwards from the current date", it creates slightly different numbers.

OTOH, it will be faster than SQL with larger datasets, as it avoids the sub-select.

Ksharp
Super User
data have;
input Company_ID	Announcement : yymmdd10.;
format Announcement  yymmdd10.;
cards;
1001	2020-10-14
1001	2020-11-15
1001	2020-12-18
1001	2021-01-01
1001	2021-02-04
1001	2021-03-05
1001	2021-04-06
1001	2021-05-07
1001	2021-06-08
1001	2021-07-11
1001	2021-08-12
1001	2021-09-13
1001	2021-10-14
1001	2021-11-15
1001	2021-12-18
1002	2020-10-14
1002	2020-11-15
1002	2020-12-18
1002	2021-03-05
1002	2021-04-06
1002	2021-05-07
1002	2021-06-08
1002	2021-07-11
1002	2021-08-12
1002	2021-11-15
1002	2021-12-18
;
proc sql;
create table want as
select *,(select count(*) from have where Company_ID=a.Company_ID and
 Announcement between intnx('month',a.Announcement,-13) 
 and  intnx('month',a.Announcement,-1,'e')) as count
 from have as a;
quit;
mmh
Obsidian | Level 7 mmh
Obsidian | Level 7
Hello KSharp,
Thank you so much for your solution!
I just needed to make a small change in the following line and it perfectly worked!

Announcement between intnx('month',a.Announcement,-12)

Thanks again for your kind support!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 857 views
  • 2 likes
  • 3 in conversation