Dear altruist,
I have the following SAS dataset:
data have;
input Company_ID$ Fiscal_Year Fiscal_Quarter Announcement : yymmdd10.;
format Announcement yymmdd10.;
cards;
1001 2001 2 2001-08-31
1001 2001 3 2001-11-30
1001 2001 4 2002-02-28
1001 2002 1 2002-05-31
1001 2002 2 2002-08-31
1001 2002 3 2002-11-30
1001 2002 4 2003-02-28
1001 2003 1 2003-05-31
1001 2003 2 2003-08-31
1001 2003 3 2003-11-30
1001 2003 4 2004-02-29
1001 2004 1 2004-05-31
1001 2004 2 2004-08-31
2002 2001 3 2001-07-31
2002 2001 4 2001-10-31
2002 2002 1 2002-01-31
2002 2002 2 2002-04-30
2002 2002 3 2002-07-31
2002 2002 4 2002-10-31
2002 2003 1 2003-01-01
2002 2003 2 2003-04-30
2002 2003 3 2003-07-31
2002 2003 4 2003-10-31
2002 2004 1 2004-01-31
3003 2001 4 2001-10-31
3003 2002 1 2002-01-31
4004 2004 4 2004-12-31
4004 2005 1 2005-04-02
5005 2001 1 2001-01-31
;
run;
I want to know what is the total number of observations per Fiscal_Quarter based on the the variable Announcement_Date which is Before and On-or-after 01JAN2003.
In the output file, every company will have the all the four fiscal quarters calculated, even if a company only has one observation (Please see Company_ID 5005). Thus, the Fiscal_Quarters 1-4 is displayed for all companies.
Essentially, I am expecting the following output:
Company_ID | Fiscal_Quarter | Before_01JAN2003 | On_or_after_01JAN2003 |
1001 | 1 | 1 | 2 |
1001 | 2 | 2 | 2 |
1001 | 3 | 2 | 1 |
1001 | 4 | 1 | 2 |
2002 | 1 | 1 | 2 |
2002 | 2 | 1 | 1 |
2002 | 3 | 2 | 1 |
2002 | 4 | 2 | 1 |
3003 | 1 | 1 | 0 |
3003 | 2 | 0 | 0 |
3003 | 3 | 0 | 0 |
3003 | 4 | 1 | 0 |
4004 | 1 | 0 | 1 |
4004 | 2 | 0 | 0 |
4004 | 3 | 0 | 0 |
4004 | 4 | 0 | 1 |
5005 | 1 | 1 | 0 |
5005 | 2 | 0 | 0 |
5005 | 3 | 0 | 0 |
5005 | 4 | 0 | 0 |
I thank you in advance for your kind support!
And if you do:
data have;
input Company_ID$ Fiscal_Year Fiscal_Quarter Announcement : yymmdd10.;
format Announcement yymmdd10.;
cards;
1001 2001 2 2001-08-31
1001 2001 3 2001-11-30
1001 2001 4 2002-02-28
1001 2002 1 2002-05-31
1001 2002 2 2002-08-31
1001 2002 3 2002-11-30
1001 2002 4 2003-02-28
1001 2003 1 2003-05-31
1001 2003 2 2003-08-31
1001 2003 3 2003-11-30
1001 2003 4 2004-02-29
1001 2004 1 2004-05-31
1001 2004 2 2004-08-31
2002 2001 3 2001-07-31
2002 2001 4 2001-10-31
2002 2002 1 2002-01-31
2002 2002 2 2002-04-30
2002 2002 3 2002-07-31
2002 2002 4 2002-10-31
2002 2003 1 2003-01-01
2002 2003 2 2003-04-30
2002 2003 3 2003-07-31
2002 2003 4 2003-10-31
2002 2004 1 2004-01-31
3003 2001 4 2001-10-31
3003 2002 1 2002-01-31
4004 2004 4 2004-12-31
4004 2005 1 2005-04-02
5005 2001 1 2001-01-31
;
run;
data temp;
set have;
Before_01JAN2003 = Announcement < '01jan2003'd;
On_or_after_01JAN2003 = Announcement >= '01jan2003'd;
run;
proc summary data = temp nway completetypes;
class Company_ID Fiscal_Quarter;
var Before_01JAN2003 On_or_after_01JAN2003;
output out = want(drop = _:) sum =;
run;
If you don't care about rows where both counts are zero :
proc sql;
create table want as
select Company_ID
, Fiscal_Quarter
, sum (Announcement < "01JAN2003"d) as Before_01JAN2003
, sum (Announcement >= "01JAN2003"d) as On_or_after_01JAN2003
from have
group by Company_ID, Fiscal_Quarter
;
quit;
And if you do:
data have;
input Company_ID$ Fiscal_Year Fiscal_Quarter Announcement : yymmdd10.;
format Announcement yymmdd10.;
cards;
1001 2001 2 2001-08-31
1001 2001 3 2001-11-30
1001 2001 4 2002-02-28
1001 2002 1 2002-05-31
1001 2002 2 2002-08-31
1001 2002 3 2002-11-30
1001 2002 4 2003-02-28
1001 2003 1 2003-05-31
1001 2003 2 2003-08-31
1001 2003 3 2003-11-30
1001 2003 4 2004-02-29
1001 2004 1 2004-05-31
1001 2004 2 2004-08-31
2002 2001 3 2001-07-31
2002 2001 4 2001-10-31
2002 2002 1 2002-01-31
2002 2002 2 2002-04-30
2002 2002 3 2002-07-31
2002 2002 4 2002-10-31
2002 2003 1 2003-01-01
2002 2003 2 2003-04-30
2002 2003 3 2003-07-31
2002 2003 4 2003-10-31
2002 2004 1 2004-01-31
3003 2001 4 2001-10-31
3003 2002 1 2002-01-31
4004 2004 4 2004-12-31
4004 2005 1 2005-04-02
5005 2001 1 2001-01-31
;
run;
data temp;
set have;
Before_01JAN2003 = Announcement < '01jan2003'd;
On_or_after_01JAN2003 = Announcement >= '01jan2003'd;
run;
proc summary data = temp nway completetypes;
class Company_ID Fiscal_Quarter;
var Before_01JAN2003 On_or_after_01JAN2003;
output out = want(drop = _:) sum =;
run;
Based on @PeterClemmensen 's summary table, this expands to all quarters:
data quarters;
input fiscal_quarter;
datalines;
1
2
3
4
;
proc sql;
create table sum as
select
Company_ID
, Fiscal_Quarter
, sum (Announcement < "01JAN2003"d) as Before_01JAN2003
, sum (Announcement >= "01JAN2003"d) as On_or_after_01JAN2003
from have
group by Company_ID, Fiscal_Quarter
;
create table all_q as
select distinct
t2.company_id,
t1.fiscal_quarter
from quarters t1, sum t2
;
create table want as
select
t1.company_id,
t1.fiscal_quarter,
case when t2.company_id ne ""
then t2.Before_01JAN2003
else 0
end as Before_01JAN2003,
case when t2.company_id ne ""
then t2.On_or_after_01JAN2003
else 0
end as On_or_after_01JAN2003
from all_q t1 left join sum t2
on t1.company_id = t2.company_id and t1.fiscal_quarter = t2.fiscal_quarter
;
quit;
data have; input Company_ID$ Fiscal_Year Fiscal_Quarter Announcement : yymmdd10.; format Announcement yymmdd10.; cards; 1001 2001 2 2001-08-31 1001 2001 3 2001-11-30 1001 2001 4 2002-02-28 1001 2002 1 2002-05-31 1001 2002 2 2002-08-31 1001 2002 3 2002-11-30 1001 2002 4 2003-02-28 1001 2003 1 2003-05-31 1001 2003 2 2003-08-31 1001 2003 3 2003-11-30 1001 2003 4 2004-02-29 1001 2004 1 2004-05-31 1001 2004 2 2004-08-31 2002 2001 3 2001-07-31 2002 2001 4 2001-10-31 2002 2002 1 2002-01-31 2002 2002 2 2002-04-30 2002 2002 3 2002-07-31 2002 2002 4 2002-10-31 2002 2003 1 2003-01-01 2002 2003 2 2003-04-30 2002 2003 3 2003-07-31 2002 2003 4 2003-10-31 2002 2004 1 2004-01-31 3003 2001 4 2001-10-31 3003 2002 1 2002-01-31 4004 2004 4 2004-12-31 4004 2005 1 2005-04-02 5005 2001 1 2001-01-31 ; run; proc sql; create table want as select distinct Company_ID,Fiscal_Quarter, (select count(*) from have where Company_ID=a.Company_ID and Fiscal_Quarter=a.Fiscal_Quarter and Announcement<'01jan2003'd) as Before_01JAN2003 , (select count(*) from have where Company_ID=a.Company_ID and Fiscal_Quarter=a.Fiscal_Quarter and Announcement>='01jan2003'd) as On_or_after_01JAN2003 from (select * from (select distinct Company_ID from have),(select distinct Fiscal_Quarter from have)) as a ; quit;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.