Dear altruist,
I have the following SAS dataset:
data have;
input Company_ID$ Fiscal_Year Fiscal_Quarter Announcement : yymmdd10.;
format Announcement yymmdd10.;
cards;
1001 2001 2 2001-08-31
1001 2001 3 2001-11-30
1001 2001 4 2002-02-28
1001 2002 1 2002-05-31
1001 2002 2 2002-08-31
1001 2002 3 2002-11-30
1001 2002 4 2003-02-28
1001 2003 1 2003-05-31
1001 2003 2 2003-08-31
1001 2003 3 2003-11-30
1001 2003 4 2004-02-29
1001 2004 1 2004-05-31
1001 2004 2 2004-08-31
2002 2001 3 2001-07-31
2002 2001 4 2001-10-31
2002 2002 1 2002-01-31
2002 2002 2 2002-04-30
2002 2002 3 2002-07-31
2002 2002 4 2002-10-31
2002 2003 1 2003-01-01
2002 2003 2 2003-04-30
2002 2003 3 2003-07-31
2002 2003 4 2003-10-31
2002 2004 1 2004-01-31
3003 2001 4 2001-10-31
3003 2002 1 2002-01-31
4004 2004 4 2004-12-31
4004 2005 1 2005-04-02
5005 2001 1 2001-01-31
;
run;
I want to know what is the total number of observations per Fiscal_Quarter based on the the variable Announcement_Date which is Before and On-or-after 01JAN2003.
In the output file, every company will have the all the four fiscal quarters calculated, even if a company only has one observation (Please see Company_ID 5005). Thus, the Fiscal_Quarters 1-4 is displayed for all companies.
Essentially, I am expecting the following output:
Company_ID | Fiscal_Quarter | Before_01JAN2003 | On_or_after_01JAN2003 |
1001 | 1 | 1 | 2 |
1001 | 2 | 2 | 2 |
1001 | 3 | 2 | 1 |
1001 | 4 | 1 | 2 |
2002 | 1 | 1 | 2 |
2002 | 2 | 1 | 1 |
2002 | 3 | 2 | 1 |
2002 | 4 | 2 | 1 |
3003 | 1 | 1 | 0 |
3003 | 2 | 0 | 0 |
3003 | 3 | 0 | 0 |
3003 | 4 | 1 | 0 |
4004 | 1 | 0 | 1 |
4004 | 2 | 0 | 0 |
4004 | 3 | 0 | 0 |
4004 | 4 | 0 | 1 |
5005 | 1 | 1 | 0 |
5005 | 2 | 0 | 0 |
5005 | 3 | 0 | 0 |
5005 | 4 | 0 | 0 |
I thank you in advance for your kind support!
And if you do:
data have;
input Company_ID$ Fiscal_Year Fiscal_Quarter Announcement : yymmdd10.;
format Announcement yymmdd10.;
cards;
1001 2001 2 2001-08-31
1001 2001 3 2001-11-30
1001 2001 4 2002-02-28
1001 2002 1 2002-05-31
1001 2002 2 2002-08-31
1001 2002 3 2002-11-30
1001 2002 4 2003-02-28
1001 2003 1 2003-05-31
1001 2003 2 2003-08-31
1001 2003 3 2003-11-30
1001 2003 4 2004-02-29
1001 2004 1 2004-05-31
1001 2004 2 2004-08-31
2002 2001 3 2001-07-31
2002 2001 4 2001-10-31
2002 2002 1 2002-01-31
2002 2002 2 2002-04-30
2002 2002 3 2002-07-31
2002 2002 4 2002-10-31
2002 2003 1 2003-01-01
2002 2003 2 2003-04-30
2002 2003 3 2003-07-31
2002 2003 4 2003-10-31
2002 2004 1 2004-01-31
3003 2001 4 2001-10-31
3003 2002 1 2002-01-31
4004 2004 4 2004-12-31
4004 2005 1 2005-04-02
5005 2001 1 2001-01-31
;
run;
data temp;
set have;
Before_01JAN2003 = Announcement < '01jan2003'd;
On_or_after_01JAN2003 = Announcement >= '01jan2003'd;
run;
proc summary data = temp nway completetypes;
class Company_ID Fiscal_Quarter;
var Before_01JAN2003 On_or_after_01JAN2003;
output out = want(drop = _:) sum =;
run;
If you don't care about rows where both counts are zero :
proc sql;
create table want as
select Company_ID
, Fiscal_Quarter
, sum (Announcement < "01JAN2003"d) as Before_01JAN2003
, sum (Announcement >= "01JAN2003"d) as On_or_after_01JAN2003
from have
group by Company_ID, Fiscal_Quarter
;
quit;
And if you do:
data have;
input Company_ID$ Fiscal_Year Fiscal_Quarter Announcement : yymmdd10.;
format Announcement yymmdd10.;
cards;
1001 2001 2 2001-08-31
1001 2001 3 2001-11-30
1001 2001 4 2002-02-28
1001 2002 1 2002-05-31
1001 2002 2 2002-08-31
1001 2002 3 2002-11-30
1001 2002 4 2003-02-28
1001 2003 1 2003-05-31
1001 2003 2 2003-08-31
1001 2003 3 2003-11-30
1001 2003 4 2004-02-29
1001 2004 1 2004-05-31
1001 2004 2 2004-08-31
2002 2001 3 2001-07-31
2002 2001 4 2001-10-31
2002 2002 1 2002-01-31
2002 2002 2 2002-04-30
2002 2002 3 2002-07-31
2002 2002 4 2002-10-31
2002 2003 1 2003-01-01
2002 2003 2 2003-04-30
2002 2003 3 2003-07-31
2002 2003 4 2003-10-31
2002 2004 1 2004-01-31
3003 2001 4 2001-10-31
3003 2002 1 2002-01-31
4004 2004 4 2004-12-31
4004 2005 1 2005-04-02
5005 2001 1 2001-01-31
;
run;
data temp;
set have;
Before_01JAN2003 = Announcement < '01jan2003'd;
On_or_after_01JAN2003 = Announcement >= '01jan2003'd;
run;
proc summary data = temp nway completetypes;
class Company_ID Fiscal_Quarter;
var Before_01JAN2003 On_or_after_01JAN2003;
output out = want(drop = _:) sum =;
run;
Based on @PeterClemmensen 's summary table, this expands to all quarters:
data quarters;
input fiscal_quarter;
datalines;
1
2
3
4
;
proc sql;
create table sum as
select
Company_ID
, Fiscal_Quarter
, sum (Announcement < "01JAN2003"d) as Before_01JAN2003
, sum (Announcement >= "01JAN2003"d) as On_or_after_01JAN2003
from have
group by Company_ID, Fiscal_Quarter
;
create table all_q as
select distinct
t2.company_id,
t1.fiscal_quarter
from quarters t1, sum t2
;
create table want as
select
t1.company_id,
t1.fiscal_quarter,
case when t2.company_id ne ""
then t2.Before_01JAN2003
else 0
end as Before_01JAN2003,
case when t2.company_id ne ""
then t2.On_or_after_01JAN2003
else 0
end as On_or_after_01JAN2003
from all_q t1 left join sum t2
on t1.company_id = t2.company_id and t1.fiscal_quarter = t2.fiscal_quarter
;
quit;
data have; input Company_ID$ Fiscal_Year Fiscal_Quarter Announcement : yymmdd10.; format Announcement yymmdd10.; cards; 1001 2001 2 2001-08-31 1001 2001 3 2001-11-30 1001 2001 4 2002-02-28 1001 2002 1 2002-05-31 1001 2002 2 2002-08-31 1001 2002 3 2002-11-30 1001 2002 4 2003-02-28 1001 2003 1 2003-05-31 1001 2003 2 2003-08-31 1001 2003 3 2003-11-30 1001 2003 4 2004-02-29 1001 2004 1 2004-05-31 1001 2004 2 2004-08-31 2002 2001 3 2001-07-31 2002 2001 4 2001-10-31 2002 2002 1 2002-01-31 2002 2002 2 2002-04-30 2002 2002 3 2002-07-31 2002 2002 4 2002-10-31 2002 2003 1 2003-01-01 2002 2003 2 2003-04-30 2002 2003 3 2003-07-31 2002 2003 4 2003-10-31 2002 2004 1 2004-01-31 3003 2001 4 2001-10-31 3003 2002 1 2002-01-31 4004 2004 4 2004-12-31 4004 2005 1 2005-04-02 5005 2001 1 2001-01-31 ; run; proc sql; create table want as select distinct Company_ID,Fiscal_Quarter, (select count(*) from have where Company_ID=a.Company_ID and Fiscal_Quarter=a.Fiscal_Quarter and Announcement<'01jan2003'd) as Before_01JAN2003 , (select count(*) from have where Company_ID=a.Company_ID and Fiscal_Quarter=a.Fiscal_Quarter and Announcement>='01jan2003'd) as On_or_after_01JAN2003 from (select * from (select distinct Company_ID from have),(select distinct Fiscal_Quarter from have)) as a ; quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.