- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi, I wanted to count events on dates using count() in proc sql. Here is the sample code:
data have0;
infile cards truncover expandtabs;
input ID $ CODE DATE :date9. DEPT $ ;
format date date8. ;
cards;
A0001 7001 12Oct2015 ETA
A0001 7002 12Oct2015 ETA
A0001 7001 12Nov2015 ETA
A0001 7001 15Nov2016 ETA
A0001 7002 13Oct2016 ETB
A0001 7003 13Oct2015 ETB
A0002 7001 12Oct2015 ETA
A0002 7002 12Oct2015 ETA
A0002 7003 12Oct2015 ETA
A0002 7001 20Oct2016 ETB
A0002 7002 23Oct2016 ETB
A0002 7003 30Nov2016 ETB
;
run;
PROC SQL;
CREATE TABLE events_ct AS
SELECT
count(*) as line_cnt
,count(distinct ID) as ID_cnt
,count(distinct CODE) as CODE_cnt
,count(distinct case when DEPT="ETA" then CODE end) as EtA_CODE_cnt
,count(case when DEPT="ETA" then DATE end) as events_cnt
FROM have0
;
QUIT;
The results for events_cnt shoud be 3 because I want to count the total distinct dates for each ID.
What I need actually is something like this:
count(distinct case when DEPT="ETA" then (DATE from have0 group by ID) end) as events_cnt
but this code doesn't work as it is now. It there any solution to modify the count(case when DEPT="ETA" then DATE end) as events_cnt and get what I need?
Thanks a lot !
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This counts what you want:
count(unique case when DEPT="ETA" then cats(DATE,CODE) end) -1 as EVENTS_CNT
You need to substract 1 as an empty string will also be counted when DEPT <> 'ETA'
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
GROUP BY ID is what you need.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please provide an example showing all the different counts that you want to perform, including expected results.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
These are the 4 records I want to count:
A0001 12Oct2015 ETA
A0001 12Nov2015 ETA
A0001 15Nov2016 ETA
A0002 12Oct2015 ETA
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@leehsin wrote:
These are the 4 records I want to count:
A0001 12Oct2015 ETA
A0001 12Nov2015 ETA
A0001 15Nov2016 ETA
A0002 12Oct2015 ETA
That does NOT look like counts.
What are the counts you want for the 12 observations in your example dataset.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This counts what you want:
count(unique case when DEPT="ETA" then cats(DATE,CODE) end) -1 as EVENTS_CNT
You need to substract 1 as an empty string will also be counted when DEPT <> 'ETA'
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you, ChrisNZ! This is the solution I am looking for.
I just change 'CODE' to 'ID' as I want to group the count of dates by ID.
count(unique case when DEPT="ETA" then cats(DATE,ID) end) as EVENTS_CNT
The result to have:
lin_cnt ID_cnt CODE_cnt EtA_CODE_cnt envents_cnt
13 2 3 3 4
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Are you looking for something like this?
proc sql;
create table want as
select id
, count(*) as line_cnt
, count(distinct code) as code_cnt
, count(distinct date) as date_cnt
, count(distinct dept) as dept_cnt
, count(case when dept='ETA' then 1 end) as eta_cnt
, count(distinct case when dept='ETA' then code end) as eta_code_cnt
, count(distinct case when dept='ETA' then date end) as eta_date_cnt
, count(distinct case when dept='ETA' then dept end) as eta_dept_cnt
from have
group by id
;
quit;
eta_ eta_ eta_ Obs ID line_cnt code_cnt date_cnt dept_cnt eta_cnt code_cnt date_cnt dept_cnt 1 A0001 6 3 5 2 4 2 3 1 2 A0002 6 3 4 2 3 3 1 1