I want to figure out how to add up the office_yes indicator variable in two ways, then tack it onto the end of each line.
1. Per Visit
2. Per ID (only summing one 'yes' indicator per visit)
I've tried a several combinations of CASE/WHEN, SUM, COUNT and COUNT(DISTINCT(id), as well as other things like sum(distinct(count(id)))," which gives the message "ERROR: Summary functions nested in this way are not supported."
Otherwise, this is as far as I've gotten:
proc sql; create table want as
select a.* ,
sum(office_yes_by_enc) as total_visits_per_id
from (select
visit_number,
CASE WHEN sum(office_yes)>=1 then 1 ELSE 0 end as office_yes_by_enc,
id,
office_yes
from have group id, visit_number) as a
group id
;quit;
I know there is a straightforward solution with PROC SORT and merge, but there are millions of lines and I'd like to figure out how to do it with SQL.
And is this possible to do one SQL procedure, or does it need to be broken up?
Thanks!
Have:
id | visit_number | office_yes |
8 | 59 | 1 |
8 | 59 | . |
8 | 10 | . |
8 | 19 | . |
2 | 51 | 1 |
2 | 70 | 1 |
2 | 45 | . |
2 | 45 | 1 |
2 | 45 | 1 |
2 | 35 | . |
2 | 62 | 1 |
5 | 87 | 1 |
5 | 66 | 1 |
Want:
id | visit_number | office_yes | office_yes_by_visit | total_office_visits_by_id |
8 | 59 | 1 | 1 | 1 |
8 | 59 | . | 1 | 1 |
8 | 10 | . | 0 | 1 |
8 | 19 | . | 0 | 1 |
2 | 51 | 1 | 1 | 4 |
2 | 70 | 1 | 1 | 4 |
2 | 45 | . | 1 | 4 |
2 | 45 | 1 | 1 | 4 |
2 | 45 | 1 | 1 | 4 |
2 | 35 | . | 0 | 4 |
2 | 62 | 1 | 1 | 4 |
5 | 87 | 1 | 1 | 2 |
5 | 66 | 1 | 1 | 2 |
So you want count office_yes for unique combination of id and visit_number ?
data have; input id visit_number office_yes ; cards; 8 59 1 8 59 . 8 10 . 8 19 . 2 51 1 2 70 1 2 45 . 2 45 1 2 45 1 2 35 . 2 62 1 5 87 1 5 66 1 ; run; proc sql; create table want as select *,coalesce((select distinct office_yes from have where id=a.id and visit_number=a.visit_number and office_yes is not missing),0) as office_yes_by_visit, (select sum(office_yes) from (select distinct * from have where id=a.id)) as total_office_visits_by_id from have as a; quit;
Xia Keshan
Try this.
proc sql;
create table want as
select a.*,coalesce(c.office_yes_by_visit,0) as office_yes_by_visit,b.total_office_visits_by_id from have a
left join (select id,count(distinct visit_number) as total_office_visits_by_id from have
where office_yes=1
group by id) b
on a.id=b.id
left join
(select id,visit_number,count(distinct visit_number) as office_yes_by_visit from have
where office_yes=1
group by id,visit_number) c
on a.id=c.id
and a.visit_number=c.visit_number;
quit;
Sure.
data have; input id visit_number office_yes ; cards; 8 59 1 8 59 . 8 10 . 8 19 . 2 51 1 2 70 1 2 45 . 2 45 1 2 45 1 2 35 . 2 62 1 5 87 1 5 66 1 ; run; proc sql; create table want as select *,coalesce((select distinct office_yes from have where id=a.id and visit_number=a.visit_number and office_yes is not missing),0) as office_yes_by_visit,sum(office_yes) as total_office_visits_by_id from have as a group by id ; quit;
Xia Keshan
Xia,
Thanks for the reply! It looks like this code will sum up all of the 'yes' indicators per visit, instead of just one, which was the same problem I kept having too.
So, the output for ID #2 should be 4 for total_office_visits_by_id, but with the code above, it comes out to be 5.
M
So you want count office_yes for unique combination of id and visit_number ?
data have; input id visit_number office_yes ; cards; 8 59 1 8 59 . 8 10 . 8 19 . 2 51 1 2 70 1 2 45 . 2 45 1 2 45 1 2 35 . 2 62 1 5 87 1 5 66 1 ; run; proc sql; create table want as select *,coalesce((select distinct office_yes from have where id=a.id and visit_number=a.visit_number and office_yes is not missing),0) as office_yes_by_visit, (select sum(office_yes) from (select distinct * from have where id=a.id)) as total_office_visits_by_id from have as a; quit;
Xia Keshan
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.