Solved: proc sql for repeat calulations using a multilevel dataset

dr2014 · Posted 05-26-2015 05:57 PM

Hi All,

I am working with a dataset with the following multilevel layout.

agencyid	unitid	itemnumber	answer
1	2	201	YES
1	2	202	YES
1	2	203	YES
1	2	204	YES
1	3	201	YES
1	3	202	YES
1	3	203	YES
1	3	204	YES
1	4	201	YES
1	4	202	NO
1	4	203	YES
1	4	204	YES

The agency id ranges from 1 to 42. There are several unitids within an agencyid. Each unitid will answer itemnumbers within a question. I need the number of units within an agency that have all 'yes' answers to all the itemnumbers 201 to 204.

For eg here the count of unique unitids that have a 'yes' response to all the itemnumbers is 2.

How can I improvise my code below to account for 'yes' responses to all the itemnumbers 201 to 204 ? The below code only gives me counts of unitids for any one of the itemnumbers with a yes answer.

proc sql;

create table c as

select agencyid,

count(distinct unitid)

as numerator

from test /* name of dataset*/

where answer='Yes' and itemnumber in (201,202,203,204)

group by agencyid;

quit;

Please advice. Thanks !

PGStats · Posted 05-26-2015 09:37 PM

You need a subquery :

proc sql;

create table want as

select agencyId, count(unitId) as nbUnits

from

(select agencyId, unitId

from test

where answer = "YES" and itemNumber in (201, 202, 203, 204)

group by agencyId, unitId

having count(distinct itemNumber)=4)

group by agencyId;

quit;

PG

View solution in original post

PGStats · Posted 05-26-2015 09:37 PM

You need a subquery :

proc sql;

create table want as

select agencyId, count(unitId) as nbUnits

from

(select agencyId, unitId

from test

where answer = "YES" and itemNumber in (201, 202, 203, 204)

group by agencyId, unitId

having count(distinct itemNumber)=4)

group by agencyId;

quit;

PG

stat_sas · Posted 05-26-2015 10:29 PM

proc sql;

create table c as

select agencyid,unitid from have

group by agencyid,unitid

having sum(answer='YES')=4;

quit;

dr2014 · Posted 05-27-2015 09:54 AM

Hi PGStats and stat@sas,

Thanks for your help, but I am still not getting the right answer...

The results table based on my example has to look like below

agencyid	Unitid count
1	2
2	1
3	6

I used both of your syntax and came up with this improvisation to my code. However, I get 0 rows and 2 columns . I not sure whats wrong ? Please let me know.

proc sql;

create table c as

select agencyid,

count(distinct unitid)

as numerator

from test

where itemid in (201,202,203,204)

group by agencyid

having sum(answer='Yes')=4;

quit;

PGStats · Posted 05-27-2015 11:16 AM

When I use my code with your example data (which covers agencyId=1 only) I get :

agencyid nbUnits

------------------

1 2

In your new query, you group on agencyId and ask for the group condition sum(answer='Yes')=4 . You are thus selecting agencies (not units) with 4 YES answers. That's why you need a subquery. The lowest (subquery) level groups on agency and units to extract the units with 4 YES answers. The higher level counts the selected units within each agency.

PG

dr2014 · Posted 05-27-2015 12:12 PM

Thanks PG for the explanation. That makes sense. Let me give it a try and I shall get back.

dr2014 · Posted 05-27-2015 01:38 PM

Thanks PG ! It worked. I had an error in my syntax ! I broke your code down and ran it step wise.

proc sql for repeat calulations using a multilevel dataset

Re: proc sql for repeat calulations using a multilevel dataset

Re: proc sql for repeat calulations using a multilevel dataset

Re: proc sql for repeat calulations using a multilevel dataset

Re: proc sql for repeat calulations using a multilevel dataset

Re: proc sql for repeat calulations using a multilevel dataset

Re: proc sql for repeat calulations using a multilevel dataset

Re: proc sql for repeat calulations using a multilevel dataset

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away