BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dr2014
Quartz | Level 8

Hi All,

I have to calculate proportions of 'yes' answers for group1. In group1 there are multiple agencies. Within each agency (represented by agencyid ) are several units (represented by unitid). For a survey conducted for the group1, I need to calculate the proportion of 'yes' answers for a set of questions (represented as questionid ) across all units in an agency. Calculation is as follows:

Numerator    = (Sum of all 'yes' answers  by all units in the agency  / 24))  * 100

Denominator                 number of units within the agency


24 represents total number of questions

Here is my code :

proc sql;

create table p as

select distinct agencyid, count(case answer when 'yes' then 1 else 0 end) as yes_count,

       ((count(case answer when 'yes' then 1 else 0 end))/24) as numerator,

       count(distinct unitid) as denominator, ((calculated numerator/calculated denominator)*100) as proportion

from test1

  where questionid in ('Q1','Q2','Q3') and agencyid=18;

quit;


I want to repeat this calculation for other agencyid's in the group. However, the values are not consecutive. For example , for this group1, the agencyid's are 18,19 and 36. I offcourse  knew these 3 agencies by running a proc freq. Instead of doing that, I would like to create a generic code where the calculation starts with the minimum value of the agencyid and ends with the maximum value and ignores the non-existing values between 19 to 36. It should create a table (just like table p -mentioned above) , but also include calculations for other agencies such as :


Group1:

agencyid yes_count numerator denominator proportion

18        ##       ###        ###        ###

19        ##       ###        ###        ###

36        ##       ###        ###        ###


I also created a keyword macro , but it will be very tedious to then stack the individual datasets especially for groups with plenty of agencies. here is that code:

%let v1=18;

%let v2=19;

%let v3=36;

%macro aid (v);

proc sql;

create table p&v. as

select distinct agencyid, count(case answer when 'yes' then 1 else 0 end) as

        yes_count,  ((count(case answer when 'yes' then 1 else 0 end))/24) as numerator,

       count(distinct unitid) as denominator, ((calculated numerator/calculated denominator)*100) as proportion

from test1

  where questionid in ('Q1','Q3','Q5') and agencyid=&v.;

%end;

quit;

%mend faid;

%faid (&v1);

%faid (&v2);

%faid (&v3);

Please advice and let me know your thoughts.

Thanks !






1 ACCEPTED SOLUTION

Accepted Solutions
gergely_batho
SAS Employee

Would this worki?:

proc sql;

create table p as

select agencyid, count(case answer when 'yes' then 1 else 0 end) as yes_count,

       ((count(case answer when 'yes' then 1 else 0 end))/24) as numerator,

       count(distinct unitid) as denominator, ((calculated numerator/calculated denominator)*100) as proportion

from test1

  where questionid in ('Q1','Q2','Q3')

group by agencyid;

quit;

View solution in original post

6 REPLIES 6
gergely_batho
SAS Employee

Would this worki?:

proc sql;

create table p as

select agencyid, count(case answer when 'yes' then 1 else 0 end) as yes_count,

       ((count(case answer when 'yes' then 1 else 0 end))/24) as numerator,

       count(distinct unitid) as denominator, ((calculated numerator/calculated denominator)*100) as proportion

from test1

  where questionid in ('Q1','Q2','Q3')

group by agencyid;

quit;

slchen
Lapis Lazuli | Level 10

Not sure whether this is what you need:

%macro aid (v);

%let i=1;

%do %while (%scan(&v,&i)ne);

%let var=%scan(&v,&i);

proc sql;

create table p&var. as

select distinct agencyid, sum(answer='yes') as

        yes_count, sum(answer='yes')/24 as numerator,

       count(distinct unitid) as denominator, ((calculated numerator/calculated denominator)*100) as proportion

from test1

where questionid in ('Q1','Q3','Q5') and agencyid=&var.;

quit;

%let i=%eval(&i+1);

%end;

%mend aid;

%aid(%str(18 19 36))

dr2014
Quartz | Level 8

Thanks Gergely. That worked. I didn't think about the 'group by' ! @slchen...I was wanting to avoid the using keyword names. But, will try this for another scenario. Thanks for the suggestion on the 'sum' function instead of count.

dr2014
Quartz | Level 8

Hi Gregely and slchen,

Although Gregely's suggestion worked. I have one more calculation I need to make. I also need a column with an average value of the proportions. I am unable to use the avg function when I use the 'group by' option and have to perform another proc sql to get that column. Is there a way to do a subquery within the original SQL.

I basically want the table to look like this now :

Group1:

agencyid yes_count numerator denominator proportion avg_proportion

18        ##       ###        ###        ###        ###

19        ##       ###        ###        ###

36        ##       ###        ###        ###

proc sql;

create table p as

select agencyid, count(case answer when 'yes' then 1 else 0 end) as yes_count,

       ((count(case answer when 'yes' then 1 else 0 end))/24) as numerator,

       count(distinct unitid) as denominator, ((calculated numerator/calculated denominator)*100) as proportion

from test1

  where questionid in ('Q1','Q2','Q3')

group by agencyid;

quit;


proc sql;

create table q as

select agencyid , yes_count, numerator, denominator, proportion, avg(proportion) as grp_avg,

from p,

quit;


Also with this code , the average value gets repeated for each row in the table.


Please let me know.


slchen
Lapis Lazuli | Level 10

Try this:

proc sql;

create table p as

select *, avg(proportion) as grp_avg from(select agencyid, count(case answer when 'yes' then 1 else 0 end) as yes_count,

       ((count(case answer when 'yes' then 1 else 0 end))/24) as numerator,

       count(distinct unitid) as denominator, ((calculated numerator/calculated denominator)*100) as proportion

from test1

where questionid in ('Q1','Q2','Q3')

group by agencyid);

quit;

dr2014
Quartz | Level 8

Thanks slchen. It did work. I have one more question. I wanted to have only the first row of the table to mention the value for grp_avg though. Would I need to create an id and then specify that.. id from 2 to eof grp_avg='   '. What would be the best way to accomplish that ?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1543 views
  • 3 likes
  • 3 in conversation