Hi, I have a data pretty much like below:
Type | Year | A | B | C |
1 | 2011 | 0 | 1 | 0 |
1 | 2011 | 1 | 0 | 1 |
1 | 2012 | 1 | 1 | 1 |
1 | 2012 | 0 | 1 | 0 |
2 | 2011 | 0 | 0 | 0 |
2 | 2011 | 1 | 0 | 1 |
2 | 2012 | 1 | 0 | 1 |
2 | 2012 | 1 | 1 | 1 |
And I want to compute the percentage for A, B, C by each group Type and Year, the result is pretty much like below:
Type | Year | A percent | B percent | C percent |
1 | 2011 | 50% | 50% | 50% |
1 | 2012 | 50% | 100% | 50% |
2 | 2011 | 50% | 0% | 50% |
2 | 2012 | 100% | 50% | 100% |
Do you know how to do it? When I used the below code, the result was different from what I expected, it had lots of years for each type.
proc sql;
create table rate as
select type, year, A, B, C, sum(A)/count(A) as pctA,
sum(B)/count(B) as pctB, sum(C)/count(C) as pctC
from (select type, year, A, B, C from one group by type, year)
group by type, year;
quit;
Could anyone help me with it? Thank you!
Proc means is probably more efficient.
For binary values, the mean is equivalent to the percentage mathematically.
The SQL doesn't work because A, B, C are included in the SELECT statement. The log should have a message indicating that this means it's merging the table with the summary rather than just having the summary data. The SQL is also redundant, the subquery isn't required at all.
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 68 69 proc sql; 70 create table rate as 71 select type, year, A, B, C, sum(A)/count(A) as pctA, 72 sum(B)/count(B) as pctB, sum(C)/count(C) as pctC 73 from (select type, year, A, B, C from have group by type, year) 74 group by type, year; NOTE: A GROUP BY clause has been discarded because neither the SELECT clause nor the optional HAVING clause of the associated table-expression referenced a summary function. NOTE: The query requires remerging summary statistics back with the original data. NOTE: Table WORK.RATE created, with 8 rows and 8 columns.
Here are two ways to do this, noting that proc means will scale much easier if you have a lot of variables.
proc means data=have MEAN NWAY;
class type year;
var a b c;
output out=want mean= /autoname;
run;
proc print data=want noobs;
format a_mean b_mean c_mean percent12.1;
run;
proc sql;
create table rate as
select type, year, sum(A)/count(A) as pctA,
sum(B)/count(B) as pctB, sum(C)/count(C) as pctC
from have
group by type, year;
quit;
Proc means is probably more efficient.
For binary values, the mean is equivalent to the percentage mathematically.
The SQL doesn't work because A, B, C are included in the SELECT statement. The log should have a message indicating that this means it's merging the table with the summary rather than just having the summary data. The SQL is also redundant, the subquery isn't required at all.
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 68 69 proc sql; 70 create table rate as 71 select type, year, A, B, C, sum(A)/count(A) as pctA, 72 sum(B)/count(B) as pctB, sum(C)/count(C) as pctC 73 from (select type, year, A, B, C from have group by type, year) 74 group by type, year; NOTE: A GROUP BY clause has been discarded because neither the SELECT clause nor the optional HAVING clause of the associated table-expression referenced a summary function. NOTE: The query requires remerging summary statistics back with the original data. NOTE: Table WORK.RATE created, with 8 rows and 8 columns.
Here are two ways to do this, noting that proc means will scale much easier if you have a lot of variables.
proc means data=have MEAN NWAY;
class type year;
var a b c;
output out=want mean= /autoname;
run;
proc print data=want noobs;
format a_mean b_mean c_mean percent12.1;
run;
proc sql;
create table rate as
select type, year, sum(A)/count(A) as pctA,
sum(B)/count(B) as pctB, sum(C)/count(C) as pctC
from have
group by type, year;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.