Hello
Is there a better way with less code to calculated the following for each team category:
Number of customers
Number of bad customers
Number of good customers
Percent of bad customers in each category
proprtion of bad customers from all bad customers
proprtion of good customers from all good customers
Please provide solution in the following ways:
1-Proc sql
2-proc report
3-proc tabulate
4-proc summary
Data tbl;
Input ID Team Ind_Fail;
Cards;
1 a 0
2 a 1
3 b 0
4 b 1
5 a 0
6 b 0
7 a 1
8 a 0
9 b 1
;
Run;
Proc sql;
Create table as wanted_a as
Select Team,
Count(*) as nr_customers,
Sum(Ind_Fail) as nr_bads,
Sum(calculated nr_customers,- calculated nr_fails) as Nr_goods ,
Calculated nr_bads/ Calculated nr_customers as PCT_bads
From tbl
Group by Team
;
Quit;
Proc sql;
Create table as wanted_b as
Select Count(*) as Total_nr_customers,
Sum(Ind_Fail) as Total_nr_bads,
Sum(calculated Total_nr_customers,- calculated Total_nr_fails) as Total_Nr_goods
From tbl
Group by Team
;
Quit;
Data wanted_c;
Merge wanted_a (in=a) wanted_b(in=b);
proportion_good=Nr_goods/Total_Nr_goods;
proportion_bads=nr_bads/ Total_Nr_bads;
Format proportion_good proportion_bads comma8.2;
Run;
Data tbl;
Input ID Team $ Ind_Fail;
Cards;
1 a 0
2 a 1
3 b 0
4 b 1
5 a 0
6 b 0
7 a 1
8 a 0
9 b 1
;
Run;
proc sql;
create table want as
select Team,
count(*) as n ,
calculated n/(select count(*) from tbl) as percent format=percent7.2 ,
sum(Ind_Fail=1) as n_bad ,
sum(Ind_Fail=0) as n_good ,
sum(Ind_Fail=1)/(select sum(Ind_Fail=1) from tbl ) as bad_percent,
sum(Ind_Fail=0)/(select sum(Ind_Fail=0) from tbl ) as good_percent,
log(calculated bad_percent/calculated good_percent) as woe
from tbl
group by Team;
quit;
Compare your first sql to next fixed code:
Data tbl;
Input ID Team $ Ind_Fail;
Cards;
1 a 0
2 a 1
3 b 0
4 b 1
5 a 0
6 b 0
7 a 1
8 a 0
9 b 1
;
Run;
Proc sql;
Create table wanted_a as
Select Team,
Count(*) as nr_customers,
Sum(Ind_Fail) as nr_bads,
Sum(calculated nr_customers,- calculated nr_bads) as Nr_goods ,
Calculated nr_bads/ Calculated nr_customers as PCT_bads
From tbl
Group by Team
;
Quit;
@Ronein wrote:
Thank you, however in this code there is no calculation of :proportion_good ,proportion_bads.
I have added the required:
Proc sql;
Create table wanted_a as
Select Team,
Count(*) as nr_customers,
Sum(Ind_Fail) as nr_bads,
Sum(calculated nr_customers,- calculated nr_bads) as Nr_goods ,
Calculated nr_bads/ Calculated nr_customers as PCT_bads format percent5.2
From tbl
Group by Team
;
create table wanted_b as
select sum(nr_bads) / sum(nr_customers) as proportion_bads format 4.3,
1- calculated proportion_bads as proportion_good format 4.3
from wanted_a
;
create table want_c as
select a.*, b.*
from wanted_a as a , wanted_b as b
;
Quit;
Data tbl;
Input ID Team $ Ind_Fail;
Cards;
1 a 0
2 a 1
3 b 0
4 b 1
5 a 0
6 b 0
7 a 1
8 a 0
9 b 1
;
Run;
proc sql;
create table want as
select Team,
count(*) as n ,
calculated n/(select count(*) from tbl) as percent format=percent7.2 ,
sum(Ind_Fail=1) as n_bad ,
sum(Ind_Fail=0) as n_good ,
sum(Ind_Fail=1)/(select sum(Ind_Fail=1) from tbl ) as bad_percent,
sum(Ind_Fail=0)/(select sum(Ind_Fail=0) from tbl ) as good_percent,
log(calculated bad_percent/calculated good_percent) as woe
from tbl
group by Team;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.