Hello
I have loans raw data.
Each row contain information about new loans that were taken .
The fields are :customer ID , quarter, score,loan amount and interest.
I want to calculate also weighted average of interest .
What is the way to do it with my proc tabulate code below?
I also want to ask what is the way to get same output using proc report?
Data Rawtbl;
input ID Quarter $ score $ Loan interest;
cards;
1 Q1 b 200 3.4
2 Q2 b 400 3.3
3 Q3 c 600 3.2
4 Q4 a 800 3.1
5 Q1 a 300 2.9
6 Q2 b 600 3.0
7 Q3 b 900 3.1
8 Q4 b 10000 3.2
9 Q1 a 20000 2.9
10 Q2 b 1500 2.4
11 Q3 a 1000 2.2
12 Q4 c 1800 3.1
13 Q1 c 500 3.2
14 Q2 c 1000 3.6
15 Q3 c 1500 2.4
16 Q4 b 2000 1.9
;
Run;
proc format;
picture p7r (round) 0-100 = '009.00%' ;
picture p6r (round) 0-100 = '09.99%' ;
run;
proc format;
value $ffmt
a='Score a'
b='Score b'
c='Score c'
;
Run;
proc tabulate data=Rawtbl noseps order=freq formchar=' ';
class score;
var Loan interest;
table score='' ALL,
N='No_Of_Customers'*f=comma21.
PCTN='PCT_No_Of_Customers'*f=p7r.
Loan='SUM_Loan'*SUM=''*f=Dollar21.
Loan='PCT_SUM_Loan'*pctsum=''*f=p7r.
/*I want to add weighted average of interest (weight is bLoan Loan)*/
/box='score' rts=10 misstext=' ' condense;
format score $ffmt.;
Run;
Hi @Ronein,
do I understand correctly that you need something like that:
proc tabulate data=Rawtbl noseps order=freq formchar=' ';
class score;
var Loan ;
var interest / WEIGHT=Loan;
table score='' ALL,
N='No_Of_Customers'*f=comma21.
PCTN='PCT_No_Of_Customers'*f=p7r.
Loan='SUM_Loan'*SUM=''*f=Dollar21.
Loan='PCT_SUM_Loan'*pctsum=''*f=p7r.
interest='Mean_weighted_intr'*MEAN=''*f=best.
/box='score' rts=10 misstext=' ' condense;
format score $ffmt.;
Run;
with result:
score | No_Of_Customers | PCT_No_Of_Customers | SUM_Loan | PCT_SUM_Loan | Mean_weighted_intr |
Score b | 7 | 0.4375 | 15600 | 0.3619 | 2.948076923 |
Score c | 5 | 0.3125 | 5400 | 0.1253 | 3.018518519 |
Score a | 4 | 0.25 | 22100 | 0.5128 | 2.875565611 |
All | 16 | 1 | 43100 | 1 | 2.919721578 |
All the best
Bart
Hi @Ronein
As far as I know, while PROC TABULATE can create some percentages of sums and counts, anything involving a "weight" would apply to all calculations.
Here is an attempt to generate your report through Proc Sql.
You could also use PROC FREQ / SUMMARY and then merges to get this.
proc format;
picture p7r (round) 0-100 = '009.00%' ;
value $ffmt a='Score a'
b='Score b'
c='Score c';
Run;
/* sums and ns */
proc sql;
create table want1 as
select distinct a.Score format=$ffmt.,
a.No_Of_Customers format=comma21.,
a.No_Of_Customers * 100 / b.Total_Of_Customers as Tamhil_No_Of_Customers format =p7r.,
a.SUM_Loan format=Dollar21.,
a.SUM_Loan * 100 / b.Total_SUM_Loan as Tamhil_SUM_Loan format=p7r.
from
(select score, count(score) as No_Of_Customers, sum(Loan) as SUM_Loan
from Rawtbl
group by score) as a
inner join
(select score, count(score) as Total_Of_Customers, sum(Loan) as Total_SUM_Loan
from Rawtbl) as b
on a.score = b. score;
quit;
/* Weighted avg at score level */
proc sql;
create table want2 as
select a.Score format=$ffmt., a._Winterest / b.Total_SUM_Loan as Tamhil_WMEAN_Interest format=8.2
from
(select score, sum(interest * loan) as _Winterest
from Rawtbl
group by score) as a
inner join
(select score, sum(Loan) as Total_SUM_Loan
from Rawtbl
group by score) as b
on a.score = b. score;
quit;
/* Weighted avg at global level */
proc sql;
create table want3 as
select sum(interest * loan) / sum(Loan) as Tamhil_WMEAN_Interest format=8.2
from Rawtbl;
quit;
/* Total row */
proc sql;
create table want4 as
select *
from (select 'Total', sum(No_Of_Customers) format=comma21., sum(Tamhil_No_Of_Customers) format=p7r.,
sum(SUM_Loan) format=Dollar21., sum(Tamhil_SUM_Loan) format=p7r.
from want1 as a inner join want2 as b
on a.score = b.score), (select * from want3);
quit;
/* Final table */
proc sql;
(select a.score, No_Of_Customers, Tamhil_No_Of_Customers,
SUM_Loan, Tamhil_SUM_Loan, Tamhil_WMEAN_Interest
from want1 as a inner join want2 as b
on a.score = b.score)
union
(select * from want4)
order by Tamhil_No_Of_Customers;
quit;
Here is another approach:
proc tabulate data=Rawtbl noseps order=freq formchar=' ' out=_temp1 (drop=_:);
class score;
var Loan;
table score='' ALL,
N='No_Of_Customers'*f=comma21.
PCTN='Tamhil_No_Of_Customers'*f=p7r.
Loan='SUM_Loan'*SUM=''*f=Dollar21.
Loan='Tamhil_SUM_Loan'*pctsum=''*f=p7r.;
run;
proc tabulate data=Rawtbl noseps order=freq formchar=' ' out=_temp2 (drop=_:);
class score;
var Loan interest;
table score='' ALL, interest * mean;
weight loan;
run;
proc sort data=_temp1; by score; run;
proc sort data=_temp2; by score; run;
data want;
merge _temp1 _temp2;
by score;
format score $ffmt. N comma21. PctN_0 p7r. Loan_Sum Dollar21. Loan_PctSum_0_Loan p7r. Interest_Mean 8.2;
flag = 0;
if score="" then flag = 1;
run;
proc sort data=want;
by flag score;
run;
proc print data=want (drop=flag) noobs;
run;
Thank you.
Here is summary of all ways that were given in this post.
/****Task:Calculate:
No loans,
sum loans,
PCT No_Loans,
PCT SUM_Loans,
weighted average of interest****/
Data Rawtbl;
input ID Quarter $ score $ Loan interest;
cards;
1 Q1 b 200 3.4
2 Q2 b 400 3.3
3 Q3 c 600 3.2
4 Q4 a 800 3.1
5 Q1 a 300 2.9
6 Q2 b 600 3.0
7 Q3 b 900 3.1
8 Q4 b 10000 3.2
9 Q1 a 20000 2.9
10 Q2 b 1500 2.4
11 Q3 a 1000 2.2
12 Q4 c 1800 3.1
13 Q1 c 500 3.2
14 Q2 c 1000 3.6
15 Q3 c 1500 2.4
16 Q4 b 2000 1.9
;
Run;
proc format;
picture p7r (round) 0-100 = '009.00%' ;
picture p6r (round) 0-100 = '09.99%' ;
run;
proc format;
value $ffmt
a='Score a'
b='Score b'
c='Score c'
;
Run;
/**Way1*********************************/
/**Way1*********************************/
/**Way1*********************************/
/**Way1*********************************/
/**Way1*********************************/
/*when we use weight in proc tabulate then for all calculations of mean it will be calculated as weight mean.
Since in this example we only calculate one mean then the other calculations (N,SUM,PCTN,PCTSUM) will not be effected by the weights.
This is the reason that we can get our desired table just in one step of proc tabulate.*/
proc tabulate data=Rawtbl noseps order=freq formchar=' ';
class score;
var Loan ;
var interest / WEIGHT=Loan;
table score='' ALL,
N='No_Of_Customers'*f=comma21.
PCTN='PCT_No_Of_Customers'*f=p7r.
Loan='SUM_Loan'*SUM=''*f=Dollar21.
Loan='PCT_SUM_Loan'*pctsum=''*f=p7r.
interest='Mean_weighted_intr'*MEAN=''*f=best.
/box='score' rts=10 misstext=' ' condense;
format score $ffmt.;
Run;
/*********Way2-using proc summary and proc tabulate*************/
/*********Way2-using proc summary and proc tabulate*************/
/*********Way2-using proc summary and proc tabulate*************/
/*********Way2-using proc summary and proc tabulate*************/
/*********Way2-using proc summary and proc tabulate*************/
proc summary data=rawtbl;
class score;
var interest/weight=loan;
output out=tbl1 mean=;
run;
ods select none;
proc tabulate data=Rawtbl noseps order=freq formchar=' ' out=tbl2 (drop=_:);
class score;
var Loan;
table score='' ALL,
N='No_Of_Loans'*f=comma21.
PCTN='PCT_No_Of_Loans'*f=p7r.
Loan='SUM_Loan'*SUM=''*f=Dollar21.
Loan='PCT_SUM_Loan'*pctsum=''*f=p7r.;
run;
ods select all;
PROC SQL;
create table want as
select a.*,b.interest as Weighted_AVG_interest
from tbl2 as a
left join tbl1 as b
on a.score=b.score
;
QUIT;
proc print data=want noobs;
run;
/*********Way3-using 2 proc tabulate*************/
/*********Way3-using 2 proc tabulate*************/
/*********Way3-using 2 proc tabulate*************/
/*********Way3-using 2 proc tabulate*************/
/*********Way3-using 2 proc tabulate*************/
/*PROC TABULATE:anything involving "weight" would apply to all calculations*/
ods select none;
proc tabulate data=Rawtbl noseps order=freq formchar=' ' out=t1 (drop=_:);
class score;
var Loan;
table score='' ALL,
N='No_Of_Loans'*f=comma21.
PCTN='PCT_No_Of_Loans'*f=p7r.
Loan='SUM_Loan'*SUM=''*f=Dollar21.
Loan='PCT_SUM_Loan'*pctsum=''*f=p7r.;
run;
ods select all;
ods select none;
proc tabulate data=Rawtbl noseps order=freq formchar=' ' out=t2 (drop=_:);
class score;
var Loan interest;
table score='' ALL, interest * mean;
weight loan;
run;
ods select all;
proc sort data=t1; by score; run;
proc sort data=t2; by score; run;
data t3;
merge t1 t2;
by score;
format score $ffmt.
N comma21.
PctN_0 p7r.
Loan_Sum Dollar21.
Loan_PctSum_0_Loan p7r.
Interest_Mean 8.2;
flag = 0;
if score='' then flag=1;
run;
proc sort data=t3;
by flag score;
run;
Data want;
SET t3;
IF score='' then score='TOTAL';
drop flag;
Run;
proc print data=want noobs;
run;
/****************Way4-proc sql********************/
/****************Way4-proc sql********************/
/****************Way4-proc sql********************/
/****************Way4-proc sql********************/
/****************Way4-proc sql********************/
PROC SQL;
create table tbl1 as
select score,
count(*) as No_Of_Loans format=comma21.,
sum(Loan) as SUM_Loan format=Dollar21.
from Rawtbl
group by score
;
QUIT;
PROC SQL;
create table tbl2 as
select 'Total' as score,
count(*) as No_Of_Loans format=comma21.,
sum(Loan) as SUM_Loan format=Dollar21.
from Rawtbl
;
QUIT;
Data tbl3;
Set tbl1 tbl2;
Run;
PROC SQL;
create table tbl4 as
select a.*,
a.No_Of_Loans/b.No_Of_Loans as PCT_PCT_No_Of_Loans format=percent9.1,
a.SUM_Loan/b.SUM_Loan as PCT_PCT_SUM_Loan format=percent9.1
from tbl3 as a ,tbl2 as b
;
QUIT;
proc sql;
create table tbl5 as
select score,
sum(interest*Loan)/sum(Loan) as Weighted_AVG_interest
from Rawtbl
group by score
;
QUIT;
proc sql;
create table tbl6 as
select 'Total' as score,
sum(interest*Loan)/sum(Loan) as Weighted_AVG_interest
from Rawtbl
;
QUIT;
Data tbl7;
set tbl5 tbl6;
Run;
PROC SQL;
create table wanted as
select a.*,b.Weighted_AVG_interest
from tbl4 as a
left join tbl7 as b
on a.score=b.score
;
QUIT;
title;
proc print data=wanted noobs;Run;
/****************Way5-proc sql********************/
/****************Way5-proc sql********************/
/****************Way5-proc sql********************/
/****************Way5-proc sql********************/
/****************Way5-proc sql********************/
/* sums and ns */
proc sql;
create table t1 as
select distinct a.Score format=$ffmt.,
a.No_Of_Loans format=comma21.,
a.No_Of_Loans * 100 / b.Total_Of_Customers as PCT_No_Of_Loans format =p7r.,
a.SUM_Loan format=Dollar21.,
a.SUM_Loan * 100 / b.Total_SUM_Loan as PCT_SUM_Loan format=p7r.
from
(select score, count(score) as No_Of_Loans, sum(Loan) as SUM_Loan
from Rawtbl
group by score) as a
inner join
(select score, count(score) as Total_Of_Customers, sum(Loan) as Total_SUM_Loan
from Rawtbl) as b
on a.score = b. score;
quit;
/* Weighted avg at score level */
proc sql;
create table t2 as
select a.Score format=$ffmt., a._Winterest / b.Total_SUM_Loan as PCT_WMEAN_Interest format=8.2
from
(select score, sum(interest * loan) as _Winterest
from Rawtbl
group by score) as a
inner join
(select score, sum(Loan) as Total_SUM_Loan
from Rawtbl
group by score) as b
on a.score = b. score;
quit;
/* Weighted avg at global level */
proc sql;
create table t3 as
select sum(interest * loan) / sum(Loan) as PCT_WMEAN_Interest format=8.2
from Rawtbl;
quit;
/* Total row */
proc sql;
create table t4 as
select *
from (select 'Total', sum(No_Of_Loans) format=comma21., sum(PCT_No_Of_Loans) format=p7r.,
sum(SUM_Loan) format=Dollar21., sum(PCT_SUM_Loan) format=p7r.
from want1 as a inner join want2 as b
on a.score = b.score), (select * from want3);
quit;
/* Final table */
proc sql;
create table wanted as
(select a.score, No_Of_Loans, PCT_No_Of_Loans,
SUM_Loan, PCT_SUM_Loan, PCT_WMEAN_Interest
from want1 as a inner join want2 as b
on a.score = b.score)
union
(select * from want4)
order by PCT_No_Of_Loans;
quit;
title;
proc print data=wanted noobs;Run;
Hi @Ronein
OK. So please have a look at the approach which combines the results of 2 proc tabulate.
Best,
@Ronein wrote:
Here is summary of all ways that were given in this post.I just want to know how to it also using proc report?
Compute weighted statistics in PROC SUMMARY, where you can apply weights to individual variables if you want, and where missings are also properly handled. Then either PROC TABULATE or PROC REPORT can create a nice table for you.
proc summary data=rawtbl nway;
class quarter score;
var interest/weight=loan;
output out=_stats_ mean=;
run;
proc report data=_stats_;
columns score quarter,interest;
define score/group;
define quarter/across;
define interest/sum;
Run;
Seems quite a bit easier than doing this in SQL.
It also appears to me that the SQL solutions given above by @ed_sas_member and @Ronein fail and give the wrong answers if there are missing values in the data.
Hi @Ronein,
do I understand correctly that you need something like that:
proc tabulate data=Rawtbl noseps order=freq formchar=' ';
class score;
var Loan ;
var interest / WEIGHT=Loan;
table score='' ALL,
N='No_Of_Customers'*f=comma21.
PCTN='PCT_No_Of_Customers'*f=p7r.
Loan='SUM_Loan'*SUM=''*f=Dollar21.
Loan='PCT_SUM_Loan'*pctsum=''*f=p7r.
interest='Mean_weighted_intr'*MEAN=''*f=best.
/box='score' rts=10 misstext=' ' condense;
format score $ffmt.;
Run;
with result:
score | No_Of_Customers | PCT_No_Of_Customers | SUM_Loan | PCT_SUM_Loan | Mean_weighted_intr |
Score b | 7 | 0.4375 | 15600 | 0.3619 | 2.948076923 |
Score c | 5 | 0.3125 | 5400 | 0.1253 | 3.018518519 |
Score a | 4 | 0.25 | 22100 | 0.5128 | 2.875565611 |
All | 16 | 1 | 43100 | 1 | 2.919721578 |
All the best
Bart
Great!
As I understand when we use weight in proc tabulate then for all calculations of mean it will be calculated as weight mean.
Since in this example we only calculate one mean then the other calculations (N,SUM,PCTN,PCTSUM) will not be effected by the weights.
This is the reason that we can get our desired table just in one step of proc tabulate.
Am I correct???
Hi @Ronein ,
if you need both weighted and not weighted version you can always use a view:
data RawtblView / view = RawtblView;
set Rawtbl;
interest2 = interest;
run;
proc tabulate data=RawtblView noseps order=freq formchar=' ';
class score;
var Loan interest;
var interest2 / WEIGHT=Loan;
table score='' ALL,
N='No_Of_Customers'*f=comma21.
PCTN='PCT_No_Of_Customers'*f=p7r.
Loan='SUM_Loan'*SUM=''*f=Dollar21.
Loan='PCT_SUM_Loan'*pctsum=''*f=p7r.
interest='Mean_intr'*MEAN=''*f=best.
interest2='Mean_weighted_intr'*MEAN=''*f=best.
/box='score' rts=10 misstext=' ' condense;
format score $ffmt.;
Run;
All the best
Bart
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.