Hello
In the following code I want to calculate PCT of Sum Loan.
The problem is that I want to get calculation for each quarter_year value separately.
So....Total for each quarter_year will be 100%.
Data rawdata;
input ID quarter_year$ year Sum_Loan sum_profit score;
cards;
1 Q1-2017 2017 10 1 2
2 Q1-2017 2017 20 2 2
3 Q1-2017 2017 30 3 3
4 Q1-2017 2017 40 4 3
5 Q1-2017 2017 50 5 4
6 Q2-2017 2017 60 6 2
7 Q2-2017 2017 70 7 4
8 Q2-2017 2017 80 8 3
9 Q3-2017 2017 90 9 4
10 Q3-2017 2017 100 1 2
11 Q3-2017 2017 110 2 3
12 Q3-2017 2017 120 3 3
13 Q3-2017 2017 130 4 3
14 Q3-2017 2017 140 5 3
15 Q3-2017 2017 150 6 4
16 Q3-2017 2017 160 7 2
17 Q3-2017 2017 170 8 4
18 Q3-2017 2017 180 9 2
19 Q3-2017 2017 190 10 2
20 Q3-2017 2017 200 11 2
;
run;
proc format library=work;
value $qy (multilabel notsorted)
'Q1-2017'='Q1-2017'
'Q2-2017'='Q2-2017'
'Q3-2017'='Q3-2017'
'Q4-2017'='Q4-2017'
;
run;
proc tabulate data=rawdata;
class score quarter_year /mlf;
format quarter_year $qy.;
var Sum_Loan sum_profit;
table score='',quarter_year*sum_loan=''* sum='Sum of loans'*f=best8.
quarter_year*sum_loan=''* N='Number of loans'*f=best8.
quarter_year*sum_loan=''* PCTSUM='PCT of sum loans'*f=best8.
/row=float box='Statistic'
;
run;
As I showed you in my first post, you can do it in one SQL:
proc sql;
create table want as
select
*,
sum_loan / sum(sum_loan) format=percent7.2 as pct_sum_loan
from rawdata
group by quarter_year
order by id;
quit;
proc tabulate data=want;
class score quarter_year;
var Sum_Loan sum_profit pct_sum_loan;
table score='',quarter_year*sum_loan=''* sum='Sum of loans'*f=best8.
quarter_year*sum_loan=''* N='Number of loans'*f=best8.
quarter_year*sum_profit=''* N='Sum of profit'*f=best8.
quarter_year*pct_sum_loan=''*SUM='PCT of sum loans'*f=percent7.2./row=float box='Statistic';
run;
Note that your format just reformats all values to themselves, and is therefore useless.
Please test example data code before posting. I had to remove a lot of tabs in the datalines block to make it work.
Hint: set your Enhanced Editor to replace tabs with spaces.
You only need a SQL for this:
Data rawdata;
input ID quarter_year$ year Sum_Loan sum_profit score;
cards;
1 Q1-2017 2017 10 1 2
2 Q1-2017 2017 20 2 2
3 Q1-2017 2017 30 3 3
4 Q1-2017 2017 40 4 3
5 Q1-2017 2017 50 5 4
6 Q2-2017 2017 60 6 2
7 Q2-2017 2017 70 7 4
8 Q2-2017 2017 80 8 3
9 Q3-2017 2017 90 9 4
10 Q3-2017 2017 100 1 2
11 Q3-2017 2017 110 2 3
12 Q3-2017 2017 120 3 3
13 Q3-2017 2017 130 4 3
14 Q3-2017 2017 140 5 3
15 Q3-2017 2017 150 6 4
16 Q3-2017 2017 160 7 2
17 Q3-2017 2017 170 8 4
18 Q3-2017 2017 180 9 2
19 Q3-2017 2017 190 10 2
20 Q3-2017 2017 200 11 2
;
run;
proc sql;
create table want as
select
id,
quarter_year,
sum_loan,
sum_loan / sum(sum_loan) format=percent7.2 as pct_sum_loan
from rawdata
group by quarter_year
order by id;
quit;
proc print data= want noobs;
run;
Result:
quarter_ pct_sum_ ID year Sum_Loan loan 1 Q1-2017 10 6.67% 2 Q1-2017 20 13.3% 3 Q1-2017 30 20.0% 4 Q1-2017 40 26.7% 5 Q1-2017 50 33.3% 6 Q2-2017 60 28.6% 7 Q2-2017 70 33.3% 8 Q2-2017 80 38.1% 9 Q3-2017 90 5.17% 10 Q3-2017 100 5.75% 11 Q3-2017 110 6.32% 12 Q3-2017 120 6.90% 13 Q3-2017 130 7.47% 14 Q3-2017 140 8.05% 15 Q3-2017 150 8.62% 16 Q3-2017 160 9.20% 17 Q3-2017 170 9.77% 18 Q3-2017 180 10.3% 19 Q3-2017 190 10.9% 20 Q3-2017 200 11.5%
Thank you.
I want to learn to do it via proc tabulate.
I want to understand PCTSUM option better and calculate it per group and not per global sum
This solution is good but my question was how to do it without creating another raw data table (Help1)
Data rawdata;
input ID quarter_year$ year Sum_Loan sum_profit score;
cards;
1 Q1-2017 2017 10 1 2
2 Q1-2017 2017 20 2 2
3 Q1-2017 2017 30 3 3
4 Q1-2017 2017 40 4 3
5 Q1-2017 2017 50 5 4
6 Q2-2017 2017 60 6 2
7 Q2-2017 2017 70 7 4
8 Q2-2017 2017 80 8 3
9 Q3-2017 2017 90 9 4
10 Q3-2017 2017 100 1 2
11 Q3-2017 2017 110 2 3
12 Q3-2017 2017 120 3 3
13 Q3-2017 2017 130 4 3
14 Q3-2017 2017 140 5 3
15 Q3-2017 2017 150 6 4
16 Q3-2017 2017 160 7 2
17 Q3-2017 2017 170 8 4
18 Q3-2017 2017 180 9 2
19 Q3-2017 2017 190 10 2
20 Q3-2017 2017 200 11 2
;
run;
proc format library=work;
value $qy (multilabel notsorted)
'Q1-2017'='Q1-2017'
'Q2-2017'='Q2-2017'
'Q3-2017'='Q3-2017'
'Q4-2017'='Q4-2017'
'Q1-2018'='Q1-2018'
'Q2-2018'='Q2-2018'
'Q3-2018'='Q3-2018'
'Q4-2018'='Q4-2018'
;
run;
proc sql;
create table Help1 as
select quarter_year,
sum(sum_loan) as Total_sum_loan_per_quarter_year
from rawdata
group by quarter_year
;
quit;
PROC SQL;
create table want as
select a.*,
b.Total_sum_loan_per_quarter_year,
a.sum_loan/b.Total_sum_loan_per_quarter_year*100 as pct_sum_loan
from rawdata as a
left join Help1 as b
on a.quarter_year=b.quarter_year
;
QUIT;
proc tabulate data=want;
class score quarter_year /mlf;
format quarter_year $qy.;
var Sum_Loan sum_profit pct_sum_loan;
table score='',quarter_year*sum_loan=''* sum='Sum of loans'*f=best8.
quarter_year*sum_loan=''* N='Number of loans'*f=best8.
quarter_year*sum_profit=''* N='Sum of profit'*f=best8.
quarter_year*pct_sum_loan=''*SUM='PCT of sum loans'*f=best8./row=float box='Statistic';
run;
/*Not good!!!!!!!!!!!!!!!!-Why???????????*/
/*proc tabulate data=rawdata;*/
/* class score quarter_year /mlf;*/
/* format quarter_year $qy.;*/
/* var Sum_Loan sum_profit;*/
/* table score='',quarter_year*sum_loan=''* sum='Sum of loans'*f=best8.*/
/* quarter_year*sum_loan=''* N='Number of loans'*f=best8.*/
/* quarter_year*sum_loan=''* PCTSUM='PCT of sum loans'*f=best8.*/
/**/
/*/row=float box='Statistic'*/
/* ;*/
/*run;*/
As I showed you in my first post, you can do it in one SQL:
proc sql;
create table want as
select
*,
sum_loan / sum(sum_loan) format=percent7.2 as pct_sum_loan
from rawdata
group by quarter_year
order by id;
quit;
proc tabulate data=want;
class score quarter_year;
var Sum_Loan sum_profit pct_sum_loan;
table score='',quarter_year*sum_loan=''* sum='Sum of loans'*f=best8.
quarter_year*sum_loan=''* N='Number of loans'*f=best8.
quarter_year*sum_profit=''* N='Sum of profit'*f=best8.
quarter_year*pct_sum_loan=''*SUM='PCT of sum loans'*f=percent7.2./row=float box='Statistic';
run;
Note that your format just reformats all values to themselves, and is therefore useless.
Try keyword colPctSum.
Data rawdata;
input ID quarter_year$ year Sum_Loan sum_profit score;
cards;
1 Q1-2017 2017 10 1 2
2 Q1-2017 2017 20 2 2
3 Q1-2017 2017 30 3 3
4 Q1-2017 2017 40 4 3
5 Q1-2017 2017 50 5 4
6 Q2-2017 2017 60 6 2
7 Q2-2017 2017 70 7 4
8 Q2-2017 2017 80 8 3
9 Q3-2017 2017 90 9 4
10 Q3-2017 2017 100 1 2
11 Q3-2017 2017 110 2 3
12 Q3-2017 2017 120 3 3
13 Q3-2017 2017 130 4 3
14 Q3-2017 2017 140 5 3
15 Q3-2017 2017 150 6 4
16 Q3-2017 2017 160 7 2
17 Q3-2017 2017 170 8 4
18 Q3-2017 2017 180 9 2
19 Q3-2017 2017 190 10 2
20 Q3-2017 2017 200 11 2
;
run;
proc format library=work;
value $qy (multilabel notsorted)
'Q1-2017'='Q1-2017'
'Q2-2017'='Q2-2017'
'Q3-2017'='Q3-2017'
'Q4-2017'='Q4-2017'
;
run;
proc tabulate data=rawdata;
class score quarter_year /mlf;
format quarter_year $qy.;
var Sum_Loan sum_profit;
table score='',quarter_year*sum_loan=''* sum='Sum of loans'*f=best8.
quarter_year*sum_loan=''* N='Number of loans'*f=best8.
quarter_year*sum_loan=''* colPCTSUM='PCT of sum loans'*f=best8.
/row=float box='Statistic'
;
run;
I think this may be what you are looking for, I did add row and column totals that you may not want.
proc tabulate data=rawdata;
class score quarter_year /mlf;
var Sum_Loan sum_profit;
table score='' all='Total',
quarter_year='Quarter'*
(sum_loan=''* sum='Loan sum'*f=best8. sum_loan=''*PCTSUM<quarter_year>='Pct Quarter' sum_loan=''* N='Number of loans'*f=best8.)
sum_loan=''* sum='Total Loan sum'*f=best8. sum_loan=''*PCTSUM='Total Pct Quarter' sum_loan=''* N='Total Number of loans'*f=best8.
/row=float box='Statistic' ;
run;
Which will give you this:
Statistic | Quarter | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Q1-2017 | Q2-2017 | Q3-2017 | ||||||||||
Loan sum | Pct Quarter | Number of loans | Loan sum | Pct Quarter | Number of loans | Loan sum | Pct Quarter | Number of loans | Total Loan sum | Total Pct Quarter | Total Number of loans |
|
2 | 30 | 3.26 | 2 | 60 | 6.52 | 1 | 830 | 90.22 | 5 | 920 | 43.81 | 8 |
3 | 70 | 10.77 | 2 | 80 | 12.31 | 1 | 500 | 76.92 | 4 | 650 | 30.95 | 7 |
4 | 50 | 9.43 | 1 | 70 | 13.21 | 1 | 410 | 77.36 | 3 | 530 | 25.24 | 5 |
Total | 150 | 7.14 | 5 | 210 | 10.00 | 3 | 1740 | 82.86 | 12 | 2100 | 100.00 | 20 |
Thank you but here the PCT is not per group
Can you please explain the using of symbol "<" in proc tabulate?
@Ronein wrote:
Can you please explain the using of symbol "<" in proc tabulate?
Look at the section called
Percect!
This is the solution I searched for
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.