Hello
I want to create summary report where for each category I will calculate number of observations and sum of X.
It is important for me to control the order of the categories in the output (same order as it appear in proc format)
Please see below to ways (proc sql and proc report) that didn't work well because of wrong order of categories
I want to mention the in this example the categorical field and the summary field are same fields (X).
DATA RawData;
do i = 1 to 100000;
x =5+1000*rannor(3452083);
output;
end;
RUN;
proc format;
value ffmt
low -< -4000 = 'Lower than -4K'
-4000 -< -3000 = '[-4,-3)'
-3000 -< -2000 = '[-3,-2)'
-2000 -< -1000 = '[-2,-1)'
-1000 -< 0 = '[-1,0)'
0 -< 1000 = '[0,1)'
1000 -< 2000 = '[1,2)'
2000 -< 3000 = '[2,3)'
3000 -< 4000 = '[3,4)'
4000 - high = '[4+'
;
run;
/*Way1*/
/*Order of X_Cat is not as I want here*/
PROC SQL;
create table summary1 as
select put(x,ffmt.)as X_Cat,
count(*) as No_Customers,
sum(x) as Sum_X
from tbl1
group by calculated X_Cat
;
QUIT;
/*Way2*/
/*proc report summarize by group*/
/*Order of X_Cat is not as I want here*/
data bbb;
set tbl1;
x_cat=put(x,ffmt.);
run;
title;
footnote;
proc report data=bbb nowd;
column x_cat X n;
define x_cat / group;
define n / "No_Customers" f=comma12.;
define X / sum "Sum_X" f=comma12.;
define n / "No_laks" f=comma12.;
run;
There are a number of areas of improvement.
You are using formats wrong. You are using them in a PUT statement to turn numbers into characters, this is entirely unnecessary and causes problems. The ordering of characters depends on how these characters sort in some sorting sequence, and may or may not give you the sorting that makes sense visually (in this case, the sorting is not an order that makes sense).
But if you leave the numbers as numbers (even if they are formatted — please note that formatted numbers are numbers even though your eyes see them as text), then SAS works with the numbers and these sort properly and in an order that makes sense, even when displayed visually to us humans using the formats.
In PROC REPORT you use the option ORDER=INTERNAL to make sure the categories appear in the ordering of the numbers ( a number < -4000 comes before numbers between -4000 and -3000, etc.), which is what you want
proc format;
value ffmt
low -< -4000 = 'Lower than -4K'
-4000 -< -3000 = '[-4,-3)'
-3000 -< -2000 = '[-3,-2)'
-2000 -< -1000 = '[-2,-1)'
-1000 -< 0 = '[-1,0)'
0 -< 1000 = '[0,1)'
1000 -< 2000 = '[1,2)'
2000 -< 3000 = '[2,3)'
3000 -< 4000 = '[3,4)'
4000 - high = '[4+'
;
run;
DATA RawData;
do i = 1 to 100000;
x =5+1000*rannor(3452083);
output;
end;
format x ffmt.;
RUN;
proc report data=rawdata nowd;
column X n;
define x / group order=internal;
run;
Perfect!
If I want also to calculate also:
1-sum of X for each category of X
2-PCT of SUM of X for each category of X
What is the way to do it please?
Please note that in this case X is both a categorical variable (We have different groups of X) and a continuous variable(Sum_X)
Thanks
Joe
@Ronein wrote:
Perfect!
If I want also to calculate also:
1-sum of X for each category of X
2-PCT of SUM of X for each category of X
What is the way to do it please?
Please note that in this case X is both a categorical variable (We have different groups of X) and a continuous variable(Sum_X)
Thanks
Joe
Answered in your other thread at https://communities.sas.com/t5/SAS-Programming/PROC-REPORT/td-p/583726
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.