Hello,
I would like to ask how SAS sorts the rows (the levels of a class variable) by default in PROC SUMMARY.
For example, with the code below the rows come out in this order:
10-20K
100K+
20-50K
5-10K
50-100K
Can anyone explain why the rows are in this order?
proc format;
  value FF2T
    0      <- 1000   = '0-1K'
    1000   <- 5000   = '1-5K'
    5000   <- 10000  = '5-10K'
    10000  <- 20000  = '10-20K'
    20000  <- 50000  = '20-50K'
    50000  <- 100000 = '50-100K'
    100000 <- high   = '100K+';
run;

data Sedan_cars;
  set sashelp.cars(where=(Type='Sedan'));
  Invoice_ = put(Invoice, FF2T.);
run;

proc summary data=Sedan_cars nway missing;
  class Type Invoice_;
  var Invoice;
  output out=summaryTbl(drop=_type_) sum= mean= / autoname;
run;
Your formatted value is a character string, so it sorts the way words do in a dictionary: character by character from the left. Numbers stored as text therefore sort as follows, hence the discrepancy:
1
10
100
11
110
2
20
21
3
30
300
31
etc.
To make the order consistent, give every label the same number of digits (two or three) by padding with leading zeros. Good luck.
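The dictionary ordering is easy to reproduce. Here is a minimal sketch, assuming an ASCII-based SAS session; the data set name labels is made up for illustration. It stores the five labels from the question as plain character values and sorts them.

```sas
/* Hypothetical demo data set: the five labels from the question, */
/* stored as ordinary character values.                           */
data labels;
  length label $8;
  input label $;
  datalines;
5-10K
10-20K
20-50K
50-100K
100K+
;
run;

/* PROC SORT compares character values byte by byte from the left. */
proc sort data=labels;
  by label;
run;

proc print data=labels noobs;
run;
```

In ASCII the hyphen sorts before the digits, so 10-20K lands ahead of 100K+ and 5-10K ahead of 50-100K. That is the same order PROC SUMMARY produced in the question.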
Follow Maxim 1 and Read the Documentation, in this case, of the CLASS Statement. Hint: look for "Order".
This was all explained in a thread you started two weeks ago
If you create INVOICE_ as a character variable, it will sort as a character variable (that is, alphabetically), so strings that begin with 1 come before strings that begin with 2. You should not create INVOICE_ as a character variable and then try to jump through hoops to get it to sort numerically.
If you create INVOICE_ as a numeric variable, it will sort as numeric, and sort in the desired order. My reply in that thread shows you exactly how to do it.
Summary: character variables sort alphabetically. Numeric variables will sort numerically. In this case, you want it to be a numeric variable.
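One way to sketch the numeric approach (an illustration, not necessarily the code from the earlier thread) is to keep a numeric copy of Invoice, attach the FF2T. format to it, and let PROC SUMMARY group on the formatted values. This assumes the FF2T format from the question has already been defined. It relies on ORDER=INTERNAL, which is the documented default for the CLASS statement and is written out here only for clarity.

```sas
/* Keep a numeric copy of Invoice instead of converting it with PUT(). */
data Sedan_cars;
  set sashelp.cars(where=(Type='Sedan'));
  Invoice_ = Invoice;          /* still numeric */
  format Invoice_ FF2T.;       /* displayed as '5-10K', '10-20K', ... */
run;

proc summary data=Sedan_cars nway missing;
  /* Groups are formed from the formatted values, while ORDER=INTERNAL */
  /* sorts those groups by the underlying numeric value.               */
  class Type Invoice_ / order=internal;
  var Invoice;
  output out=summaryTbl(drop=_type_) sum= mean= / autoname;
run;
```

Because Invoice_ stays numeric, the price bands should come out in ascending numeric order, with no need to pad the labels.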
If you are going to insist on making character variables, you might try a format whose labels sort in the order that you want, such as:
proc format;
  value FF2T
    0      <- 1000   = '  0-1K'
    1000   <- 5000   = '  1-5K'
    5000   <- 10000  = '  5-10K'
    10000  <- 20000  = ' 10-20K'
    20000  <- 50000  = ' 20-50K'
    50000  <- 100000 = ' 50-100K'
    100000 <- high   = '100K+';
run;
If you use the numeric value directly, you can usually use the internal value to control the order. That choice is taken away with character variables, or at least it behaves quite differently.
Notice that nicely formatted code shows the sort order of the character values: the labels are padded with leading spaces so that the first numbers line up, and a space sorts before 1, 2, 3, etc.