BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ronein
Meteorite | Level 14

Hello

In the following report that is created by proc summary I want to control the order of the categories (levels) of  invoice   .

I want that the order will be same as mentioned in format $InvoiceorderFmt (in right side)

what is the way to do it please?

 

proc format;
value InvoiceFmt
0 <- 1000='0-1000'
1000 <- 5000='1,000-5,000'
5000 <- 10000='5,000-1,0000'
10000 <- 20000='1,0000-2,0000'
20000 <- 50000='20,000-50,000'
50000 <- 100000='50,000-100,000'
100000 <- high='100,000+';
value $InvoiceorderFmt
'0-1000'='1'
'1,000-5,000'='2'
'5,000-1,0000'='3'
'1,0000-2,0000'='4'
'20,000-50,000'='5'
'50,000-100,000'='6'
'100,000+'='7'
;
run;

Data cars;
set sashelp.cars;
Invoice_=put(Invoice,InvoiceFmt.);
run;

proc summary data=cars  nway missing ;
class Type Invoice_ ;
var Invoice;
output out=summaryTbl(drop=_type_) sum= mean= /autoname;
run;

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Jagadishkatam
Amethyst | Level 16

Please try the completetypes and preloadfmt option

 

proc summary data=cars  nway missing completetypes;
format Invoice InvoiceFmt.;
class Type Invoice / preloadfmt;
var Invoice;
output out=summaryTbl2(drop=_type_) sum= mean= /autoname;
run;
Thanks,
Jag

View solution in original post

9 REPLIES 9
Jagadishkatam
Amethyst | Level 16

Please try the completetypes and preloadfmt option

 

proc summary data=cars  nway missing completetypes;
format Invoice InvoiceFmt.;
class Type Invoice / preloadfmt;
var Invoice;
output out=summaryTbl2(drop=_type_) sum= mean= /autoname;
run;
Thanks,
Jag
Ronein
Meteorite | Level 14

Thanks .

I want to ask a few questions please:

1-By general I  knew that when we use same variable in class statement and in VAR statement (column invoice in our example) then we need to create another variable.

for example:

data cars;

set sashelp.cars;

invoice_=invoice;

run;

 

I saw in your solution that you use same variable in class statement  and VAR  statement and it is working very well. I want to ask how is it working?

 

Should you add any option to proc format when using completetypes+preloadfmt?

How the options completetypes+preloadfmt helping to create the table correctly?

proc format;
value  InvoiceFmt
0<-20000='0-20k'
20000<-30000='20-30k'
30000<-50000='30-50k'
50000<-high='50k+'
;
run;
/*Results are good*/
proc summary data=sashelp.cars  nway missing completetypes;
format Invoice InvoiceFmt.;
class Type Invoice / preloadfmt;
var Invoice;
output out=summaryTbl2(drop=_type_) sum= mean= /autoname;
run;
/*Results are not good: Invoice_sum and invoice_mean are not calculated!*/ proc summary data=sashelp.cars nway missing; format Invoice InvoiceFmt.; class Type Invoice ; var Invoice; output out=summaryTbl2(drop=_type_) sum= mean= /autoname; run;

 

Jagadishkatam
Amethyst | Level 16
When we create table reports, most of the time we encounter a missing category or a missing treatment in our data.To handle missing categories or groups efficiently, we can use format and means with options completetypes and preloadfmt. Using these options help us to display missing values with zeros and display a missing treatment or group in the table output.
Thanks,
Jag
PaigeMiller
Diamond | Level 26

@Jagadishkatam wrote:
When we create table reports, most of the time we encounter a missing category or a missing treatment in our data.To handle missing categories or groups efficiently, we can use format and means with options completetypes and preloadfmt. Using these options help us to display missing values with zeros and display a missing treatment or group in the table output.

I don't understand this. The question was not about missing values or missing categories, the question was about the order of the categories in the output. Are you saying that completetypes and preloadfmt affect the order of the categories?


When I run this code, without completetypes and preloadfmt, the categories are still sorted in the desired order, but the variables invoice_sum and invoice_mean are formatted, which I don't think is desired. Whereas the solution I provided above doesn't have this drawback of having the invoice_sum and invoice_mean formatted.

--
Paige Miller
Ronein
Meteorite | Level 14

Is there another way or just these 3 ways that I show here?

Is it possible to use multilabel in proc format in order to get the desired order of rows in summary table?

In Way3, Is it essential to use proc sort ?Can you please explain why?

Can you please explain how proc summary sort the rows by default?

why adding  the format  format oorder Invoice4Fmt. helps to sort the rows?

 

 

/*Way1-Define the format in a way that will put the rows in order*/
proc format;
value Invoice1Fmt
0 <- 20000='(a) 0-20,000'
20000 <- 30000='(b) 20,000-30,000'
30000 <- 50000='(c) 30,000-50,000'
50000 <- high='(d) 50,000+';
Run;
Data cars;
set sashelp.cars;
Invoice_=put(Invoice,Invoice1Fmt.);
run;
proc summary data=cars  nway missing ;
class Type Invoice_ ;
var Invoice;
output out=summaryTbl(drop=_type_) sum= mean= /autoname;
run;

/*Way2-padding blanks before it*/
proc format;
value Invoice2Fmt
0 <- 20000=    '   0-20,000'
20000 <- 30000='  20,000-30,000'
30000 <- 50000=' 30,000-50,000'
50000 <- high= '50,000+';
Run;
Data cars;
set sashelp.cars;
Invoice_=put(Invoice,Invoice2Fmt.);
run;
proc summary data=cars  nway missing ;
class Type Invoice_ ;
var Invoice;
output out=summaryTbl(drop=_type_) sum= mean= /autoname;
run;

/*Way3*/
proc format;
value Invoice3Fmt
0 <- 20000='1'
20000 <- 30000='2'
30000 <- 50000='3'
50000 <- high='4';
value Invoice4Fmt
1='0-20,000'
2='20,000-30,000'
3='30,000-50,000'
4='50,000+';
run;
Data cars;
set sashelp.cars;
oorder =input(put(Invoice,Invoice3Fmt.),2.);
run;
/*proc sort data=cars; by oorder; run;*/
proc summary data=cars  nway missing ;
class Type oorder ;
format oorder Invoice4Fmt.;
var Invoice;
output out=summaryTbl(drop=_type_) sum= mean= /autoname;
run;

 
PaigeMiller
Diamond | Level 26

@Ronein wrote:

Hello

In the following report that is created by proc summary I want to control the order of the categories (levels) of  invoice   .

I want that the order will be same as mentioned in format $InvoiceorderFmt (in right side)

what is the way to do it please?

 

proc format;
value InvoiceFmt
0 <- 1000='0-1000'
1000 <- 5000='1,000-5,000'
5000 <- 10000='5,000-1,0000'
10000 <- 20000='1,0000-2,0000'
20000 <- 50000='20,000-50,000'
50000 <- 100000='50,000-100,000'
100000 <- high='100,000+';
value $InvoiceorderFmt
'0-1000'='1'
'1,000-5,000'='2'
'5,000-1,0000'='3'
'1,0000-2,0000'='4'
'20,000-50,000'='5'
'50,000-100,000'='6'
'100,000+'='7'
;
run;

Data cars;
set sashelp.cars;
Invoice_=put(Invoice,InvoiceFmt.);
run;

proc summary data=cars  nway missing ;
class Type Invoice_ ;
var Invoice;
output out=summaryTbl(drop=_type_) sum= mean= /autoname;
run;

 

 


If you create a character variable INVOICE_, then SAS will use alphabetical order to order the levels. That's how SAS treats character variables, it will always put them in alphabetical order. Since the alphabetical order puts anything starting with a 0 first, followed by all levels starting with 1s next, you get an undesired ordering, one that doesn't make sense for numeric values such as Invoice amount.

 

If you create a NUMERIC variable INVOICE_ (and format it), then SAS keeps the levels in NUMERIC order. That's how SAS treats numeric variables, it will always put them in numerical order (regardless of formatting). Then the first level is 0-1000 and the next level is 1000-5000 and the next level is 5000-10000 and so on.

 

So this is a case where you want numeric levels to be in a numeric variable. In fact, its hard to think of a case where you would want numeric 1000 to be converted to character 1000. So don't do it, leave your numeric variables as numerical variables (with whatever formatting you want), and then you won't have to jump through hoops to force character variables to sort as if they were numeric.

 

Data cars;
set sashelp.cars;
Invoice_=Invoice; /* Invoice_ is a numeric variable */
format invoice_ invoicefmt.;
run;

proc summary data=cars  nway missing ;
class Type Invoice_ ;
var Invoice;
output out=summaryTbl(drop=_type_) sum= mean= /autoname;
run;

 

--
Paige Miller
ballardw
Super User

While @PaigeMiller's solution is likely best the bit about the character values and sort order does offer another solution. Define your format so that the character sort order works as desired. Spaces will sort before numerals, so introducing enough spaces in the right places will change the sort order of the formatted values.

proc format;
value InvoiceFmt
     0 <- 1000   ='  0-1000'
  1000 <- 5000   ='  1,000-5,000'
  5000 <- 10000  ='  5,000-10,000'
 10000 <- 20000  =' 10,000-20,000'
 20000 <- 50000  =' 20,000-50,000'
 50000 <- 100000 =' 50,000-100,000'
100000 <- high   ='100,000+';
run;

Note that your as posted definition for the 5000 to 10000 range didn't really make sense with the comma placements, which will affect sort order also, so I fixed that above.

And formatting code helps as well.

 

@PaigeMiller's solution is more flexible as with the two numeric variables you can create new groups with a changed format. Adding a character variable means that you would have to recreate or add an additional character variable if you want to see what the result changes might be with a different grouping.

Also, the data step does not understand a multilabel format. So adding a character value using multilabel format would only have the highest level of the format.

 

PaigeMiller
Diamond | Level 26

Your format with proper use of spaces before the numeric characters is a good example of what I called "jumping through hoops to make a character variable sort as numeric". Although it does work.

 

But anyway, I would not encourage this use of formatting to create character variables, so that they can sort as numeric; it seems like unnecessary effort to me.

--
Paige Miller
ballardw
Super User

@PaigeMiller wrote:

Your format with proper use of spaces before the numeric characters is a good example of what I called "jumping through hoops to make a character variable sort as numeric". Although it does work.

 

But anyway, I would not encourage this use of formatting to create character variables, so that they can sort as numeric; it seems like unnecessary effort to me.


Agree. I did note that the character variable is not the best approach.

I have found that sometimes care with the format definitions can alleviate a lot of other hoops for getting desired appearance order. The restrictions on using PRELOADFMT, in the places that is allowed, sometimes get other undesired appearance. The format definition also can address things with character values as well where the options like Order=internal that may work with numeric values would not help. And unless you use a style override for ASIS=on those leading spaces tend to disappear in the output most of the time.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1121 views
  • 2 likes
  • 4 in conversation