We’re smarter together. Learn from this collection of community knowledge and add your expertise.

Custom Sorting with Proc SQL

by Super Contributor on ‎07-05-2017 07:26 PM (3,090 Views)



In any survey of which SAS procedures are used the most it’s a pretty fair bet that Proc Sort and Proc SQL will be, if not at the top, then certainly very close to the top of the list. We simply want our data sorted whether it’s for further By Group processing, printing reports or on-screen presentation and normally Proc Sort or Proc SQL (through its Order By clause) will do that quite simply for us. However, there are occasions when a simple ascending or descending sort doesn’t quite fit what we need, particularly if we’re preparing data for printing or on-screen display. This article will present three ways of producing specific custom sorts which don’t follow the standard sort orders.


Custom Sorting with the Case Statement


For our first example, we’ll take a scenario where we want one or more groups of records to appear in a specific order with the remainder in a natural ascending order. We can achieve this by using a Case Statement inside our Order By clause in Proc SQL. A case statement works like an if-then-else statement and can be used to generate values which will then be used in the sort instead of the actual record values.


If, for example, if we are doing an analysis of the SASHELP.SHOES data set and look at the result of the following query we can see that there are ten distinct values in the region field.



proc sql;
	select distinct region
	from sashelp.shoes;


shoes distinct regions.png


If we want to group the two North American regions (United States and Canada) together and have United States first in the table with Canada second and others in their natural ascending order we cannot do it simply by an ascending or descending sort. Instead we can use proc SQL with an Order By clause and include a Case statement as follows


proc sql;
	create table sortshoes
	as select *
	from sashelp.shoes
	order by case region when "United States" then 0 when "Canada" then 1 else 2 end,


When Proc SQL carries out the sort and it encounters a record where the value of Region is “United States” that value will be treated as a zero, where the value is Canada that will be treated as a one and any other value will be treated as a two. A sort of these replacement values takes place giving the result below


shoes first sort A.png


shoes first sort B.png


Custom Sorting with Varying Secondary Sort Columns


Of course, it’s entirely possible to sort by more than one column by having primary and secondary sort columns specified in your Order By clause. It’s also possible, using a case statement, to vary the secondary column depending on the value found in the primary column.


This time we’ll assume we’re doing an analysis of the SASHELP.CARS data set and we want to sort by the type variable as the primary column and we want SUVs to be further sorted by mpg_city and any other type sorted by invoice. This time we can use the Case statement like so



proc sql;
	create table sortcars
	as select make, type, invoice, mpg_city
	from sashelp.cars
	order by type, case type when "SUV" then mpg_city else invoice end;


Our data set will then look like this - note that SUV is sorted before Sedan because of the way capitals are sorted and that Hybrids are sorted by Invoice and SUVs sorted by MPG_City


Cars Sort.png


Custom Sorting by a Substring Within a Column


Normally you want to sort by the full value of a column but just occasionally you might want to sort by a substring within the column. For this example, we’ll create a file containing the names of all British Prime Ministers from 1945 onwards



data prime_ministers1;
	length name $30;
	input name $;
	infile datalines;


You will see that there is only one column of names with surname first, then a comma and forename last. If I want to sort this by forename I can do so by using the scan function. You should note that because one of the names (Alec Douglas-Home) is double-barrelled I have had to use a modifier to ensure the name is split only on the comma and not the hyphen.



proc sql;
	create table prime_ministers2
	as select *
	from prime_ministers1
	order by scan(name,2,",");


After the Proc SQL runs the order is


Prime Ministers.png




As I mentioned in the introduction normally straight ascending or descending sorts will give you what you want but just occasionally you will come up against a scenario where they won’t. In these cases, the flexibility afforded by the Proc SQL Order By clause can come to the rescue and turn what seems like a complex problem into something quite straightforward.


If you have any comments or suggestions for other types of sort please leave them in the comments below, I’d love to hear from you.

by Contributor thesasuser
on ‎08-06-2017 10:55 PM

Excellent writeup

by Contributor tomrvincent
on ‎08-07-2017 08:29 AM

this fixed the sort bug.


order by lower(type)

Your turn
Sign In!

Want to write an article? Sign in with your profile.

Looking for the Ask the Expert series? Find it in its new home: communities.sas.com/askexpert.