kansas
Calcite | Level 5
Hi,
Is there a major difference in sorting by character variables vs. numeric variables?

Scenario:
1. A dataset with three numeric variables key1, key2, key3,
sorted using "by key1 key2 key3".

2. A dataset with one character variable, big_key, that concatenates the three keys (key1-key2-key3), e.g. "23-34-56",
sorted using "by big_key".

Is there a performance difference?
(The dataset contains around 1 million records.)

Thanks
ieva
Pyrite | Level 9
Not sure. If it is created just for this sort, it sounds like an unnecessary variable that just makes your big data set even bigger.
Ksharp
Super User
I prefer option 2 because it is faster.
Oops, I should point out that these two ways are different: they do not necessarily produce the same sort order.
Maybe you need to write some code to see how different they are.
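
As a rough illustration of why the two approaches can differ, here is a minimal sketch in Python (not SAS, and only about ordering, not speed). The sample rows and the helper `big_key` are made up for the example, and it assumes the keys are non-negative integers. Character values are compared character by character, so a naive "key1-key2-key3" string does not sort like the numbers do unless each part is zero-padded to a fixed width.

```python
# Hypothetical sample rows: (key1, key2, key3), non-negative integers assumed.
rows = [(9, 1, 1), (10, 2, 3), (23, 34, 56), (2, 100, 7)]


def big_key(row, width=0):
    """Join the three keys with '-', zero-padding each part to `width` digits."""
    return "-".join(str(k).zfill(width) for k in row)


# Sort by the three numeric keys, like "by key1 key2 key3".
by_numeric = sorted(rows)

# Sort by the naive concatenated string, like "by big_key".
by_naive_string = sorted(rows, key=big_key)

# Sort by a zero-padded concatenated string (width 5 is an arbitrary choice).
by_padded_string = sorted(rows, key=lambda r: big_key(r, width=5))

print("numeric order:      ", by_numeric)
print("naive string order: ", by_naive_string)
print("padded string order:", by_padded_string)
```

With the naive string, "10-2-3" sorts ahead of "9-1-1" because "1" comes before "9" as a character. Only the zero-padded version reproduces the numeric order, so the two BY statements are interchangeable only if big_key is built with fixed-width parts.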

Ksharp
kansas
Calcite | Level 5
OK, I shall rephrase my question.

I have a huge dataset with three key variables:
key1, key2, key3.
I need to sort by these three variables (naming all of them in the BY statement).

What's the fastest way to sort?
1. Use the three numeric variables in the BY statement.
2. Concatenate the three variables into a single character variable, then sort by that character variable.
sbb
Lapis Lazuli | Level 10
Honestly, why don't you simply run an experiment yourself and report the results back to the forum?

Take a subset of your "huge" input data file and do a few sorts with different techniques.

Set the FULLSTIMER system option to get detailed resource usage in the log, and also consider PROC SORT options like EQUALS/NOEQUALS for their effect on performance.

Scott Barry
SBBWorks, Inc.
kansas
Calcite | Level 5
Actually, I tried, and couldn't find any significant performance difference.
But there should be some theoretical basis from which we can conclude which approach is optimal.

Thanks
ballardw
Super User
The TAGSORT option might be something to investigate for improving performance of sorts on large sets.
