Hi All,
I want to avoid sorting before a merge because the tables below are already in sorted order. I think an index might help, but I am not sure.
Or, if the tables are already in sorted order, is there a way to merge them without running the sort step first?
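/* Sort both inputs by the merge key */
proc sort data=a;
  by id;
run;
proc sort data=b;
  by id;
run;
/* Match-merge on id; note that the result overwrites WORK.B */
data b;
  merge a(in=a) b(in=b);
  by id;
  if a=b;  /* keeps only ids present in both tables (same as: if a and b) */
run;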
Thanks,
SS
@sathya66 wrote:
but it is showing an error.
ERROR: BY variables are not properly sorted on data set WORK.A.
Then your dataset is NOT sorted.
If your data sets are already in order, you don't need to run PROC SORT. Just proceed directly to the DATA step with MERGE.
The BY statement requires the observations to be in order of the BY variables. It doesn't matter how the data set came to be in that order, so no sorting is needed if it is already in order.
If you are getting that error message, it means the observations are not in order. Of course you need to run PROC SORT when the observations are not in order. I thought you were asking if you could skip the PROC SORT when the data set was already in order.
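As a minimal sketch of the "already in order" case, assuming WORK.A and WORK.B really are in ascending order of id: the merge can run with no PROC SORT step at all. The output name `both` and the flag names `in_a` and `in_b` are hypothetical, chosen here so that neither input is overwritten.

```sas
/* Assumes WORK.A and WORK.B are already in ascending order of id. */
/* No PROC SORT step is run beforehand.                            */
data both;                        /* hypothetical output name */
  merge a(in=in_a) b(in=in_b);
  by id;
  if in_a and in_b;               /* keep only ids found in both tables */
run;
```

If either table turns out not to be in order, this step stops with the "BY variables are not properly sorted" error quoted above, so the log tells you whether the assumption held.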
Maxim 2: Read the log. If the tables are already sorted as you need them, this will show in the log. Then no additional sorting needs to be done.
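Besides the log, one way to see what SAS has recorded about a table's sort order is PROC CONTENTS, shown here as a sketch for WORK.A. Its output includes a "Sorted" attribute and, when the table was produced by PROC SORT, a section describing the sort. Keep in mind that this flag only reflects what SAS recorded: a table can be physically in order without the flag being set.

```sas
/* Show dataset attributes for WORK.A, including the "Sorted" flag */
proc contents data=work.a;
run;
```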
Indexes only improve performance if they can be used to select small subsets of data; in whole-dataset joins like yours they usually worsen overall performance.
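For completeness, a sketch of what the index route would look like, assuming a simple index on id is wanted for both tables. With such an index in place, a DATA step BY statement can read the observations in id order without a prior sort, but for a whole-table merge like this one it is usually slower than sorting, as noted above. The output name `both` is hypothetical.

```sas
/* Create a simple index on id for each table (sketch only) */
proc datasets library=work nolist;
  modify a;
  index create id;
  modify b;
  index create id;
quit;

/* BY processing can now use the indexes instead of physical order */
data both;                        /* hypothetical output name */
  merge a(in=in_a) b(in=in_b);
  by id;
  if in_a and in_b;
run;
```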
Adding the PRESORTED option to PROC SORT makes the procedure check whether the dataset is already in the requested order before sorting; if it is, the sort itself is skipped.
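A short sketch of that option applied to the two tables from the question. The check still reads each table once, so it is not free, but it avoids the sort itself when the data is already in order; the log reports what PROC SORT actually did.

```sas
/* PRESORTED: verify the order first, sort only if the check fails */
proc sort data=a presorted;
  by id;
run;

proc sort data=b presorted;
  by id;
run;
```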