Hi All,
I want to avoid sorting when doing a merge because the tables below are already in sorted order. I think an index might help, but I am not sure.
Or, if the tables are already in sorted order, is there a way to merge them without sorting first?
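/* Sort both tables by the merge key */
proc sort data=a;
  by id;
run;
proc sort data=b;
  by id;
run;
/* Keep only ids found in both tables (note: this overwrites B) */
data b;
  merge a(in=a) b(in=b);
  by id;
  if a and b;
run;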
Thanks,
SS
@sathya66 wrote:
but it is showing an error.
ERROR: BY variables are not properly sorted on data set WORK.A.
Then your dataset is NOT sorted.
If your data sets are already in order, you don't need to run PROC SORT. Just proceed directly to the DATA step with MERGE.
The BY statement requires the data set to be in order, but it doesn't matter how the data set came to be in order. No sorting is required if the data set is already in order.
If you are getting that error message, it means the observations are not in order. Of course you need to run PROC SORT when the observations are not in order. I thought you were asking if you could skip the PROC SORT when the data set was already in order.
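As a rough sketch of that point: the two small tables below are made up for illustration (the values and the output name BOTH are hypothetical) and are entered already in ascending id order, so the MERGE should run without any PROC SORT.

```sas
/* Hypothetical sample data, entered in ascending id order */
data a;
  input id x;
  datalines;
1 10
2 20
3 30
;
run;

data b;
  input id y;
  datalines;
2 200
3 300
4 400
;
run;

/* No PROC SORT: the BY statement only needs the rows to be in id order */
data both;
  merge a(in=in_a) b(in=in_b);
  by id;
  if in_a and in_b;  /* keep only ids present in both tables */
run;
```

With this sample data, BOTH should contain ids 2 and 3. If either input were out of order, the same step would stop with the "BY variables are not properly sorted" error quoted above.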
Maxim 2: Read the log. If the tables are already sorted as you need them, this will show in the log. Then no additional sorting needs to be done.
Indexes only improve performance if they can be used to select small subsets of data; in whole-dataset joins like yours they usually worsen overall performance.
Adding the PRESORTED option to PROC SORT prevents a data set from being sorted again if it is already in the required order.
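A minimal sketch of that option, assuming the table A and key ID from the original question:

```sas
/* PRESORTED: PROC SORT first checks whether A is already in id order
   and skips the actual sort if it is */
proc sort data=a presorted;
  by id;
run;
```

The check still has to read through the data set, so this is not free, but it avoids the cost of a full sort when the order already holds. The log should tell you whether a sort was actually performed.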