Hi All,
I want to avoid sorting before a merge because the tables below are already in sorted order. I think an index might help, but I am not sure.
Or, if the tables are already in sorted order, is there a way to merge them without a sort step?
proc sort data=a ;
by id;
run;
proc sort data=b;
by id;
run;
data b;
merge a(in=a) b(in=b);
by id;
if a=b ;
run;
Thanks,
SS
@sathya66 wrote:
but it is showing an error.
ERROR: BY variables are not properly sorted on data set WORK.A.
Then your dataset is NOT sorted.
If your data sets are already in order, you don't need to run PROC SORT. Just proceed directly to the DATA step with MERGE.
The BY statement requires the observations to be in order by the BY variables. It doesn't matter how the data set came to be in order; no sorting is required if it is already in order.
If you are getting that error message, it means the observations are not in order. Of course you need to run PROC SORT when the observations are not in order. I thought you were asking if you could skip the PROC SORT when the data set was already in order.
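As a minimal sketch of that point, the steps below merge two small data sets whose rows were entered in ascending id order, with no PROC SORT at all. The data set names a_demo, b_demo and matched, and the values in them, are made up for illustration.

```sas
/* Hypothetical sample data, entered in ascending ID order */
data a_demo;
  input id x;
  datalines;
1 10
2 20
3 30
;
run;

data b_demo;
  input id y;
  datalines;
2 200
3 300
4 400
;
run;

/* No PROC SORT: BY only needs the rows to be in order */
data matched;
  merge a_demo(in=ina) b_demo(in=inb);
  by id;
  if ina and inb;   /* keep only ids present in both inputs */
run;
```

Here matched should hold the rows for id 2 and 3. If the rows of either input were out of order, the same step would stop with the "BY variables are not properly sorted" error quoted above.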
Maxim 2: Read the log. If the tables are already sorted as you need them, this will show in the log. Then no additional sorting needs to be done.
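Besides the log, one way to see whether SAS has recorded a sort order for a table is PROC CONTENTS. This is a sketch only; the data set a_check and its values are hypothetical. Note that the sort flag is set when SAS itself knows the order (for example after PROC SORT), so a table can be physically in order without being flagged.

```sas
/* Hypothetical sample data, deliberately out of order */
data a_check;
  input id x;
  datalines;
3 30
1 10
2 20
;
run;

proc sort data=a_check;
  by id;
run;

/* The attribute listing reports whether a sort order is recorded */
proc contents data=a_check;
run;
```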
Indexes only improve performance if they can be used to select small subsets of data; in whole-dataset joins like yours they usually worsen overall performance.
Adding the PRESORTED option to PROC SORT prevents a data set from being sorted again if it is already in the requested order.
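A short sketch of that option, using a hypothetical data set a_sorted whose rows are already in id order:

```sas
/* Hypothetical sample data, already in ascending ID order */
data a_sorted;
  input id x;
  datalines;
1 10
2 20
3 30
;
run;

/* PRESORTED: check the order first, sort only if the check fails */
proc sort data=a_sorted presorted;
  by id;
run;
```

The order check still reads through the data once, so this saves the sort itself rather than all of the I/O.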