If the dataset is large, this approach may be more efficient, because it reads the data only once:
data need / view=need;
  set have;
  new1 = col1;
  new2 = col2;
  call sortc(new1, new2);  /* order each pair so (a,b) and (b,a) become identical */
run;

proc sort data=need out=want nodupkey;
  by new1 new2;
run;
It takes advantage of the fact that need is a data set VIEW, not a data set FILE. A view is executed only when something later reads it (here, the proc sort), so its data is passed directly to proc sort without first being written to disk.
The result is one pass through the data, producing sorted output with all duplicate combinations of col1/col2 removed (not just duplicate permutations): a pair such as (a,b) and its reverse (b,a) are treated as the same combination.
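To try this out, here is a small sketch with made-up sample rows. It assumes col1 and col2 are character variables of the same length, since CALL SORTC works on character values; for numeric columns CALL SORTN would be the counterpart. The contents of have below are hypothetical and only there for illustration.

```sas
/* Hypothetical sample data: (a,b) appears three times, once reversed */
data have;
  input col1 $ col2 $;
  datalines;
a b
b a
a b
c d
;
run;

/* View: nothing is executed or written to disk at this point */
data need / view=need;
  set have;
  new1 = col1;
  new2 = col2;
  call sortc(new1, new2);  /* put each pair into ascending order */
run;

/* The view is executed here; its rows stream straight into the sort */
proc sort data=need out=want nodupkey;
  by new1 new2;
run;

proc print data=want;
run;
```

With these rows, want should end up with two observations, one for the a/b combination and one for c/d. The original col1 and col2 are carried along, so a drop= option on want can remove them if only the ordered pair is needed.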