@Ronein wrote: Sometimes there are problems in the data and it contains duplicates. Is it better to select all rows in the query that creates the SAS data set from the Teradata table (without using DISTINCT) and only then use PROC SORT with NODUPKEY?
That's what I would do. PROC SORT is usually the quickest way to remove duplicates in SAS, unless the data fits into memory, in which case a hash object is an option.
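A minimal sketch of both approaches, in case it helps. All names here are assumptions for illustration: the libref `tera` (taken to be already assigned to the Teradata database), the table `tera.transactions`, and the key column `customer_id` are hypothetical and need to be replaced with your own.

```sas
/* Hypothetical names: libref TERA, table TRANSACTIONS, key CUSTOMER_ID. */

/* Step 1: pull all rows from Teradata, without DISTINCT. */
proc sql;
    create table work.raw as
    select *
    from tera.transactions;
quit;

/* Step 2: keep one row per key value.                        */
/* DUPOUT= collects the discarded rows so they can be checked. */
proc sort data=work.raw out=work.dedup dupout=work.dups nodupkey;
    by customer_id;
run;

/* Alternative when the distinct keys fit into memory:        */
/* a hash object keeps the first row seen for each key        */
/* and does not sort the data.                                */
data work.dedup_hash;
    if _n_ = 1 then do;
        declare hash seen();
        seen.defineKey('customer_id');
        seen.defineDone();
    end;
    set work.raw;
    /* check() returns nonzero when the key is not yet stored */
    if seen.check() ne 0 then do;
        rc = seen.add();
        output;
    end;
    drop rc;
run;
```

Looking at `work.dups` after the sort shows which rows were dropped, which can help trace where the duplicates come from. Note that the hash version leaves the rows in their original order, while PROC SORT returns them ordered by the key.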