Do we always have to sort by _ALL_ to use NODUPRECS?
@veda8 wrote:
Do we always have to sort by _ALL_ to use NODUPRECS?
You don't "have" to; SAS will happily let you use any subset of the variables in the BY statement.
But if you want the result to eliminate all duplicate records, you do.
The reason is that the NODUP check (or, as it has been renamed, the NODUPRECS check) only compares adjacent records: each observation is compared with the previous one written to the output. So if you sort by only a subset of the variables, two records that are exactly the same can both be output. All it takes is at least one observation between them that has the same BY values but differs on some non-key (non-BY) variable.
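No. You can sort by any variables, but then NODUPRECS may not remove every duplicate record. In this example, sorting by ID alone leaves two copies of the record 1 10 in the output, because 1 20 sits between them: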
data have;
input ID var;
datalines;
1 10
1 20
1 10
3 50
3 50
3 50
2 30
2 30
2 40
;
/* Sort by ID only; duplicates are detected only between adjacent records */
proc sort data=have noduprec;
by ID;
run;
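To compare the two approaches side by side, here is a minimal sketch that runs both sorts on the HAVE data set above. The output data set names by_id and by_all are placeholders introduced for illustration, and the sketch assumes default PROC SORT behaviour (the EQUALS option in effect, so observations keep their relative order within a BY group).

```sas
/* Sort by ID only: within ID=1 the order stays 10, 20, 10, so the   */
/* second "1 10" is not adjacent to the first and is kept.           */
proc sort data=have out=by_id noduprecs;
  by ID;
run;

/* Sort by every variable: identical records always end up adjacent, */
/* so NODUPRECS can remove all of them.                              */
proc sort data=have out=by_all noduprecs;
  by _all_;
run;

/* Print both results for comparison */
title "Sorted BY ID";
proc print data=by_id;
run;

title "Sorted BY _ALL_";
proc print data=by_all;
run;
title;
```

With this data, by_id should keep 6 observations (1 10 appears twice) and by_all should keep 5, one per distinct record.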
When I use NODUPKEY and give two variables in the BY statement,
e.g.:
by id var;
which variable(s) are considered the key?
@veda8 wrote:
When I use NODUPKEY and give two variables in the BY statement,
e.g.:
by id var;
which variable(s) are considered the key?
Both.
The "key" is whatever is listed on the BY statement, so here it is the combination of id and var.
NODUPKEY
checks for and eliminates observations with duplicate BY values. If you specify this option, then PROC SORT compares all BY values for each observation to the ones for the previous observation that is written to the output data set. If an exact match is found, then the observation is not written to the output data set.
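As a rough illustration of how the key follows the BY statement, the sketch below applies NODUPKEY to the HAVE data set from earlier in the thread, once with two BY variables and once with one. The data set names key_id_var, key_id, and dropped are placeholders chosen for this example; DUPOUT= is used only to capture the discarded observations for inspection.

```sas
/* Key = the combination (id, var): one observation per distinct pair. */
/* Observations removed as duplicates are written to DROPPED.          */
proc sort data=have out=key_id_var dupout=dropped nodupkey;
  by id var;
run;

/* Key = id alone: only the first observation for each id is kept. */
proc sort data=have out=key_id nodupkey;
  by id;
run;
```

With the sample data, key_id_var should hold 5 observations (and dropped 4), while key_id should hold 3, one per ID.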