Hi there,
You can use the NODUPKEY option in PROC SORT with BY _ALL_ to remove observations that are duplicated across every variable. Adding the DUPOUT= option captures the removed observations in a separate data set, so you can report the before and after. Because OUT= writes the result to a new data set, the following method does not overwrite the original.
PROC SORT
DATA = *original dataset*
NODUPKEY
OUT = *new dataset with duplicates removed*
DUPOUT = *new dataset holding the removed observations*;
BY _ALL_;
RUN;
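If it helps to see it with real names, here is a small sketch of the same step. The data set names (work.have, work.nodups, work.dups) and the sample rows are made up for illustration, so swap in your own. The PROC SQL step at the end is just one possible way to get the before and after counts.

```sas
/* Hypothetical sample data: the first and third rows are exact duplicates */
data work.have;
    input id name $ score;
    datalines;
1 Ann 90
2 Bob 85
1 Ann 90
3 Cat 70
;
run;

/* Keep one copy of each fully identical row in work.nodups
   and send the extra copies to work.dups; work.have is left unchanged */
proc sort data=work.have
          nodupkey
          out=work.nodups
          dupout=work.dups;
    by _all_;
run;

/* Row counts for the before-and-after report */
proc sql;
    select count(*) as n_original from work.have;
    select count(*) as n_kept     from work.nodups;
    select count(*) as n_removed  from work.dups;
quit;
```

With this sample data the counts should be 4 original, 3 kept and 1 removed. The kept and removed counts should always add up to the original count, which is a quick check that nothing was lost.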
I hope this helps!
Mady