Solved: Re: HOW TO SEE DUPLICATES BY TWO VARIABLES IN A DATASET

d6k5d3 · Posted 12-10-2018 08:25 PM

I know if I code like:

proc sort data=zzz nodupkey;
  by aa cc;
run;

it will remove duplicate entries that have identical values in aa and cc. Instead of just dropping them, how can I see the duplicates that got removed?

Much thanks.

Regards.

learsaas · Posted 12-10-2018 08:29 PM

dupout=

d6k5d3 · Posted 12-10-2018 08:42 PM

Sorry, I do not get how to use this. Where do I plug it into my code above?

r_behata · Posted 12-10-2018 08:49 PM

/* xyz receives the observations that NODUPKEY removes */
proc sort data=zzz nodupkey dupout=xyz;
  by aa cc;
run;
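
To actually look at what was removed, the DUPOUT= data set can be printed after the sort. The following is only a sketch: zzz and xyz are the names used above, while zzz_nodup is a hypothetical output name added here so that the original zzz is not overwritten by the de-duplicated result.

```sas
/* Sketch: keep one row per aa/cc key and capture the rest.   */
/* zzz_nodup is a hypothetical name; without OUT=, PROC SORT  */
/* replaces zzz itself with the de-duplicated data.           */
proc sort data=zzz out=zzz_nodup nodupkey dupout=xyz;
  by aa cc;
run;

/* xyz holds only the observations that NODUPKEY dropped,     */
/* not the first observation that was kept for each key.      */
proc print data=xyz;
run;
```

Note that xyz will not contain the one observation per key that stays in the sorted output, so a key with three matching rows contributes two rows to xyz.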

SASKiwi · Posted 12-10-2018 08:52 PM

Here is a link to the documentation for DUPOUT: https://documentation.sas.com/?docsetId=proc&docsetTarget=p02bhn81rn4u64n1b6l00ftdnxge.htm&docsetVer...

novinosrin · Posted 12-10-2018 09:20 PM

Hi @d6k5d3, while DUPOUT= and similar options offer pretty straightforward solutions, I'd recommend building an intuitive understanding of BY-group processing in SAS, SQL, PROC FREQ, and counting techniques in general. A grasp of those will make you comfortable. The idea is to differentiate unique records, unique keys, etc. The more you dig, the more you will venture into other concepts such as indexes and beyond. In other words, one understanding leads to the next. Have fun!
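
As one possible illustration of the BY-group and SQL counting ideas mentioned above, the sketch below lists every observation whose aa/cc key occurs more than once, including the copy that NODUPKEY would have kept. The data set names zzz_sorted and all_dups and the column name n_obs are hypothetical, chosen only for this example.

```sas
/* Sketch: sort a copy so BY-group processing can be used */
proc sort data=zzz out=zzz_sorted;
  by aa cc;
run;

data all_dups;
  set zzz_sorted;
  by aa cc;
  /* an observation that is both first and last in its aa/cc */
  /* group has a unique key, so keep only the others         */
  if not (first.cc and last.cc);
run;

/* Sketch: count observations per key and show repeated keys */
proc sql;
  select aa, cc, count(*) as n_obs
  from zzz
  group by aa, cc
  having count(*) > 1;
quit;
```

The DATA step returns the duplicate rows themselves, while the PROC SQL query returns one summary row per repeated key.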
