Sean_OConnor
Fluorite | Level 6

Folks,

 

I've merged two datasets and, for some reason, the result contains duplicates. I would like to know where these duplicates are, so I've used the following piece of code:

 

proc sort data=accs out=correct dupout=dups nodupkey;
  by idinternal customernumber idoccurrence;
run;

In the accs dataset I've 315,050 observations and in the correct dataset I've 312,326. However, the dups dataset is empty. 

 

So it would appear that I have roughly 2,700 duplicate observations, but they are nowhere to be seen.

 

Could someone shed some light on this issue, please?

2 REPLIES
thomp7050
Pyrite | Level 9

 

Try this to view the number of rows for each combination of your key variables:

 

PROC SQL;
CREATE TABLE TOTALOCCURRENCE AS
SELECT IDINTERNAL, CUSTOMERNUMBER, IDOCCURRENCE, COUNT(*) AS TOTAL
FROM ACCS
GROUP BY IDINTERNAL, CUSTOMERNUMBER, IDOCCURRENCE;
QUIT;

Then, after viewing, if you would like a dataset of distinct occurrences, you could write:

 

PROC SQL;
CREATE TABLE ALLDISTINCT AS
SELECT DISTINCT IDINTERNAL, CUSTOMERNUMBER, IDOCCURRENCE FROM ACCS;
QUIT;
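
If the counts table is too long to scan by eye, a HAVING clause can narrow the first query down to just the key combinations that repeat. This is only a sketch: DUPKEYS and EXTRA_ROWS are made-up names for this example, and it assumes the same three key variables as in the original PROC SORT.

```sas
/* Sketch: keep only key combinations that occur more than once in ACCS. */
/* DUPKEYS is a hypothetical table name chosen for this example.         */
proc sql;
  create table dupkeys as
  select idinternal, customernumber, idoccurrence,
         count(*) as total
  from accs
  group by idinternal, customernumber, idoccurrence
  having count(*) > 1
  order by total desc;

  /* Rows beyond the first one for each repeated key. */
  select sum(total - 1) as extra_rows
  from dupkeys;
quit;
```

If the duplicates really are on these three keys, EXTRA_ROWS should come out equal to the number of rows NODUPKEY dropped, i.e. 315,050 - 312,326 = 2,724.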
Astounding
PROC Star

It looks like NODUPKEY is kicking in and removing the duplicates before the DUPOUT= option can examine them. Try removing NODUPKEY and see if that resolves the problem.
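
Either way, the duplicate rows can also be pulled out without relying on DUPOUT= at all. A minimal sketch, assuming the same three BY variables as in the original post; ACCS_SORTED, FIRST_ROWS and EXTRA_ROWS are hypothetical dataset names, not anything from the original code.

```sas
/* Sort by the keys WITHOUT NODUPKEY so that every row is kept. */
/* ACCS_SORTED is a hypothetical name for the sorted copy.      */
proc sort data=accs out=accs_sorted;
  by idinternal customernumber idoccurrence;
run;

/* Split the sorted data into the first row per key and the rest. */
/* FIRST_ROWS and EXTRA_ROWS are hypothetical names.              */
data first_rows extra_rows;
  set accs_sorted;
  by idinternal customernumber idoccurrence;
  /* first.idoccurrence is 1 on the first row of each full key combination */
  if first.idoccurrence then output first_rows;
  else output extra_rows;
run;
```

FIRST_ROWS should then hold one row per key, much like the CORRECT dataset, and EXTRA_ROWS should hold the rows you were expecting to see in DUPS. Comparing the row counts in the log against 315,050 and 312,326 is a quick sanity check.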
