mitch
Calcite | Level 5
Hi. I'm trying to compare two data sets. The common field I'm using for the comparison is the ID field. I'd like to be able to identify which IDs are duplicates. I think I could possibly use PROC SORT with NODUPKEY, but that would delete the observations instead of identifying them.
I've used PROC COMPARE, but it only seems to compare the variables, not the observations. Any ideas? Here's my compare code:

proc compare base = work.A compare = work.B;
id IDCODE;
run;
2 REPLIES
sbb
Lapis Lazuli | Level 10
PROC SORT has a DUPOUT= option (used together with NODUPKEY), so you can redirect the duplicate observations to a separate data set instead of losing them. The other option, depending on your needs, is to use a DATA step with a BY statement and an IF statement that tests FIRST.ID and LAST.ID (that is, the FIRST. and LAST. flags for your BY variable) to perform whatever processing logic you need.
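As a rough sketch of both suggestions, assuming the ID variable is IDCODE as in the question and that duplicates are being looked for within work.A: the output data set names (A_unique, A_dups, A_sorted, A_dup_groups) are made up for illustration. Note that DUPOUT= receives only the second and later observations for each key, while the DATA step version keeps every observation in a group that has more than one row.

```sas
/* Option 1: PROC SORT with NODUPKEY and DUPOUT=.                     */
/* work.A_unique keeps the first observation for each IDCODE;         */
/* work.A_dups receives the observations NODUPKEY would have dropped. */
/* Both output names are hypothetical.                                */
proc sort data=work.A out=work.A_unique nodupkey dupout=work.A_dups;
  by IDCODE;
run;

/* Option 2: DATA step with BY-group processing.                      */
/* The input must be sorted by IDCODE first.                          */
proc sort data=work.A out=work.A_sorted;
  by IDCODE;
run;

data work.A_dup_groups;
  set work.A_sorted;
  by IDCODE;
  /* An IDCODE that occurs exactly once is both first and last in its */
  /* BY group, so this keeps only rows whose IDCODE occurs 2+ times.  */
  if not (first.IDCODE and last.IDCODE);
run;
```

If the goal is instead to find IDs shared between work.A and work.B, the same ideas apply after stacking or merging the two data sets by IDCODE.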

Scott Barry
SBBWorks, Inc.
mitch
Calcite | Level 5
Thanks a lot! I ended up switching to PROC SQL, joining the two data sets, and then using ODS to output the dups. I'm playing with the PROC SORT DUPOUT= option so I know how to use it in the future.

I appreciate your suggestions.
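The actual query was not posted, so the following is only a guess at what a PROC SQL approach might look like, assuming "dups" means IDCODE values that appear in both work.A and work.B. The table name work.common_ids is hypothetical.

```sas
/* List each IDCODE that appears in both data sets.                   */
/* work.common_ids is a hypothetical output table name.               */
proc sql;
  create table work.common_ids as
  select distinct a.IDCODE
  from work.A as a
       inner join work.B as b
       on a.IDCODE = b.IDCODE;
quit;

/* Print the matches; the listing goes to whichever ODS destination   */
/* is currently open.                                                 */
proc print data=work.common_ids noobs;
run;
```

SELECT DISTINCT is used here so that an ID repeated within either data set is still listed only once.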
