mitch
Calcite | Level 5
Hi. I'm trying to compare two data sets. The common field I'm using for the comparison is the ID field, and I'd like to identify which IDs are duplicates. I think I could use PROC SORT with NODUPKEY, but that would delete the duplicate observations instead of identifying them.
I've used PROC COMPARE, but it only seems to compare the variables, not the observations. Any ideas? Here's my compare code:

proc compare base = work.A compare = work.B;
id IDCODE;
run;
sbb
Lapis Lazuli | Level 10
PROC SORT has a DUPOUT= option, used together with NODUPKEY (or NODUPRECS), that redirects the duplicate observations to a separate data set instead of discarding them. The other option, depending on your needs, is a DATA step with a BY statement on sorted input, where an IF statement tests FIRST.ID and LAST.ID (FIRST.IDCODE and LAST.IDCODE for your variable) to perform whatever processing logic you need.
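
A minimal sketch of both suggestions, assuming the two data sets are first stacked so that an IDCODE present in both shows up as a repeated key. The data set names work.both, work.unique_ids, work.dup_ids, work.both_sorted, and work.flagged, and the variables source and is_dup, are made up for illustration. It also assumes neither input already has a variable named source.

```sas
/* Stack the two data sets and record which one each row came from. */
data work.both;
    set work.A(in=inA) work.B;
    length source $1;
    if inA then source = 'A';
    else source = 'B';
run;

/* Option 1: NODUPKEY keeps the first row per IDCODE in OUT=;      */
/* DUPOUT= receives every later row that repeats an IDCODE.        */
proc sort data=work.both out=work.unique_ids nodupkey dupout=work.dup_ids;
    by IDCODE;
run;

/* Option 2: keep every row and flag the duplicates instead.       */
proc sort data=work.both out=work.both_sorted;
    by IDCODE;
run;

data work.flagged;
    set work.both_sorted;
    by IDCODE;
    /* A row that is both first and last in its BY group is the    */
    /* only row with that IDCODE, so anything else is a duplicate. */
    is_dup = not (first.IDCODE and last.IDCODE);
run;
```

Note that an IDCODE repeated within work.A alone (or work.B alone) would also land in work.dup_ids and be flagged in work.flagged. The source variable lets you tell those cases apart from IDs shared by both data sets.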

Scott Barry
SBBWorks, Inc.
mitch
Calcite | Level 5
Thanks a lot! I ended up switching to PROC SQL, joining the two data sets, and then using ODS to output the duplicates. I'm experimenting with the DUPOUT= option of PROC SORT so I know how to use it in the future.
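
The exact query isn't shown here, but a join along these lines is one plausible reading of that approach. This is a sketch only; the output table name work.common_ids is hypothetical, and it assumes "duplicates" means IDCODE values present in both data sets.

```sas
/* Inner join keeps only the IDCODE values found in both data sets. */
/* DISTINCT collapses repeats so each shared ID is listed once.     */
proc sql;
    create table work.common_ids as
    select distinct a.IDCODE
    from work.A as a
         inner join work.B as b
         on a.IDCODE = b.IDCODE;
quit;

/* Print the shared IDs; the listing goes to whichever ODS          */
/* destination is currently open.                                   */
proc print data=work.common_ids;
run;
```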

I appreciate your suggestions.
