Reeza
Super User

You joined on the wrong fields initially. When you merge, you need to know which fields uniquely identify a record. Your original key did not, which is why you ended up with a many-to-many merge. With the new key, date, each record is uniquely identified.

 

I suggest taking a smaller subset, merging it and looking at some records manually to understand how this happened. 
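As a rough sketch of that kind of manual check, the snippet below counts how often each key value occurs in a small subset. It is written in Python rather than SAS purely for illustration; the rows, the field layout `(id, date)`, and the helper `duplicate_keys` are made up and do not come from the original data.

```python
from collections import Counter

# Hypothetical subset of rows as (id, date) pairs; the values are invented.
rows = [
    ("1001", "2019-02-01"),
    ("1001", "2019-02-02"),  # same id appears again on another date
    ("1002", "2019-02-01"),
]

def duplicate_keys(records, key_positions):
    """Return the key values that occur on more than one record."""
    counts = Counter(tuple(r[i] for i in key_positions) for r in records)
    return {key: n for key, n in counts.items() if n > 1}

# A key is only safe to merge on if no value is repeated.
dups_id_only = duplicate_keys(rows, [0])
dups_id_and_date = duplicate_keys(rows, [0, 1])

print(dups_id_only)       # repeated values when the key is id alone
print(dups_id_and_date)   # an empty dict means the key is unique in this subset
```

If the first result is non-empty in both data sets, merging on that key alone is a many-to-many merge.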

 

All the explaining we do won't be as effective as you figuring out how to test it and look at it. 

 

It helps to understand the problem in context before you program it, and of course "know thy data" is the golden rule of analysis.

Satish_Parida
Lapis Lazuli | Level 10

Something from the banking domain:
1. Policies are renewed all the time, and in the database we create a new record with a new start date for the renewed policy number.
2. In the database we treat a policy as a service, so each record usually has a service serno assigned, which is usually the primary key.
3. If you do not know the primary key, you can use the policy number, start date/end date, and policy type together for an accurate result.
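The composite key from point 3 could be sketched like this. It is Python for illustration only, and every name here (`policy_no`, `start_date`, `policy_type`, `premium`, `claim`) and every value is hypothetical rather than taken from a real schema.

```python
# Hypothetical policy records; a renewal reuses the policy number
# but gets a new start date.
policies = [
    {"policy_no": "P001", "start_date": "2018-01-01", "policy_type": "HOME", "premium": 100},
    {"policy_no": "P001", "start_date": "2019-01-01", "policy_type": "HOME", "premium": 110},
    {"policy_no": "P002", "start_date": "2018-06-01", "policy_type": "AUTO", "premium": 80},
]
claims = [
    {"policy_no": "P001", "start_date": "2019-01-01", "policy_type": "HOME", "claim": 500},
]

KEY = ("policy_no", "start_date", "policy_type")

def composite_key(record):
    """Build the composite key as a tuple of the KEY fields."""
    return tuple(record[field] for field in KEY)

# Index policies by the composite key and fail loudly if it is not unique.
index = {}
for policy in policies:
    key = composite_key(policy)
    if key in index:
        raise ValueError(f"duplicate key: {key}")
    index[key] = policy

# Each claim now matches at most one policy record.
matched = [{**index[composite_key(c)], **c} for c in claims if composite_key(c) in index]
print(matched)
```

Matching on `policy_no` alone would have paired the claim with both the original and the renewed policy.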

Reeza
Super User

You start with:

 

FEB_NBRS_WITHOUT_DT -> 5,918,065

SOURCE_33_COUNT -> 5,902,253

 

Your assumption: 5,918,065 - 5,902,253 = 15,812 records that are not in the first data set.

 

The total number of records remains 5,918,065. Of those:

  • 5,905,885 are in both data sets
  • 12,180 records are not in the first data set
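The counts above can be checked with a few lines of arithmetic. The variable names are mine. The last figure, and the reading that it points to keys matching more than one row, is my inference from the numbers and is not confirmed anywhere in the thread.

```python
feb_nbrs_without_dt = 5_918_065  # rows in FEB_NBRS_WITHOUT_DT
source_33_count = 5_902_253      # rows in SOURCE_33_COUNT
in_both = 5_905_885              # merged rows found in both data sets
not_in_first = 12_180            # merged rows not in the first data set

# The assumed number of unmatched records.
assumed_unmatched = feb_nbrs_without_dt - source_33_count

# The two reported groups add back up to the total.
total_after_merge = in_both + not_in_first

# Matched rows beyond the row count of SOURCE_33_COUNT (inferred, see above).
extra_matches = in_both - source_33_count

print(assumed_unmatched, total_after_merge, extra_matches)
# prints: 15812 5918065 3632
```

The gap between the assumed 15,812 and the observed 12,180 is the same 3,632, which is what you would expect if some key values are matched more than once.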

 

I suggest you post some sample data (smaller, fake data) so we can illustrate how this can happen. Basically, when you have a many-to-many merge, the SAS data step merge doesn't combine the records properly, and you need to use a SQL join instead to fix this.
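Until real sample data is posted, here is a small fake example of the difference, sketched in Python rather than SAS. `sequential_merge` is a simplified emulation I wrote of how a match-merge pairs rows by position within each key group, carrying the last row of the shorter side forward. It is not SAS itself and ignores keys found in only one input. The SQL join runs in an in-memory SQLite database, and all table and column names are made up.

```python
import sqlite3

# Fake rows: both sides repeat key "A", so this is a many-to-many match.
left = [("A", "l1"), ("A", "l2")]
right = [("A", "r1"), ("A", "r2"), ("A", "r3")]

def group(rows):
    """Collect values per key, preserving input order."""
    groups = {}
    for key, value in rows:
        groups.setdefault(key, []).append(value)
    return groups

def sequential_merge(left_rows, right_rows):
    """Pair rows by position within each shared key (simplified emulation)."""
    lg, rg = group(left_rows), group(right_rows)
    out = []
    for key in sorted(lg.keys() & rg.keys()):
        l, r = lg[key], rg[key]
        for i in range(max(len(l), len(r))):
            # The shorter side keeps repeating its last row.
            out.append((key, l[min(i, len(l) - 1)], r[min(i, len(r) - 1)]))
    return out

merged = sequential_merge(left, right)

# The SQL join returns every combination of matching rows instead.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE lhs (k TEXT, lv TEXT)")
con.execute("CREATE TABLE rhs (k TEXT, rv TEXT)")
con.executemany("INSERT INTO lhs VALUES (?, ?)", left)
con.executemany("INSERT INTO rhs VALUES (?, ?)", right)
sql_rows = con.execute(
    "SELECT lhs.k, lhs.lv, rhs.rv FROM lhs JOIN rhs ON lhs.k = rhs.k "
    "ORDER BY lhs.lv, rhs.rv"
).fetchall()
con.close()

print(len(merged), merged)      # 3 rows: one per position in the longer group
print(len(sql_rows), sql_rows)  # 6 rows: 2 x 3 combinations
```

With 2 and 3 rows for the same key, the positional pairing yields 3 rows while the join yields 6, so the two approaches produce different record counts on the same inputs.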

 

 

Babloo
Rhodochrosite | Level 12
5905885 is less than the number of records in source_33_count dataset. How is it possible?
Reeza
Super User

Can you please add some context to that statement? Pretend I can't see your computer, data, or code and have no idea what you're talking about.

 


@Babloo wrote:
5905885 is less than the number of records in source_33_count dataset. How is it possible?

 
