Reeza
Super User

You joined on the wrong fields initially. When you merge, you need to know which fields uniquely identify a record. Your original key did not, which is why you ended up with a many-to-many merge. With the new key, date, each record is now uniquely identified.
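One way to test whether a candidate key really is unique is to count how often each key value occurs. The sketch below is in Python rather than SAS, purely to keep it self-contained. The helper `duplicate_keys`, the field names `policy_no` and `date`, and the sample rows are all hypothetical, not taken from the original data.

```python
from collections import Counter

def duplicate_keys(rows, key_fields):
    """Return {key: count} for every key value that occurs more than once."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}

# Hypothetical sample: the same policy number appears on two dates.
policies = [
    {"policy_no": "P1", "date": "2019-01-01"},
    {"policy_no": "P1", "date": "2019-02-01"},
    {"policy_no": "P2", "date": "2019-01-01"},
]

# Policy number alone repeats, so it cannot serve as a merge key.
dups_by_policy = duplicate_keys(policies, ["policy_no"])

# Adding the date makes every key value occur exactly once.
dups_by_policy_and_date = duplicate_keys(policies, ["policy_no", "date"])

print(dups_by_policy)
print(dups_by_policy_and_date)
```

If the returned dictionary is empty for both data sets, the merge on that key is at worst one-to-many. If it is non-empty for both, you have a many-to-many situation.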

 

I suggest taking a smaller subset, merging it and looking at some records manually to understand how this happened. 

 

All the explaining we do won't be as effective as you figuring out how to test it and look at it. 

 

It helps to understand the problem in context before you program it. And of course, "know thy data" is the golden rule of analysis.

Satish_Parida
Lapis Lazuli | Level 10

Something from the banking domain:
1. Policies are renewed all the time, and in the database we create a new record with a new start date for the renewed policy number.
2. In the database we treat a policy as a service, so each record usually has a service serno assigned, which is usually the primary key.
3. If you do not know the primary key, you can use policy number, start date/end date, and policy type together for an accurate result.
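The composite-key idea in point 3 can be sketched as follows. This is an illustration in Python, not SAS. The field names (`policy_no`, `start_date`, `policy_type`), the helper `index_by`, and the sample records are assumptions made up for the example.

```python
def index_by(rows, key_fields):
    """Index rows by a composite key, refusing keys that are not unique."""
    index = {}
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key in index:
            raise ValueError(f"key {key} is not unique")
        index[key] = row
    return index

# Hypothetical composite key: policy number + start date + policy type.
KEY = ("policy_no", "start_date", "policy_type")

policies = [
    {"policy_no": "P1", "start_date": "2018-01-01", "policy_type": "HOME", "premium": 100},
    # Renewal: same policy number, new start date.
    {"policy_no": "P1", "start_date": "2019-01-01", "policy_type": "HOME", "premium": 110},
]
claims = [
    {"policy_no": "P1", "start_date": "2019-01-01", "policy_type": "HOME", "claim": 40},
]

policy_index = index_by(policies, KEY)

# Each claim matches at most one policy record because the key is unique.
matched = []
for claim in claims:
    key = tuple(claim[f] for f in KEY)
    if key in policy_index:
        matched.append({**policy_index[key], **claim})

print(matched)
```

Calling `index_by(policies, ("policy_no",))` on the same sample raises `ValueError`, because the renewal reuses the policy number.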

Reeza
Super User

You start with:

 

FEB_NBRS_WITHOUT_DT -> 5,918,065

SOURCE_33_COUNT -> 5,902,253

 

Your assumption: 5,918,065 - 5,902,253 = 15,812 records that are not in the first data set.

 

The total number of records remains 5,918,065. Of those:

  • 5,905,885 are in both data sets
  • 12,180 records are not in the first data set

 

I suggest you post some sample data (smaller, fake data) so we can illustrate how this can happen. Basically, when you have a many-to-many merge, a SAS data step merge doesn't combine the records properly, and you need to use a SQL join instead to fix this.
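As a rough illustration of why the counts stop adding up, here is a small sketch with fake data. It uses Python's built-in `sqlite3` module to stand in for a SQL join. The table names, key values, and the `sequential_pairs` figure are assumptions for the example. That last figure is only a simplified model of how a data step MERGE pairs rows one by one within a BY group, not SAS itself.

```python
import sqlite3
from collections import Counter

# Fake data: key K1 appears twice in BOTH inputs (many-to-many).
left = [("K1", "a1"), ("K1", "a2"), ("K2", "a3")]
right = [("K1", "b1"), ("K1", "b2"), ("K3", "b3")]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE left_t (k TEXT, a TEXT)")
con.execute("CREATE TABLE right_t (k TEXT, b TEXT)")
con.executemany("INSERT INTO left_t VALUES (?, ?)", left)
con.executemany("INSERT INTO right_t VALUES (?, ?)", right)

# A SQL inner join returns every combination of matching rows.
sql_rows = con.execute(
    "SELECT l.k, l.a, r.b FROM left_t AS l "
    "JOIN right_t AS r ON l.k = r.k ORDER BY l.a, r.b"
).fetchall()
con.close()

# Simplified model of row-by-row pairing within each shared key:
# the number of output rows per key is the larger of the two group sizes.
left_n = Counter(k for k, _ in left)
right_n = Counter(k for k, _ in right)
shared = left_n.keys() & right_n.keys()
sequential_pairs = sum(max(left_n[k], right_n[k]) for k in shared)

print(len(sql_rows), sequential_pairs)
```

The two approaches give different row counts for the same inputs, and neither count can be derived by subtracting one input's row count from the other's.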

 

 

Babloo
Rhodochrosite | Level 12
5905885 is less than the number of records in the source_33_count dataset. How is it possible?

Reeza
Super User

Can you please add some context to that statement? Pretend I can't see your computer, data, or code and have no idea what you're talking about.

 


@Babloo wrote:
5905885 is less than the number of records in the source_33_count dataset. How is it possible?

 
