Reeza
Super User

You joined on the wrong fields initially. To merge correctly you need to know which fields uniquely identify a record. Your original key did not, which is why you got a many-to-many merge. With the new key, date, each record is uniquely identified.
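To make the "unique key" idea concrete, here is a minimal sketch in Python (not SAS) of how one might test whether a candidate key repeats on both sides of a merge. The field names `nbr` and `date` and the sample rows are hypothetical, not taken from the poster's data.

```python
from collections import Counter

def duplicated_keys(rows, key):
    """Return the key values that occur more than once in rows."""
    counts = Counter(row[key] for row in rows)
    return {value for value, n in counts.items() if n > 1}

# Hypothetical sample data: "nbr" repeats on both sides, "date" does not.
left = [
    {"nbr": 101, "date": "2024-02-01"},
    {"nbr": 101, "date": "2024-02-02"},
    {"nbr": 102, "date": "2024-02-03"},
]
right = [
    {"nbr": 101, "date": "2024-02-01"},
    {"nbr": 101, "date": "2024-02-02"},
    {"nbr": 102, "date": "2024-02-03"},
]

# Key values repeated on BOTH sides are what make a merge many-to-many.
many_to_many_on_nbr = duplicated_keys(left, "nbr") & duplicated_keys(right, "nbr")
many_to_many_on_date = duplicated_keys(left, "date") & duplicated_keys(right, "date")

print(many_to_many_on_nbr)   # {101}
print(many_to_many_on_date)  # set()
```

An empty result for a candidate key means no value repeats on both sides, so a merge on that key is not many-to-many.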

 

I suggest taking a smaller subset, merging it and looking at some records manually to understand how this happened. 

 

All the explaining we do won't be as effective as you figuring out how to test it and look at it. 

 

It helps to understand the problem in context before you program it. And of course, "know thy data" is the golden rule of analysis.

Satish_Parida
Lapis Lazuli | Level 10

Something from the banking domain:
1. Policies are renewed all the time, and in the database we create a new record with a new start date for the renewed policy number.
2. In the database we treat a policy as a service, so each record usually has a service serno assigned to it, which is usually the primary key.
3. If you do not know the primary key, you can use policy number, start date/end date, and policy type together for an accurate result.
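As a rough illustration of point 3, the sketch below (Python, with made-up records and hypothetical field names `policy_no`, `start_date`, `policy_type`) checks whether a set of fields identifies each record exactly once. A renewed policy makes the policy number alone non-unique, while the combined key is unique in this sample.

```python
from collections import Counter

# Hypothetical policy records: policy P1 was renewed, so it appears twice.
policies = [
    {"policy_no": "P1", "start_date": "2023-01-01", "policy_type": "AUTO"},
    {"policy_no": "P1", "start_date": "2024-01-01", "policy_type": "AUTO"},
    {"policy_no": "P2", "start_date": "2023-06-01", "policy_type": "HOME"},
]

def is_unique(rows, fields):
    """True if the combination of fields identifies every row exactly once."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    return all(n == 1 for n in counts.values())

policy_no_only = is_unique(policies, ["policy_no"])
composite = is_unique(policies, ["policy_no", "start_date", "policy_type"])

print(policy_no_only)  # False: the renewal repeats policy_no P1
print(composite)       # True: the combined key is unique in this sample
```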

Reeza
Super User

You start with:

 

FEB_NBRS_WITHOUT_DT -> 5,918,065

SOURCE_33_COUNT -> 5,902,253

 

Your assumption: 5,918,065 - 5,902,253 = 15,812 records that are not in the first data set.

 

The total number of records remains 5,918,065, of those:

  • 5,905,885 are in both data sets
  • 12,180 records are not in the first data set
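The counts above can be checked with a few lines of arithmetic. This Python sketch uses only the numbers quoted in this post; the variable names are just labels for them. The last line is an observation rather than a diagnosis: taken at face value, the "in both" count is larger than SOURCE_33_COUNT, which would be consistent with some key values matching more than one record.

```python
# Counts quoted in the post.
feb_nbrs_without_dt = 5_918_065
source_33_count = 5_902_253
in_both = 5_905_885
not_in_first = 12_180

# The poster's assumption: a plain difference of the two row counts.
assumed_unmatched = feb_nbrs_without_dt - source_33_count

# What the merge reported: the two categories add up to the total.
reported_total = in_both + not_in_first

# Matched records in excess of the rows in SOURCE_33_COUNT.
surplus_matches = in_both - source_33_count

print(assumed_unmatched)  # 15812
print(reported_total)     # 5918065
print(surplus_matches)    # 3632
```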

 

I suggest you post some sample data (a small, fake data set) so we can illustrate how this can happen. Basically, when you have a many-to-many merge, the SAS data step merge does not combine the records the way you expect, and you need to use a SQL join instead to fix this.
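To show why the two approaches give different row counts, here is a small Python sketch (not SAS) that only counts output rows. It assumes, as I understand the documented behavior, that a data step MERGE with BY pairs observations one-to-one within each BY group, so a group yields as many rows as the larger side has, while a SQL inner join yields every combination. The function names and sample keys are hypothetical.

```python
from collections import Counter

def data_step_merge_rows(left_keys, right_keys):
    """Row count of a match-merge: max of the two group sizes per key."""
    l, r = Counter(left_keys), Counter(right_keys)
    return sum(max(l[k], r[k]) for k in l.keys() | r.keys())

def sql_inner_join_rows(left_keys, right_keys):
    """Row count of an inner join: product of the two group sizes per key."""
    l, r = Counter(left_keys), Counter(right_keys)
    return sum(l[k] * r[k] for k in l.keys() & r.keys())

# Hypothetical BY values: "A" repeats on both sides (many-to-many).
left_keys = ["A", "A", "A", "B"]
right_keys = ["A", "A", "B", "C"]

merge_rows = data_step_merge_rows(left_keys, right_keys)
join_rows = sql_inner_join_rows(left_keys, right_keys)

print(merge_rows)  # 5: A -> max(3, 2), B -> 1, C -> 1
print(join_rows)   # 7: A -> 3 * 2, B -> 1 * 1
```

With unique keys on at least one side the two counts agree for the matching keys. They diverge only where a key repeats on both sides.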

 

 

Babloo
Rhodochrosite | Level 12
5905885 is less than the number of records in the source_33_count dataset. How
is it possible?
Reeza
Super User

Can you please add some context to that statement? Pretend I can’t see your computer, data, or code and have no idea what you’re talking about.

 


@Babloo wrote:
5905885 is less than the number of records in the source_33_count dataset. How
is it possible?

 
