BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Paul_NYS
Obsidian | Level 7

Hi Everyone

I have the below very simple proc sort and one merge statement. In the second proc statement, there is a group of records that is not being included in the final merge data step. I used this code elsewhere with different data sets and it worked fine.

I know I did not include the data sets to review, but is there something here syntax-wise that would eliminate certain records?

Paul

proc sort data=s1ageOutALLAGEALL out=s1ageOutSortALLAGEALL;

by cnty_name startyear;

run;

proc sort data=s1AgeCleanALLAGEALL out=s1AgeCleanSortALLAGEALL;

by cnty_name startyear;

run;

data s1AgeCleanOutALLAGEALL;

merge s1AgeCleanSortALLAGEALL s1ageOutSortALLAGEALL;

by cnty_name startyear;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

I think if you merge and use the in option you can find the reason why....

data s1AgeCleanOutALLAGEALL;

merge s1AgeCleanSortALLAGEALL (in=clean) s1ageOutSortALLAGEALL (in=all) indsname=source;

by cnty_name startyear;

if clean or all;

source_file=source;

run;

View solution in original post

9 REPLIES 9
Haikuo
Onyx | Level 15

Hi,

Your code looks fine to me. I suspect that the BY variables of that last group got no match. Sometimes they look same while they are NOT. Your could try using hex. format to find the hidden culprit.

Haikuo

Reeza
Super User

I think if you merge and use the in option you can find the reason why....

data s1AgeCleanOutALLAGEALL;

merge s1AgeCleanSortALLAGEALL (in=clean) s1ageOutSortALLAGEALL (in=all) indsname=source;

by cnty_name startyear;

if clean or all;

source_file=source;

run;

Paul_NYS
Obsidian | Level 7

I ran the below code with the 'in' statements and the missing records were were still not in the final record set:

data s1AgeCleanOutALLAGEALL;

merge s1AgeCleanSortALLAGEALL (in=clean) s1ageOutSortALLAGEALL (in=all);

by cnty_name startyear;

if clean;

if all;

run;

Reeza
Super User

You need to have if clean or all;

not if clean; if all;

The first one selects records in either dataset, the second only grabs records in both datasets.

Paul_NYS
Obsidian | Level 7

It didn't work that way either. What is happening is that one set of records is overwriting the 'missing' ones. I don't understand why though, but I am trying a merge after the merge.

Paul

Reeza
Super User

I think you need to provide more info.

Something to check:

Do you have unique records per county/year in each file or do both have multiple county/years?

For the records that aren't being included you say they're being overwritten...that means they're being included somehow, but not what you want?

Does SQL give you what you want:

proc sql;

create table merged as

select * from

s1ageOutALLAGEALL as a

join s1AgeCleanALLAGEALL as b

on a.cnty_name=b.cnty_name

and a.year=b.year;

quit;

Paul_NYS
Obsidian | Level 7

If there was no match, there would be other records missing also. Its just this one group.

art297
Opal | Level 21

Can you post a copy of your log?

SASKiwi
PROC Star

One reason why you may not see all records in your MERGE is if cnty_name and startyear BY variables do not uniquely identify the rows in at least one of your input tables. MERGE cannot handle a many-to-many join, only one-to-one or one-to-many.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1763 views
  • 0 likes
  • 5 in conversation