BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
abdulla
Pyrite | Level 9

I am using the following code to left join

proc sql;
create table want as
select *
from final3 left join gov_proxy
on final3.gvkey=gov_proxy.gvkey and
final3.fyear=gov_proxy.fyear;
quit;

I have 63,654 observations in my final3 dataset and 

I have 186679 observations in my gov-proxy dataset 

I want to have 63,654 dataset but I am getting 63888 dataset after deleting the duprecs after merging. 

I have no duprecs in any one of the dataset before merging. 

Can anyone help me with this? 

1 ACCEPTED SOLUTION

Accepted Solutions
SASKiwi
PROC Star

This isn't a duplicate record problem, it is a duplicate key problem. Your join results indicate that you either have duplicate key values in either final3 or gov_proxy or both. You can test this easily:

proc sort data = final3 out = dupkey nodupkey;
  by gvkey fyear;
run;

If dupkey has less records than final3 then you know it contains duplicate key values. Then the question is should the data have duplicate key values and if not how do you wish to fix this?

View solution in original post

3 REPLIES 3
PGStats
Opal | Level 21

You may not have any duplicate records, but you have at least 234 duplicate keys (gvkey-fyear) in gov_proxy.

PG
SASKiwi
PROC Star

This isn't a duplicate record problem, it is a duplicate key problem. Your join results indicate that you either have duplicate key values in either final3 or gov_proxy or both. You can test this easily:

proc sort data = final3 out = dupkey nodupkey;
  by gvkey fyear;
run;

If dupkey has less records than final3 then you know it contains duplicate key values. Then the question is should the data have duplicate key values and if not how do you wish to fix this?

Ksharp
Super User
Or you could have missing value in gvkey or fyear .

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1717 views
  • 4 likes
  • 4 in conversation