BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
abdulla
Pyrite | Level 9

I am using the following code to left join

proc sql;
create table want as
select *
from final3 left join gov_proxy
on final3.gvkey=gov_proxy.gvkey and
final3.fyear=gov_proxy.fyear;
quit;

I have 63,654 observations in my final3 dataset and 

I have 186679 observations in my gov-proxy dataset 

I want to have 63,654 dataset but I am getting 63888 dataset after deleting the duprecs after merging. 

I have no duprecs in any one of the dataset before merging. 

Can anyone help me with this? 

1 ACCEPTED SOLUTION

Accepted Solutions
SASKiwi
PROC Star

This isn't a duplicate record problem, it is a duplicate key problem. Your join results indicate that you either have duplicate key values in either final3 or gov_proxy or both. You can test this easily:

proc sort data = final3 out = dupkey nodupkey;
  by gvkey fyear;
run;

If dupkey has less records than final3 then you know it contains duplicate key values. Then the question is should the data have duplicate key values and if not how do you wish to fix this?

View solution in original post

3 REPLIES 3
PGStats
Opal | Level 21

You may not have any duplicate records, but you have at least 234 duplicate keys (gvkey-fyear) in gov_proxy.

PG
SASKiwi
PROC Star

This isn't a duplicate record problem, it is a duplicate key problem. Your join results indicate that you either have duplicate key values in either final3 or gov_proxy or both. You can test this easily:

proc sort data = final3 out = dupkey nodupkey;
  by gvkey fyear;
run;

If dupkey has less records than final3 then you know it contains duplicate key values. Then the question is should the data have duplicate key values and if not how do you wish to fix this?

Ksharp
Super User
Or you could have missing value in gvkey or fyear .

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1342 views
  • 4 likes
  • 4 in conversation