BookmarkSubscribeRSS Feed
cadams47
Fluorite | Level 6

I use Proc Sql to Left Join in SAS all the time. But the last few months, I feel like the same code I always used makes the dataset explode instead of actually left join. I know this crazy. I am still just using SAS 9.4.

 

For example,

proc sql;
     create table wrds_cik6
     as select *, b.gvkey as comp_gvkey
     from wrds_cik5 as a left join wrds_cik_a as b
     on a.cik = b.cik;
quit;

 

wrds_cik5 only has about 180,000 observations. Yet, it returns 300,000,000 observations after this code. Why is it exploding? (I looked on this board for info to an almost identical question earlier, and the answer was, "left join has always done this.")

5 REPLIES 5
Reeza
Super User

You likely have duplicate CIK records in each table. If you have 5 CIK in Table1 and 3 in Table2 after the left join you'll have 15 records. You likely need to add another filter to the join or someway to reduce those duplicates.

 


@cadams47 wrote:

I use Proc Sql to Left Join in SAS all the time. But the last few months, I feel like the same code I always used makes the dataset explode instead of actually left join. I know this crazy. I am still just using SAS 9.4.

 

For example,

proc sql;
     create table wrds_cik6
     as select *, b.gvkey as comp_gvkey
     from wrds_cik5 as a left join wrds_cik_a as b
     on a.cik = b.cik;
quit;

 

wrds_cik5 only has about 180,000 observations. Yet, it returns 300,000,000 observations after this code. Why is it exploding? (I looked on this board for info to an almost identical question earlier, and the answer was, "left join has always done this.")


 

Urban_Science
Quartz | Level 8

Just like Reeza said, it appears that you have duplicate values for the joined variable in at least one of the tables that you are joining.

Here is some code to show this happening:

data work.have1;
	col1 = 1;
	col2 = 1;
	output;
	col1 = 2;
	col2 = 1;
	output;
	col1 = 3;
	col2 = 1;
	output;
run;

data work.have2;
	col1 = 1;
	col3 = 2;
	output;
	col1 = 2;
	col3 = 2;
	output;
	col1 = 4;
	col3 = 2;
	output;
run;


proc sql;
	create table work.want as 
		select *
			, b.col3
		from work.have1 as a
		left join work.have2 as b
			on a.col1 = b.col3
			;
quit;

Even though we are joining two tables with 3 rows each, the table created has 5 rows.

cadams47
Fluorite | Level 6

Ahhh. You're right. I have multiple cik's from my right-side table. Thank you both for the help.

ChrisNZ
Tourmaline | Level 20

Something like this *might* work around your issue, depending on your data.

proc sql;
  create table WRDS_CIK6
  as select c5.*, ca.GVKEY as COMP_GVKEY
  from WRDS_CIK5                                   c5 
         left join 
       (select unique CIK, GVKEY from WRDS_CIK_A)  ca
         on c5.CIK = ca.CIK;
quit;

 

ScottBass
Rhodochrosite | Level 12

When I run into this issue, I use this construct in SQL.  There are other ways in base SAS such as SORT or FREQ, but I usually revert to this construct because I work directly in the RDBMS a lot.

 

proc sql;
   create table test as
   select cik, count(0) as count
   from wrds_cik5
   group by cik
   having count(0) > 1;
quit;

proc sql;
   create table test as
   select cik, count(0) as count
   from wrds_cik_a
   group by cik
   having count(0) > 1;
quit;

The output should be zero records, meaning your keys (group by variables) uniquely identify each record.

 

For a LEFT JOIN, you can have dups in the left table, but if you have dups in the right table, your target table will "explode".


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1831 views
  • 0 likes
  • 5 in conversation