Re: The Order of SQL JOIN Result

Slash · Posted 06-27-2017 09:10 AM

Hi, guys

Recently, I've met a strange problem. I can't understand that. Hopefully, you guys can explain to me. Here is it:

Let's see an basic result:

/* 2 variables and 1 observation in table1 */
data table1;
	id="001";
	num=uniform(1);
run;

/* 2 variables and 5 observation in table2*/
data table2;
   	input id $ score;
cards;
001 80
001 80
001 90
001 60
001 85
;
run;

/* Inner Join table1 and table2*/
proc sql;
	create table result as
	select a.*,b.score
	from table1 a inner join table2 b
	on a.id=b.id;
quit;
proc print;run;

The Data Set Result is:

Notice the order of variable SCORE, is just same as the original order in table2.

But when the number of variables in table1 increases, magical thing happened. Let's see:

/* 12 variables and 1 observation in table1 */
data table1;
	id="001";
	array num{10} ;
	do i=1 to 10;
		num{i}=uniform(i)+i;
	end;
run;

/* 2 variables and 5 observation in table2 */
data table2;
	input id $ score;
cards;
001 80
001 80
001 90
001 60
001 85
;
run;

/* Inner Join table1 and table2*/
proc sql;
	create table result as
	select a.*,b.score
	from table1 a inner join table2 b
	on a.id=b.id;
quit;
proc print;run;

The order of variable SCORE in Data Set RESULT has changed:

So, I'm confused. Why did it change?

Next, I run a test, to see the relation between the number of the variables in data set table1 and the order of varialbe SCORE in data set RESULT. Here is the code:


%macro create_table1(count);
%local i;
data table1;
	id="001";
	%do i=1 %to &count;
		num&i=uniform(&i);
	%end;
run;
%mend create_table1;


%macro test_join;

proc sql;
	drop table differ_all;
quit;

%local i;
%do i=1 %to 20;

	%create_table1(&i)

	proc sql;
		create table result&i as
		select a.*,b.score
		from table1 a inner join table2 b
		on a.id=b.id;
	quit;


	data differ&i;
		merge result(keep=score rename=(score=base)) result&i(keep=score rename=(score=new)) end=last;
		length score_base score_new $ 20 differ_flag $ 5;
		retain score_base score_new;
		score_base=catx(',',score_base,base);
		score_new=catx(',',score_new,new);
		

		if last then do;
			var_count=&i+1;/*plus the variable ID*/
			differ_flag=ifc(score_base=score_new,'true','false');
			output;
		end;
		keep var_count score_base score_new differ_flag;
	run;


	proc append base=differ_all data=differ&i force;
	run;
%end;

proc print data=differ_all;run;

%mend test_join;



data table1;
	id="001";
	num=uniform(1);
run;
data table2;
	input id $ score;
cards;
001 80
001 80
001 90
001 60
001 85
;
run;
proc sql;
	create table result as
	select a.*,b.score
	from table1 a inner join table2 b
	on a.id=b.id;
quit;


%test_join

And the result is:

When the number of variables of data set table1 equals 11, then order of SCORE in join table has changed. I test a few times, it's the same.

So, anyone can tell me the reason? And I can't find anything about that in SAS Help and Documents.

SuzanneDorinski · Posted 07-15-2017 06:39 PM

See SQL Joins -- The Long and The Short of It, available at https://support.sas.com/techsup/technote/ts553.html.

If you add _method to the PROC SQL statement, you'll get notes in the log show the SQL execution methods chosen.

The notes in the log for the first 9 iterations of your test_join macro look like this:

 NOTE: SQL execution methods chosen are:
 
       sqxcrta
           sqxjhsh
               sqxsrc( WORK.TABLE2(alias = B) )
               sqxsrc( WORK.TABLE1(alias = A) )
 NOTE: Table WORK.RESULT1 created, with 5 rows and 3 columns.

The notes in the log for the 10th through 20th iterations of your test_join macro look like this:

 NOTE: SQL execution methods chosen are:
 
       sqxcrta
           sqxjhsh
               sqxsrc( WORK.TABLE1(alias = A) )
               sqxsrc( WORK.TABLE2(alias = B) )
 NOTE: Table WORK.RESULT10 created, with 5 rows and 12 columns.

Slash · Posted 08-01-2017 12:22 AM

Thanks! That is very helpful.

The Order of SQL JOIN Result

Re: The Order of SQL JOIN Result

Re: The Order of SQL JOIN Result

SAS Innovate 2025: Register Now

SAS Training: Just a Click Away