DATA Step, Macro, Functions and more

keeping only particular observations in full join in proc sql

Reply
Regular Contributor
Posts: 234

keeping only particular observations in full join in proc sql

I am using full join in Proc Sql to join two datasets. There are multiple same values of  key variable in both datasets so "many to many" join has been observed.  Here is the small part of  datasets (ie. subset of data set ).

 

data a;

 input id var1  date;

cards;

A   Car   1

A   Car   2

;

run;

 

data b;

 input id var1  va2;

cards;

A   Car    100

A   Car    200

;

run;

 

Here is the code I am using. 

 

proc sql noprint;

  create table t1 as

   select a.*,b.var2

      from  a as a

               full join 

                 b as b

                   on a.id=b.id and

                        a.var1=b.var1

quit;

 

Desired output:

id var1 date var2

A  Car   1       100

A  Car   2       200

 

Can many to many merge be prevented in full join with multiple key variables?

 

Thanks !

Respected Advisor
Posts: 4,654

Re: keeping only particular observations in full join in proc sql

SQL operations cannot depend on observations order.; but data step programs can:

 

data t2;
set a;
set b;
by id var1;
run;

proc print data=t2 noobs; run;
PG
Regular Contributor
Posts: 234

Re: keeping only particular observations in full join in proc sql

@PGStats.  Thank you. I have already applied full join in my large data set. The problem is occuring only when both key variable (s) have multiple same values and third key variable is not availabe for  joining purpose. 

Super User
Posts: 5,260

Re: keeping only particular observations in full join in proc sql

Typically you can't, this is of of the main characteristics of SQL. The best to prevent this is make sure that your source tables have the appropriate layout to used in a join.
Data never sleeps
Regular Contributor
Posts: 234

Re: keeping only particular observations in full join in proc sql

@LinusH.  Is there any way I can create third variable to merge in data step.  Thanks !

Super User
Posts: 5,260

Re: keeping only particular observations in full join in proc sql

Usually, it's hard to create a new key with a lower granular level on an existing table. Your sample data show too little of the possibilities. But it depends on the data. The best if you could get back to the data provider and ask for more details, and have tables that have a true primary key.

 

The other "leg" is what you actually want to achieve by matching those tables. What is the business rule/requirement?

Data never sleeps
Ask a Question
Discussion stats
  • 5 replies
  • 291 views
  • 3 likes
  • 3 in conversation