BookmarkSubscribeRSS Feed
HeatherNewton
Quartz | Level 8
data wdata.application (index=(application_ref_no));
    merge wdata.appication pmam_static pmsm_static;
    by application_ref_no;
run;

If I have a merge case like this, how can I write it into sql and get exactly the same result?

each of these data sets comes with many variables.

what if it is one to many case  and what if it is many to many case?

is it possible to write a program to interpret if if no joins in sql can produce same result? what do I need to know to start doing it...

4 REPLIES 4
PaigeMiller
Diamond | Level 26

First, not all merges can be converted into SQL and get exactly the same results. While much of the capabilities of SQL joins and DATA step merges overlap, some things are unique to either SQL or DATA steps.

 

So your very general questions, about what happens in certain situations cannot be answered completely without specific data sets to merge or join.

 

Even in the case you provide code for, without specific data sets we can't be 100% sure we can write the code.

 

 

 

 

--
Paige Miller
Tom
Super User Tom
Super User

If that is a many to many merge you are probably not going to like the results of that data step.  That three way merge only makes sense if it is 1 to 1 to 1. 

 

If you want to mimic a two dataset many to many merge with an SQL join then the first thing to do is add a counter for each observation the by groups to both datasets (something else that is impossible to do with only SQL syntax).   Let's assume you have added a variable REP that counts the observations for the same value of application_ref_no.

 

Then to join two datasets the join criteria might be:

select coalesce(right.application_ref_no,left.application_ref_no) as application_ref_no
    , max(left.rep,right.rep) as rep
    , .... other variables ...
from data1 left 
full join data2 right
on left.application_ref_no = right.application_ref_no
  and ((left.rep = right.rep)
       or (left.rep=max(left.rep) and right.rep > max(left.rep))
       or (right.rep=max(right.rep) and left.rep > max(right.rep))
group by calculated application_ref_no
order by application_ref_no, rep

And the ... other variables ... part will also be complicated if there are variables that are in both datasets.  In a data step merge the values from the dataset listed last "win", like the way I generated the combined application_ref_no varaible.

 

And extending that to mimic a three way data step merge is probably going to get mush harder.

Patrick
Opal | Level 21

Suggest you read (and understand) all the SAS Docu links posted here: https://communities.sas.com/t5/SAS-Programming/SAS-merge-SQL-equivalent/m-p/798286#M313828 

SAS merge and SQL Join statements are convertible with the exception of many:many relationships. 

 

It's really important that you understand the differences as this will allow you to take advantage of it plus will avoid that you implement code that returns results you didn't expect. RTM!

Kurt_Bremser
Super User

Maxim 3: Know Your Data.

Before this step runs (as it seems to overwrite the input dataset), check if there are multiple observations for a given application_ref_no in any of the datasets (I usually use PROC SORT with NODUPKEY and a temporary OUT= dataset, and look at the log).

Also, look at the variables contained in the datasets. Are there variables with equal names in more than one dataset (apart from the BY variable)?

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 396 views
  • 0 likes
  • 5 in conversation