Desktop productivity for business analysts and programmers

How to convert SAS merge to SQL

Reply
Contributor
Posts: 25

How to convert SAS merge to SQL

HI,

 

I am new to SAS programming.

 

I am looking to convert below code in SQL

 

data strat.merge_calls_1;
merge strat.trans_1 (in=jkl) strat.pay_merge2 (in=iop);
by wclientcode szaccountnumber;
if jkl and iop;
run;

 

Can someone please tell me how merge will work here ?

 

Thanks In Advance

 

 

Esteemed Advisor
Posts: 6,646

Re: How to convert SAS merge to SQL

Strictly speaking, you can't, as the data step merge behaves in a certain way when dealing with multiple instances in both datasets that can't (easily) be reproduced in SQL. At least I never tried.

If you have a 1:1 or a 1:n relationship guaranteed, then

proc sql;
create table strat.merge_calls_1 as
select * from
strat_trans_1 a, strat_pay_merge2 b
where
a.wclientcode = b.wclientcode and
a.szaccountnumber = b.szaccountnumber
;
quit;

will do the trick.

But be advised that SQL joins of large datasets might carry a huge performance penalty vs. the data step merge.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 25

Re: How to convert SAS merge to SQL

Hi, Thank you for feedback.

Data has n:n relations, I tried with Inner Join as will with where condition but SAS Merge output not matches with SQL output.

Please help

Thanks
Esteemed Advisor
Posts: 6,646

Re: How to convert SAS merge to SQL

That is because of the behaviour of the data step merge with N:N relations.

Say, dataset a has three obs for by value x, and dataset b has 2.

During the merge, a-x-1 will be merged with b-x-1, a-x-2 will be merged with b-x-2, and a-x-3 will be merged with b-x-2, as there are no more obs in dataset b for value x. This results in 3 (maximum number of both datasets) observations.

SQL, OTOH, will merge all instances with each other, resulting in 6 observations in my example.

This is why I said that, strictly speaking, you can't replace a data step merge with SQL in a quick way. OTOH, the SQL behaviour may be desired in many cases, because you can't (easily) create the cartesian join with the data step.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Esteemed Advisor
Posts: 5,198

Re: How to convert SAS merge to SQL

As @KurtBremser states, they behave differently. And I would say in a N:N relationship it's nearly impossible to get the equal result.

But why do you need to change to SQL? 

One reason could be that in N.N relationship, the SQL way handling it is more predictable. But you on other hand wish the same result - so please explain the underlying requirement.

Data never sleeps
Contributor
Posts: 25

Re: How to convert SAS merge to SQL

Dear All,

 

Thank you very much for your feedback. Herewith I am attaching excel file which have 4 tabs

 

1. Data in Table1

2. Data in Table2

3. SAS output of below Merge statement

 

data strat.merge_calls_1;
merge strat.Table1 (in=jkl) strat.Table2 (in=iop);
by clientcode accountnumber;
if jkl and iop;
run;

 

4. SQL output of equivalent SQL statement for above merge statement

 

SELECT * FROM Table1 INNER JOIN Table2
ON Table1.ClientCode = Table2.ClientCode
AND Table1.AccountNumber = Table2.AccountNumber

 

As per documentation, Merge statement is equivalent to SQL inner join. But SAS output and SQL output are different.

 

Can someone please guide us what we are missing here ?

 

Thanks In Advance

Esteemed Advisor
Posts: 6,646

Re: How to convert SAS merge to SQL

Please post the log of the data step.

I will NOT open a MS Office document from the web. If you want to supply test data, do so in a data step with

cards;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 25

Re: How to convert SAS merge to SQL

Hi,

 

Greetings

 

I have created sample program as below to illustrate the same

 

PROGRAM:

 

 

data Test.Tab1;

input ClientCode 1-3 AccountNumber 4-5 dtTransactionDate$ 6-28;

datalines;

-1         3          07JAN2016:19:43:30.000

-1         3          12JAN2016:10:32:28.000

-1         3          25JAN2016:14:34:07.000

-1         3          02FEB2016:14:36:42.000

-1         3          17FEB2016:11:59:42.000

-1         3          22FEB2016:13:24:37.000

-1         3          03MAR2016:15:48:23.000

-1         3          08MAR2016:14:52:43.000

-1         3          21MAR2016:13:20:05.000

-1         3          31MAR2016:11:45:24.000

-1         3          05APR2016:11:43:41.000

-1         3          13APR2016:08:01:06.000

-1         3          13APR2016:08:03:27.360

-1         3          19APR2016:12:16:21.000

-1         3          25APR2016:16:35:46.000

-1         3          03MAY2016:13:25:13.000

-1         3          27MAY2016:10:41:50.000

-1         3          01JUN2016:09:30:26.000

-1         3          01JUN2016:09:37:02.000

;

run;

 

 

proc sort data=Test.Tab1;

by ClientCode AccountNumber;

run;

 

 

data Test.Tab2;

input ClientCode 1-3 AccountNumber 4-5 dtPaymentDate$ 6-28 curpayment 29-32 aa 33-34 bb 35-36 dtpaymentdate2$ 37-59 curpayment2 60-63 unqkey 64-65;

datalines;

-1         3          06NOV2015:00:00:00.000     350      1          0          26JAN2016:00:00:00.000       500      1

-1         3          26JAN2016:00:00:00.000       500      2          1          01APR2016:00:00:00.000      450      2

-1         3          01APR2016:00:00:00.000      450      3          2          26APR2016:00:00:00.000      500      3

-1         3          26APR2016:00:00:00.000      500      4          3          31MAY2016:00:00:00.000     500      4

;

run;

 

proc sort data=Test.Tab2;

by ClientCode AccountNumber;

run;

 

data Test.Tab_Merge;

            merge Test.Tab1 (in=jkl) Test.Tab2 (in=iop);

            by ClientCode AccountNumber;

 

            if jkl=1 and iop=1;

run;

proc print data=Test.Tab_Merge;

run;

 

Output Of Merge Statement is as below

clientcodeAccountnumberdtTransactionDateclientcodeAccountnumberdtPaymentDatecurpaymentaabbdtpaymentdate2curpayment2unqkey
-1307JAN2016:19:43:30.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1312JAN2016:10:32:28.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1325JAN2016:14:34:07.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1302FEB2016:14:36:42.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1317FEB2016:11:59:42.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1322FEB2016:13:24:37.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1303MAR2016:15:48:23.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1308MAR2016:14:52:43.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1321MAR2016:13:20:05.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1331MAR2016:11:45:24.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1305APR2016:11:43:41.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1313APR2016:08:01:06.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1313APR2016:08:03:27.360-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1319APR2016:12:16:21.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1325APR2016:16:35:46.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1303MAY2016:13:25:13.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1327MAY2016:10:41:50.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1301JUN2016:09:30:26.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001
-1301JUN2016:09:37:02.000-1306NOV2015:00:00:00.0003501026JAN2016:00:00:00.0005001

 

It contains 19 rows, however its equivalent SQL Inner join gives 76 rows.

 

Can someone please assist me, how SAS merge gives 19 rows as output ?

 

Thanks In Advance

Esteemed Advisor
Posts: 6,646

Re: How to convert SAS merge to SQL

Read my previous post again where I described how the data step merge works. What you see is perfectly normal behaviour for a data step merge when you have a M:N relationship.

If you want to see all possible combinations, you must use SQL. Period.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 25

Re: How to convert SAS merge to SQL

Hi,

 

Thanks for feedback.

 

So I need to search other ways than INNER JOIN to produce identical output in SQL.

Will try to creat some routines for same in SQL, if possible can you suggest us a way to achieve it in SQL ?

 

Thanks In Advance

 

 

Esteemed Advisor
Posts: 6,646

Re: How to convert SAS merge to SQL

[ Edited ]

Why do you need to do it in SQL, anyway? If your result with the data step is fine, then stay with it. No need to jump through hoops just to achieve the same result in SQL. Not even as an exercise, as SQL just isn't meant for this kind of join.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Learner
Posts: 1

Re: How to convert SAS merge to SQL

We need to convert merge statement of SAS data step to inner join for SQL Server stored procedure,not for proc sql.Please guide us on that.

Thanks

Esteemed Advisor
Posts: 6,646

Re: How to convert SAS merge to SQL

The only idea that came to my mind for doing that in SQL is

- first, for every BY group in the two tables, create a counter variable, starting with 1 in every group

- then, do an outer join on the original BY variable and the counter variable.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Esteemed Advisor
Posts: 5,198

Re: How to convert SAS merge to SQL

No.
Data never sleeps
Ask a Question
Discussion stats
  • 13 replies
  • 787 views
  • 0 likes
  • 4 in conversation