BookmarkSubscribeRSS Feed
zdassu
Quartz | Level 8

Hi How do I convert the following merge code to SQL?

 

data SAP_Bill.erdo_main(sortedby=opbel);
        merge   SAP_Bill.erdo_max
  SAP_Bill.erdk(in=a);
        by              opbel;
        label   erdk_frei_am='Invoice_Release_Date';
        if a and first.opbel then output;
run;

I have included the names bwlow of the columns from the input tables and the output table, i would like the SQL code to have the same Output

 

Erdo_Max Table (Input)        
opbelerdk_frei_am       
         
ERDK Table (Input)        
intopbelvkontopbeltotal_amntbudatsimulatedstokztobreleasd 
         
ERDO_Main (Output)        
opbelerdk_frei_amintopbelvkonttotal_amntbudatsimulatedstokztobreleasd

 

Thanks In Advance

9 REPLIES 9
andreas_lds
Jade | Level 19

If you have a working and tested data-step, don't waste your time to convert it to sql. I don't know any case in which sql performed better.

zdassu
Quartz | Level 8

Hi I agree, However I have to convert this to SQL because i have to re-write the Scripts in Hadoop

VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

Why are you trying to elemental SAS?

 

zdassu
Quartz | Level 8

Our Servers are moving from SAS server to a Hadoop Server

 

Shivam
Calcite | Level 5

Hi,

What i understand from your code is, you want to left join both the tables and want only those records for the first instance of the id.

below is the code which can do it for you, you need to use your column names.

proc sql;
create table final as select *,monotonic() as rowid from test as a left join test1 as b 
on a.opbel=b.opbel
group by a.opbel
having rowid=min(calculated rowid);
quit;
zdassu
Quartz | Level 8

Hi I am a little confused how the code works, where would I put my column names

Shivam
Calcite | Level 5

columns are being selected by *(all columns) if you want to specify only few columns, then specify the names in place of *

Astounding
PROC Star

There is no way for SQL to replicate this logic, with a 100% guarantee.  SQL makes no promises as to the order of the incoming observations, thus any tool that looks for the first observation might produce a different result in SQL vs. SAS (or even on different SQL runs of the same program).  That includes the monotonic function.  

 

If you will keep SAS on at least one server, what you can do is this:

data SAP_Bill.erdo_main (sortedby=opbel) / view=SAP_Bill.ordo_main;
   merge SAP_Bill.ordo_max SAP_Bil.erdk (in=a);
   by opbel;
   label erdk_frei_am='Invoice_Release_Date';
   if a abnd first.opbel then output;
run

That's the same program, but using a view instead of a data set.  Then later:

proc sql;
   create table somewhere_on_hadoop as 
   select * from SAP_Bill.erdo_main;
quit;

I'm not sure if that's helpful or not, since it still requires SAS.  But there is no way that I know of to have SQL guarantee which observation will be "first" within a group.

 

 

zdassu
Quartz | Level 8

Unfortunately, I am unable to access the SAS Server from Hadoop

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1311 views
  • 0 likes
  • 5 in conversation