Hi How do I convert the following merge code to SQL?
data SAP_Bill.erdo_main(sortedby=opbel); merge SAP_Bill.erdo_max SAP_Bill.erdk(in=a); by opbel; label erdk_frei_am='Invoice_Release_Date'; if a and first.opbel then output; run;
I have included the names bwlow of the columns from the input tables and the output table, i would like the SQL code to have the same Output
Erdo_Max Table (Input) | ||||||||
opbel | erdk_frei_am | |||||||
ERDK Table (Input) | ||||||||
intopbel | vkont | opbel | total_amnt | budat | simulated | stokz | tobreleasd | |
ERDO_Main (Output) | ||||||||
opbel | erdk_frei_am | intopbel | vkont | total_amnt | budat | simulated | stokz | tobreleasd |
Thanks In Advance
If you have a working and tested data-step, don't waste your time to convert it to sql. I don't know any case in which sql performed better.
Hi I agree, However I have to convert this to SQL because i have to re-write the Scripts in Hadoop
Why are you trying to elemental SAS?
Our Servers are moving from SAS server to a Hadoop Server
Hi,
What i understand from your code is, you want to left join both the tables and want only those records for the first instance of the id.
below is the code which can do it for you, you need to use your column names.
proc sql;
create table final as select *,monotonic() as rowid from test as a left join test1 as b
on a.opbel=b.opbel
group by a.opbel
having rowid=min(calculated rowid);
quit;
Hi I am a little confused how the code works, where would I put my column names
columns are being selected by *(all columns) if you want to specify only few columns, then specify the names in place of *
There is no way for SQL to replicate this logic, with a 100% guarantee. SQL makes no promises as to the order of the incoming observations, thus any tool that looks for the first observation might produce a different result in SQL vs. SAS (or even on different SQL runs of the same program). That includes the monotonic function.
If you will keep SAS on at least one server, what you can do is this:
data SAP_Bill.erdo_main (sortedby=opbel) / view=SAP_Bill.ordo_main;
merge SAP_Bill.ordo_max SAP_Bil.erdk (in=a);
by opbel;
label erdk_frei_am='Invoice_Release_Date';
if a abnd first.opbel then output;
run
That's the same program, but using a view instead of a data set. Then later:
proc sql;
create table somewhere_on_hadoop as
select * from SAP_Bill.erdo_main;
quit;
I'm not sure if that's helpful or not, since it still requires SAS. But there is no way that I know of to have SQL guarantee which observation will be "first" within a group.
Unfortunately, I am unable to access the SAS Server from Hadoop
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.