hi guys,
I am a bit confused regarding using the join or the join statements, we don't know which is more efficient, option 1 or option 2. We are using SAS EG, 6.1 and the data is quite huge in the tables, almost 6Million records.
We have used the below option 1 to fetch the required datsets using the sub-query but there another approach which others as suggesting to use the option 2 to use the joins.
As per them, option 1 splits the query as 'where' and 'OR' clause, and split of all the product ids,
like 'where product_id = 782656 or product_id =78555 or product_id = 55421268 and so on.....'
/* dataset having the required product_idd with price>100 */
proc sql;
create table all_items as select distinct(product_id) from catalogue where price >100;
quit;
/*Option1:*/
proc sql;
create table items as
select product_id, item_id, price, description from catalogue where product_id in (select * from all_items);
quit;
/*Option2:*/
proc sql;
create table items as
select a.product_id, a.item_id, a.price, a.description from catalogue as a
inner join all_items as b
on a.product_id = b.product_id;
quit;
However as per my understanding both queries are taking same time, does sas automatically optimises the proc sql queries and gives the same result and in same cpu and time usage?
Thanks for the help!
... View more