Aexor
Lapis Lazuli | Level 10

Hi All,

 

I need your help to understand whether too many left joins will impact performance. I am currently testing on sample data and my code works fine, but I have a feeling that for larger volumes the joins will take more time.

 

Here is a snippet of my code, where I am joining 7 tables (6 left joins):

 

proc sql noprint;
    create table work.mandate_unapproved_res_tmp1 as
    select
        t1.order_id as plan_id,
        t1.sales_units_qty,
        t1.revenue_amt,
        t1.revenue_no_vat_amt,
        t1.margin_amt,
        t1.promotion_spend_amt,
        t1.cost_amt,
        t2.plan_nm as plan_nm,
        t2.plan_desc as plan_desc,
        t2.start_dt as start_dt,
        t2.end_dt as end_dt,
        t2.plan_status_no as plan_status_no,
        t2.plan_approval_flg as plan_approval_status_no,
        t3.price_type_no as price_type_no,
        t3.price_value_no as price_value_no,
        t3.price_value_rec_flg as price_value_rec_flg,
        t3.code as vehicle_cd,
        t4.num_prod_no as num_prod_no,
        t5.num_geo_no as num_geo_no,
        t6.num_prod_geo_no as num_prod_geo_no,
        t7.rule_val as obj_func
    from work.mandate_unapproved_res_t1 t1
    left join &gv_pricing_libname..main_plan t2
        on upcase(t1.order_id) = upcase(t2.order_id)
    left join work.pp_vehicle t3
        on upcase(t1.order_id) = upcase(t3.order_id)
    left join work.unapproved_prod_sku t4
        on upcase(t1.order_id) = upcase(t4.order_id)
    left join work.unapproved_geo_sku t5
        on upcase(t1.order_id) = upcase(t5.order_id)
    left join work.temp_prd_geo_cnt t6
        on upcase(t1.order_id) = upcase(t6.order_id)
    left join work.mandate_plan_rule t7
        on upcase(t1.order_id) = upcase(t7.order_id);
quit;
 
Any suggestion or help will be much appreciated.
 
Thanks!

 

 

2 REPLIES
awesome_opossum
Obsidian | Level 7

If you mean computing performance, something you might consider is that proc sql allows for multi-threaded processing.  SAS defaults to 4 processors, but if your computer/system has more processors, you can increase that number. 

 

/* evaluate CPU usage default for new session / since last CPU setting during session */ 
proc options option=cpucount;
run; /* default CPU count = 4 */ 

/* count and use all CPUs available for multi-threaded procs */ 
options threads cpucount=actual;
proc options option=cpucount;
run; /* count and use max CPU count available; my comp = 20 */ 

/* set CPUs count to use manually for multi-threaded procs */ 
options threads cpucount=19;
proc options option=cpucount;
run; /* max CPU count (20) - 1 = 19 (as not to overload other programs/processing) */ 

 

 

Patrick
Opal | Level 21

The SAS SQL optimizer has its limits, and joins with a lot of tables can result in a sub-optimal execution path. The PROC SQL options _method and _tree write to the SAS log how SAS executes the joins.
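As a rough illustration of those two options, here is a minimal sketch on made-up data. The tables work.t_base and work.t_prod and the output work.join_check are hypothetical stand-ins for the tables in the question; only the _method and _tree options on the PROC SQL statement are the point.

```sas
/* Hypothetical sample data standing in for the tables in the question */
data work.t_base;
    length order_id $8;
    input order_id $ sales_units_qty;
datalines;
a1 10
B2 20
c3 30
;
run;

data work.t_prod;
    length order_id $8;
    input order_id $ num_prod_no;
datalines;
A1 5
b2 7
;
run;

/* _method writes the chosen join methods to the log,  */
/* _tree writes the internal query tree to the log.    */
proc sql _method _tree;
    create table work.join_check as
    select t1.order_id as plan_id,
           t1.sales_units_qty,
           t4.num_prod_no
    from work.t_base t1
    left join work.t_prod t4
        on upcase(t1.order_id) = upcase(t4.order_id);
quit;
```

In the log, look for method codes such as sqxjm (merge join), sqxjhsh (hash join), sqxjsl (step loop join) and sqxsort (sort) to see which path the optimizer picked for each join.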

Proc SQL needs to implicitly sort the tables along the join key. In your case you join all tables with the same column so a single sort per table will suffice. I would expect this join to work as efficiently as possible.

Multithreading only happens for the implicit sort operations, and the installation defaults are normally appropriate. You should only have to change this on rare occasions.

 

If your base table is the big table and the lookup tables for the left joins are the "small" ones, then another approach is using SAS data step hash tables. That normally beats any other approach in terms of performance because it avoids the need for any sorting.
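A minimal sketch of the hash approach for one lookup table could look like the following. The table names (work.t_base, work.t_prod, work.hash_join) and the helper variables key_uc and rc are hypothetical, and the sketch assumes the lookup table has at most one row per uppercased order_id; with duplicate keys a SQL left join would return several rows, while this lookup keeps only the last one loaded.

```sas
/* Hypothetical sample data: a "big" base table and a "small" lookup table */
data work.t_base;
    length order_id $8;
    input order_id $ sales_units_qty;
datalines;
a1 10
B2 20
c3 30
;
run;

data work.t_prod;
    length order_id $8;
    input order_id $ num_prod_no;
datalines;
A1 5
b2 7
;
run;

data work.hash_join(drop=rc key_uc);
    length key_uc $8;

    /* Load the lookup table into memory once, keyed on the uppercased id */
    if _n_ = 1 then do;
        declare hash h_prod();
        h_prod.defineKey("key_uc");
        h_prod.defineData("num_prod_no");
        h_prod.defineDone();
        do until (eof_prod);
            set work.t_prod(keep=order_id num_prod_no) end=eof_prod;
            key_uc = upcase(order_id);
            rc = h_prod.replace();
        end;
    end;

    /* Read the base table sequentially; no sorting is required */
    set work.t_base;
    key_uc = upcase(order_id);

    /* Left-join behaviour: keep the base row, blank the lookup value if no match */
    if h_prod.find() ne 0 then call missing(num_prod_no);
run;
```

With this sample data, the rows a1 and B2 pick up num_prod_no 5 and 7, and c3 is kept with a missing value. For the query in the question you would declare one hash object per lookup table and call find() on each inside the same data step.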
