DATA Step, Macro, Functions and more

SAS Code Optimization

Reply
Occasional Contributor
Posts: 10

SAS Code Optimization

[ Edited ]
proc sql;
create table old_product_sales as
select
SFDC_Member_ID,
(sum(case when Gluten_free not in ('N/A') then Unit_Price else 0 end)/sum(Unit_Price))*100 as CP_Gluten_Free format comma10.2,
(sum(case when Non_GMO not in ('N/A') then Unit_Price else 0 end)/sum(Unit_Price))*100 as non_gmo_sales format comma10.2,
(sum(case when Organic not in ('N/A') then Unit_Price else 0 end)/sum(Unit_Price))*100 as CP_Organic format comma10.2,
(sum(case when Organic_Validated not in ('N/A') then Unit_Price else 0 end)/sum(Unit_Price))*100 as Organic_Valid_sales format comma10.2,
(sum(case when Whole_Trade_Guarantee not in ('N/A') then Unit_Price else 0 end)/sum(Unit_Price))*100 as CP_Whole_Trade format comma10.2
from old_tran_prod_sync_master1
where estdate GT '28Jun2017'd- &x2.  and estdate LE '28Jun2017'd 
group by 1;
quit;

I have a few codes (one of the log below). Its taking long, Is there a way to optimise this?

1))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))

413 proc sql;
414 create table Tran_Prod_Sync_master_1a1 as select
415 Activity_Timestamp,
416 Adjusted_Timestamp,
417 Base_Points,
418 Bonus_Points,
419 Last_Update,
420 Profit,
421 SB_Internal_ID,
422 SB_Member_ID,
423 SB_Transaction_ID,
424 SFDC_Member_ID,
425 case when Store_Number=10058 then 10496
426 when Store_Number=10067 then 10514
427 else Store_Number end
428 as Store_Number,
429 Transaction_Reference,
430 case when Transaction_Type_Reference='GMP' then 'NRP'
431 when Transaction_Type_Reference='GMR' then 'NRR'
432 else Transaction_Type_Reference end
433 as Transaction_Type_Reference,
434 UPC,
20 The SAS System 08:00 Saturday, July 15, 2017

435 Unit_Price
436 from tsync.Transactions_SB_Affinity_Product
437 where Store_Number in (10052,10053,10058,10067,10086,10100,10112,10187,10249,10353,10526,10496,10514) and
437 ! length(SFDC_Member_Id)=18;
NOTE: Table WORK.TRAN_PROD_SYNC_MASTER_1A1 created, with 181786120 rows and 15 columns.

438 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 14:44.42
user cpu time 2:10.62
system cpu time 54.96 seconds
memory 5444.96k
OS Memory 25708.00k
Timestamp 07/15/2017 08:19:15 AM
Step Count 44 Switch Count 1467

 

 

2)))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))

LOG for code::
SYMBOLGEN: Macro variable X2 resolves to 84
856 group by 1;

NOTE: Table WORK.OLD_PRODUCT_SALES created, with 235712 rows and 6 columns.

857 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 21:55.01
user cpu time 1:07.45
system cpu time 1:20.09
36 The SAS System 03:09 Monday, July 17, 2017

memory 1059590.82k
OS Memory 1078904.00k
Timestamp 07/17/2017 05:19:47 AM
Step Count 80 Switch Count 2555

PROC Star
Posts: 736

Re: SAS Code Optimization

This is very hard to read, please post your code by clicking the little running man at the top of your menu, when you post Smiley Happy

Super User
Super User
Posts: 7,942

Re: SAS Code Optimization

Well, from the limited almost unreadble snippet you posted there, the first question would be why not do it in a datastep, this would likely be as quick if not quicker and far easier to read (and with that size of data its better to use datastep as SQL can be resource hungry):

data want;
  set tsync.transactions_sb_affinity_product
    (where=(store_number in (10052,10053,10058,10067,10086,10100,10112,10187,10249,10353,10526,10496,10514) 
and length(sfdc_member_id));
if store_number=10058 then store_number=10496;
...
run;

However at the end of the day that is a large amount of data and will take time however you do it.

Super User
Posts: 7,768

Re: SAS Code Optimization

15 minutes for ~200 million observations isn't that bad.

Your timer values show that your CPU time is around three minutes, while real time is about five times as much. This points to lots of I/O waits.

Is that a native SAS dataset, or are you accessing a remote DBMS?

If the latter, pull the source dataset to SAS first and run the query again on the local dataset.

If the former, you need faster/better storage.

Consider striping of disks, replacing spinning metal with SSDs, and so on.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Ask a Question
Discussion stats
  • 3 replies
  • 98 views
  • 0 likes
  • 4 in conversation