ChrisNZ
Tourmaline | Level 20

1. All the comments made before are entirely valid.

The most valid one is: show us more.

If you call a garage and say that your car runs poorly without giving more details, the chances of a useful diagnosis are zero.

 

2. You are showing 2 steps here. How long do they take?
To repeat the message you seem to stubbornly ignore: where's the log?

 

3. This code hardly does anything. Show us the log please. It looks like you must have slow disks if this simple logic takes time.

 

4. If you have slow disks, large (and small) tables are best stored in SPDE format with binary compression turned on. This reduces disk access and also allows on-the-fly sorting, which is rather efficient.
Since you mention sorting as an issue (but then, for some reason, show us a data step as an example of a slow step), this could be a double win.

 

5. If you have data steps doing so little processing, it is possible (and common) that the code multiplies time-consuming baby steps instead of having fewer, smarter steps.

 

6. How long does the proc datasets take? If you have slow libraries (read RDBMS, or very sadly SPDE), this deletion could take a while for no good reason. Replace the proc datasets step with:

proc delete data=AY; run;
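
To make points 4 and 6 concrete, here is a minimal sketch rather than a tuned setup. The libref FAST and the output table name are made up for illustration, SASHELP.CLASS stands in for a real table, and reusing the WORK directory as the SPDE location is only an assumption; binary compression is requested here as a data set option.

```sas
/* Assumption: reuse the WORK directory as the SPDE location;  */
/* any directory on reasonably fast disks would do.            */
libname fast spde "%sysfunc(pathname(WORK))";

/* Store a copy with binary compression (data set option). */
data fast.class (compress=binary);
  set sashelp.class;
run;

/* SASHELP.CLASS is not sorted by SEX. Because FAST is an SPDE */
/* library, the BY statement triggers the engine's implicit    */
/* sort, so no separate PROC SORT step is needed.              */
data work.last_per_sex;
  set fast.class;
  by sex;
  if last.sex;
run;

/* Drop the table without going through PROC DATASETS. */
proc delete data=fast.class;
run;
```

Whether this helps on a given server depends on the disks and the data, so compare the real time in the log before and after.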

 

JJP1
Pyrite | Level 9

Hi @ChrisNZ,

Previously I thought the problem was proc sort, without really looking at the full code.

Then I followed what @kurt suggested and found that the pasted data step takes more time than the other sections of the code. Thanks for the practical advice, which I can easily understand.

 

The log has PII info. How can I share it? How can I mask it? The log is very big, please advise.

No RDBMS, pure SAS tables.

Actually I can share the whole code, but it has PII info, and I think masking it manually will take a lot of time. Please suggest any options you have so that I can share the log along with the code and get the best help from you.

Ksharp
Super User

As ChrisNZ said, try system options and the SPDE engine (which has some drawbacks).

1)
options bufno=100 bufsize=128k compress=yes threads cpucount=4 ;


2)
libname x spde 'c:\temp';
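
Before changing these settings, it may be worth checking what the session currently uses. A small sketch, assuming a SAS 9.4 session where PROC OPTIONS accepts a list of option names:

```sas
/* Write the current values of the options mentioned above to the log. */
proc options option=(bufno bufsize compress threads cpucount);
run;
```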

Kurt_Bremser
Super User

If your data step throws lots of messages to the log, this can be part of the problem. If that is the case, sanitize your data step(s) so that only the essential NOTEs are displayed:

 NOTE: There were 19 observations read from the data set SASHELP.CLASS.
 NOTE: The data set WORK.CLASS has 19 observations and 5 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       cpu time            0.00 seconds

A data step that comes back with more than these needs to be fixed. No "invalid data", "numeric converted to character", "character converted to numeric", "missing values because of ..." allowed.

 

If in doubt, show us the summary lines of your long-running data step; these should not contain critical information (you may want to mask the dataset names).
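
As an illustration of what a sanitized step can look like, the sketch below converts between numeric and character explicitly with PUT and INPUT, so the log shows no conversion NOTEs. The variable names age_c and age_n and the output table are made up for the example; SASHELP.CLASS is used only because it ships with SAS.

```sas
data work.class_clean;
  set sashelp.class (keep=name age);
  length age_c $3;
  /* Explicit numeric -> character: no "converted to character" NOTE. */
  age_c = put(age, 3.);
  /* Explicit character -> numeric: no "converted to numeric" NOTE.   */
  age_n = input(age_c, 3.);
run;
```

Writing `age_n = age_c;` instead would still run, but it would add the "Character values have been converted to numeric values" NOTE to the log.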

JJP1
Pyrite | Level 9

Thanks @Kurt_Bremser. Please have a look at the code snippet and log below. The run started at 02:49:17 AM and ended at 07:04:57 AM.

 

data ytregwve (drop=uhndshgdyhrg werrwerrtyu);
  set qaszxdc;
  by gggggg rererrt trwevds;

  /* Copy the value into the target variable that matches the code */
  if uhndshgdyhrg      = 'TESTETGED' then AAA      = werrwerrtyu;
  else if uhndshgdyhrg = 'ERHWGRB'   then fdgfgd   = werrwerrtyu;
  else if uhndshgdyhrg = 'TGBE'      then clcpc    = werrwerrtyu;
  else if uhndshgdyhrg = 'YHHGE'     then lalal    = werrwerrtyu;
  else if uhndshgdyhrg = 'TGBV'      then ewrwe    = werrwerrtyu;
  else if uhndshgdyhrg = 'YUHR'      then clexgrat = werrwerrtyu;
  else if uhndshgdyhrg = 'EDRF'      then clmanxs  = werrwerrtyu;
  else if uhndshgdyhrg = 'YUJF'      then dftg     = werrwerrtyu;
  else if uhndshgdyhrg = 'UIWER'     then trtrtr   = werrwerrtyu;
  else if uhndshgdyhrg = 'UIKJE'     then fdffd    = werrwerrtyu;
  else if uhndshgdyhrg = 'IKERDF' and xxx in ('YHHGE' 'YUYEW' 'UHYER' 'UHYT') then do;
    if rterttjki = '1' then ewr = 'SD';
    else ewr = 'WE';
  end;

  /* Carry the target variables across the rows of a BY group */
  retain AAA lalal fdgfgd clcpc trtrtr fdffd
         ewrwe clexgrat clmanxs dftg 0 ewr " ";

  /* Write one row per BY group, then reset the targets */
  if last.trwevds then do;
    output;
    AAA      = 0;
    fdgfgd   = 0;
    clcpc    = 0;
    lalal    = 0;
    ewrwe    = 0;
    clexgrat = 0;
    clmanxs  = 0;
    dftg     = 0;
    trtrtr   = 0;
    fdffd    = 0;
    ewr      = " ";
  end;
run;
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
      1672:14   
NOTE: There were 181103546 observations read from the data set WORK.qaszxdc.
NOTE: The data set WORK.ytregwve has 25969878 observations and 67 variables.
NOTE: Compressing data set WORK.ytregwve decreased size by 35.41 percent. 
      Compressed is 335475 pages; un-compressed would require 519399 pages.
NOTE: DATA statement used (Total process time):
      real time           24:39.42
      user cpu time       2:24.38
      system cpu time     1:03.03
      memory              3948.53k
      OS Memory           13348.00k
      Timestamp           01/06/2020 04:47:32 o'clock
      Page Faults                       675259
      Page Reclaims                     124349
      Page Swaps                        0
      Voluntary Context Switches        1887
      Involuntary Context Switches      57128
      Block Input Operations            0
      Block Output Operations           0
      

                                                             Directory

                            Libref             WORK                                                    
                            Engine             V9                                                      
105                                                        The SAS System                                 02:36 Monday, June 1, 2020

                                                             Directory

                            Physical Name      /home/healthcare/work2/SAS_workCBF90082018E_uknwsaviv764
                            Filename           /home/healthcare/work2/SAS_workCBF90082018E_uknwsaviv764
                            Inode Number       487424                                                  
                            Access Permission  rwxrwxrwx                                               
                            Owner Name         sasleg                                                  
                            File Size (bytes)  4096                                                    


                                                Member
                                  #  Name       Type       File Size  Last Modified

                                  1  qaszxdc    DATA     53220147200  01-Jun-20 04:22:42         
                                  2  ytregwve  DATA      8244641792  01-Jun-20 04:47:32         

 

 

ChrisNZ
Tourmaline | Level 20

First test. Replace:

data ytregwve (drop=uhndshgdyhrg werrwerrtyu);
  set qaszxdc;

with:

%let wdir=%sysfunc(pathname(WORK));
libname W spde "&wdir" partsize=100g compress=binary;
data W.ytregwve (drop=uhndshgdyhrg werrwerrtyu);
  set qaszxdc(bufno=100);

and tell us the results.

 

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
      1672:14   

What's on line  1672?

 

Kurt_Bremser
Super User

I see that you have real time of 24 minutes, but CPU time of only 3.5 minutes. This points to either insufficient (in terms of throughput and access times) or congested storage for your WORK location. Or there simply were so many jobs running at the same time that you only got a small piece of the server for your job.

 

Get in touch with your server admins to check out what is actually happening (too many jobs vs. lots of wait states).
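
To put numbers on this, the figures from the posted log and directory listing can be run through a quick calculation. This is only a rough sketch: it assumes the whole input table was read from disk once and the compressed output was written once, and it ignores caching and any other activity on the WORK disks.

```sas
data _null_;
  /* Times from the log, converted to seconds */
  real_s = 24*60 + 39.42;                      /* real time 24:39.42 */
  cpu_s  = (2*60 + 24.38) + (1*60 + 3.03);     /* user + system CPU  */
  /* File sizes from the directory listing, in bytes */
  bytes  = 53220147200 + 8244641792;           /* input + output     */
  cpu_share = cpu_s / real_s;
  mb_per_s  = bytes / real_s / 1e6;
  put cpu_share= percent8.1 mb_per_s= 8.1;
run;
```

By hand this comes to about 207 CPU seconds out of about 1479 elapsed seconds, roughly 14%, and somewhere around 40 MB per second of combined read and write traffic, which is consistent with the step spending most of its time waiting on storage.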

JJP1
Pyrite | Level 9

Thanks all for pointing me in the right direction.

I learnt many things from all of you. Thanks.

 

ChrisNZ
Tourmaline | Level 20

Please report your results.

What have you tried and what results did you get?

RichardDeVen
Barite | Level 11

When wetware is insufficient, it is sometimes more cost-effective and timely to improve the hardware: more RAM, more SSD, more CPU, more GHz.

smantha
Lapis Lazuli | Level 10

Regarding the performance of the step that was presented, there are three places where the code is slow.

1. IF conditions: using arrays addresses this to some extent, but not completely, because the KK scenario is more complex.

2. first. and last. processing is a tad slower. See if proc means/summary can be used to aggregate the new fields created within the BY group, along with the new character fields.

3. proc datasets or proc delete might be time consuming. To empty the data set AY, you could use a silly trick like

data AY;

a=.;

run;

This overwrites AY with a one-row, one-variable data set, which discards all the data previously in AY.

Hope this helps
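
For point 1, an array-based version could look roughly like the sketch below. It runs on a small made-up table (WORK.LONG with id, code and value), not on the original data, and it only covers the simple "code decides the target variable" branches, not the more complex character case. It mirrors the original step's RETAIN and reset-to-zero logic.

```sas
/* Made-up sample data: one row per id and code */
data work.long;
  input id $ code $ value;
  datalines;
A X 1
A Y 2
B X 3
B Z 4
;
run;

data work.wide (drop=code value i);
  set work.long;
  by id;
  /* Target variables, and the code that maps to each position */
  array tgt   {3} x y z;
  array codes {3} $ 8 _temporary_ ('X' 'Y' 'Z');
  retain x y z 0;
  /* One lookup replaces the IF / ELSE IF chain */
  i = whichc(code, of codes{*});
  if i > 0 then tgt{i} = value;
  /* One row per BY group, then reset the targets to 0 */
  if last.id then do;
    output;
    do i = 1 to dim(tgt);
      tgt{i} = 0;
    end;
  end;
run;
```

With the sample rows this should give one row per id (A: x=1, y=2, z=0 and B: x=3, y=0, z=4). Given the CPU figures in the log, do not expect this to change the elapsed time much; it mainly shortens the code.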

   
