☑ This topic is solved.
Kastchei
Pyrite | Level 9

Hello!

 

I am using RSubmit to utilize all 20 of the CPU cores on my local machine.  I have a very large dataset, and I have broken it up into small chunks to run on each of the cores.  For a test, I'm just running the first 100 observations (obs = 100) from each chunk.  The data steps run quickly, ~1-4 seconds, across my 20 cores.

 

After the datastep is run, I run 6 proc contents.  These procs are run on datasets with a lot of variables (~300) but not a lot of rows (~50-100) in my test.

 

When I run these proc contents locally, not as part of the remote sessions, they run in a real time of ~1 second.  When I run the same proc contents steps in the remote sessions using RSubmit, they take anywhere from 5 to 35 seconds of real time.  Does anyone know why proc contents would take so long when submitted remotely?  I couldn't find any information (besides AI) about multi-threaded support in Proc Contents, so I am assuming that Proc Contents is not inherently multi-threaded; but if so, I'm not sure why another core running stuff would affect the run time of Proc Contents.  It's not a big deal for this code.  I can easily just wait to run them locally.  But it is perplexing why it's taking so long.

 

Here is the log for the first two Proc Contents run by Core 1.  The first proc contents is only 6 seconds.  I think this is because not all of the other remote sessions have been fired up, only some have started.  The second proc contents, however, is 24 seconds.  Core 2 is similar.

 

NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           6.15 seconds
      cpu time            0.10 seconds
      
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           24.25 seconds
      cpu time            0.29 seconds

With Core 3, that first Proc Contents jumps up to 19 seconds.

 

NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           19.10 seconds
      cpu time            0.21 seconds
      
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           21.97 seconds
      cpu time            0.29 seconds

The rest of the cores stay up around these times, getting a little longer each core.  Here's Core 12.

NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           24.16 seconds
      cpu time            0.29 seconds
      
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           25.49 seconds
      cpu time            0.17 seconds

And then the last Core 20.

NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           21.71 seconds
      cpu time            0.18 seconds
      
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           29.04 seconds
      cpu time            0.17 seconds

There is nothing substantially different about the different Cores.  They are all just running subsets of the same original dataset.  Some fluctuations seem normal among the Cores.  The puzzling part is the 10-30 fold jump in real time from local submission to RSubmit "remote" submission on the same machine.

 

For reference, here is a local example of the same proc contents that was run in Core 1 and then Core 20.  They run in less than a second.  This also gives the code.  You can see that both procs within a core are basically the same.  If anything, the second of the pair should run faster, since it drops numeric variables.

 

Core 1 equivalent locally.

proc contents data = corechnk.core_1_chunk_1_forreview
              out  = ___varsAll       (keep = memName name type varNum length) memtype = data noPrint;
run;

NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent.
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.92 seconds
      user cpu time       0.00 seconds
      system cpu time     0.12 seconds
      memory              1558.12k
      OS Memory           48296.00k
      Timestamp           11/07/2025 01:49:12 PM
      Step Count                        2513  Switch Count  0


proc contents data = corechnk.core_1_chunk_1_forreview           (drop = _numeric_)
              out  = ___varsToSqueeze (keep = memName name type varNum length) memtype = data noPrint;
run;

NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent.
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.89 seconds
      user cpu time       0.04 seconds
      system cpu time     0.18 seconds
      memory              1559.25k
      OS Memory           48296.00k
      Timestamp           11/07/2025 01:49:13 PM
      Step Count                        2514  Switch Count  0

Core 20 equivalent locally.

proc contents data = corechnk.core_20_chunk_1_forreview
              out  = ___varsAll       (keep = memName name type varNum length) memtype = data noPrint;
run;

NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent.
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.88 seconds
      user cpu time       0.03 seconds
      system cpu time     0.14 seconds
      memory              1284.25k
      OS Memory           47980.00k
      Timestamp           11/07/2025 01:49:13 PM
      Step Count                        2515  Switch Count  0


proc contents data = corechnk.core_20_chunk_1_forreview           (drop = _numeric_)
              out  = ___varsToSqueeze (keep = memName name type varNum length) memtype = data noPrint;
run;

NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent.
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.90 seconds
      user cpu time       0.01 seconds
      system cpu time     0.18 seconds
      memory              1250.84k
      OS Memory           47980.00k
      Timestamp           11/07/2025 01:49:14 PM
      Step Count                        2516  Switch Count  0

 

Kurt_Bremser
Super User

There's something seriously amiss with your SAS setup/computer.

Just for comparison:

 proc contents data=sashelp.baseball out=ba;
 run;
 
 NOTE: The data set WORK.BA has 24 observations and 41 variables.
 NOTE: PROCEDURE CONTENTS used (Total process time):
       real time           0.03 seconds
       user cpu time       0.03 seconds
       system cpu time     0.00 seconds

This is from SAS On Demand.

As you can see, PROC CONTENTS takes nearly no time at all, while even under "ideal" conditions it takes almost a full second (real time) on your site, much longer than the CPU time, so you have a lot of wait states there.

What computer/server do you run SAS on?

I suspect that your additional SAS processes cause paging on your computer, and with every context switch the whole process needs to be reloaded from disk cache, which would explain the excessive delay.

Kastchei
Pyrite | Level 9

Thanks!

 

It's SAS 9.4 TS1M6 on Windows 11 Professional.  PC SAS, not a server.  My take is the 1 second of time is just the time for SAS to pop open the Results window.  I haven't been too bothered by that in general while programming.  We did just upgrade from Windows 10 to Windows 11.  I don't notice much difference except when executing %sysExec commands; they used to be almost instantaneous, but now they take about a second each, which isn't really noticeable unless I'm processing a lot.

 

When you talk about disk cache, I assume you mean that SAS is running out of memory and so is writing things to disk instead?  I'll have to check my memory usage when I run something like this again.  I know it maxes out my CPU usage (obviously), but I hadn't noticed any memory bottlenecks.  My settings are:

 

options noCenter
compress = binary
cpuCount = actual
dmsSynChk
errorCheck = strict
noFmtErr
fullSTimer
iBufSize = max
lineSize = 256
mergeNoBy = error
msgLevel = i
pageSize = max
sortDup = logical
sortValidate
sortPgm = sas
syntaxCheck
threads
varLenChk = error
noWorkTerm;

 

Best regards,

Michael

SASKiwi
PROC Star

What you are doing on your PC suggests to me that your multiple SAS sessions are I/O bound, with the PROC CONTENTS processes waiting until your permanent storage I/O channels free up. I suspect your primary SAS session is first in the I/O queue, so it is not delayed as much. Try reducing your number of remote sessions: does that reduce the lag? Also, I hope you are using the WAIT = NO option on your RSUBMITs; otherwise they won't run in parallel.
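
For anyone following along, here is a minimal sketch of the parallel pattern being discussed. It assumes SAS/CONNECT is licensed and that `!sascmd` (which reuses the command that started the parent session) works on your install. The session names task1 and task2 and the two-second sleeps are made up purely for illustration.

```sas
/* Sketch only: task1 and task2 are hypothetical session names. */
options autosignon sascmd = "!sascmd";

/* WAIT = NO returns control immediately, so the second block
   starts while the first is still running. */
rsubmit task1 wait = no;
  data _null_;
    call sleep(2, 1);   /* stand-in for real work: sleep 2 seconds */
  run;
endrsubmit;

rsubmit task2 wait = no;
  data _null_;
    call sleep(2, 1);   /* stand-in for real work: sleep 2 seconds */
  run;
endrsubmit;

/* Block the parent session until both background tasks finish. */
waitfor _all_ task1 task2;
signoff _all_;
```

With WAIT = YES (the default), the second RSUBMIT would not start until the first had finished, so the two sleeps would run back to back rather than side by side.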

Kastchei
Pyrite | Level 9

Howdy!

 

Ah, that makes a lot of sense: while the other cores are writing stuff to disk, proc contents has to wait for a break to sneak in there.  Thanks so much.  I'll try fewer remote sessions to see at what point it starts to have an impact.

 

Yes, I'm using wait = no, so they are all completing together at about the same time after several hours.

Kastchei
Pyrite | Level 9

Hello!

 

It took me a little while to get back to this test.  To answer your question: when I run with fewer cores, the slowdown is smaller.  It appears to scale roughly linearly with the number of sessions: 1 core takes ~1 second, 2 cores ~2 seconds, 10 cores ~9-11 seconds.  After a bit more testing, I have narrowed down the cause.

 

memType = data

 

Adding the memType = data option to the proc contents statement seems to slow down the procedure, even in a local session.  In a remote session, the slowdown then gets multiplied by the number of sessions. 

 

This seems like an odd thing to cause a slowdown, but perhaps querying to see which names in the library are datasets instead of views takes time.  My solution is to omit this option.  The logs are below, if you are curious.

 

I ran these statements both locally and remotely.  All of the statements without memType run in well under a second (mostly hundredths of a second), and all those with memType take more than a second.

 

proc contents data = CoreChnk.Core_1_Chunk_1_ForReview                                                                                                    ;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview                    out = test1                                                                     ;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test1Drop                                                                 ;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview                    out = test2a     (keep = memName name type varNum length)                       ;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test2aDrop (keep = memName name type varNum length)                       ;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview                    out = test2b                                              memtype = data        ;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test2bDrop                                          memtype = data        ;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview                    out = test2c                                                             noPrint;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test2cDrop                                                         noPrint;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview                    out = test3a     (keep = memName name type varNum length) memtype = data        ;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test3aDrop (keep = memName name type varNum length) memtype = data        ;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview                    out = test3b     (keep = memName name type varNum length)                noPrint;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test3bDrop (keep = memName name type varNum length)                noPrint;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview                    out = test3c                                              memtype = data noPrint;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test3cDrop                                          memtype = data noPrint;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview                    out = test4      (keep = memName name type varNum length) memtype = data noPrint;  run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test4Drop  (keep = memName name type varNum length) memtype = data noPrint;  run;

 

Local submission tests.

 

test0         real time  0.06 seconds    user cpu time  0.03 seconds    system cpu time  0.01 seconds    memory  1134.00k    OS Memory  28972.00k
test1         real time  0.07 seconds    user cpu time  0.03 seconds    system cpu time  0.04 seconds    memory  1446.71k    OS Memory  28972.00k
test1Drop     real time  0.05 seconds    user cpu time  0.00 seconds    system cpu time  0.01 seconds    memory  1375.40k    OS Memory  28972.00k
test2a        real time  0.06 seconds    user cpu time  0.06 seconds    system cpu time  0.00 seconds    memory  1344.12k    OS Memory  28972.00k
test2aDrop    real time  0.04 seconds    user cpu time  0.03 seconds    system cpu time  0.01 seconds    memory  1339.50k    OS Memory  28972.00k
test2b        real time  1.06 seconds    user cpu time  0.17 seconds    system cpu time  0.12 seconds    memory  1850.34k    OS Memory  29232.00k
test2bDrop    real time  1.04 seconds    user cpu time  0.10 seconds    system cpu time  0.21 seconds    memory  1790.12k    OS Memory  29232.00k
test2c        real time  0.01 seconds    user cpu time  0.00 seconds    system cpu time  0.01 seconds    memory  1253.87k    OS Memory  28972.00k
test2cDrop    real time  0.01 seconds    user cpu time  0.01 seconds    system cpu time  0.00 seconds    memory  1183.03k    OS Memory  28972.00k
test3a        real time  1.04 seconds    user cpu time  0.07 seconds    system cpu time  0.21 seconds    memory  1740.50k    OS Memory  29232.00k
test3aDrop    real time  1.03 seconds    user cpu time  0.04 seconds    system cpu time  0.18 seconds    memory  1740.31k    OS Memory  29232.00k
test3b        real time  0.01 seconds    user cpu time  0.00 seconds    system cpu time  0.00 seconds    memory  1145.43k    OS Memory  28972.00k
test3bDrop    real time  0.01 seconds    user cpu time  0.01 seconds    system cpu time  0.00 seconds    memory  1147.31k    OS Memory  28972.00k
test3c        real time  1.02 seconds    user cpu time  0.00 seconds    system cpu time  0.12 seconds    memory  1664.15k    OS Memory  29232.00k
test3cDrop    real time  1.04 seconds    user cpu time  0.07 seconds    system cpu time  0.12 seconds    memory  1593.34k    OS Memory  29232.00k
test4         real time  1.07 seconds    user cpu time  0.06 seconds    system cpu time  0.12 seconds    memory  1542.18k    OS Memory  29232.00k
test4Drop     real time  1.04 seconds    user cpu time  0.06 seconds    system cpu time  0.12 seconds    memory  1507.18k    OS Memory  29232.00k

Remote submission 1 of the 20-remote-submission run.

 

 

test0         real time   0.59 seconds    user cpu time  0.03 seconds    system cpu time  0.00 seconds    memory  11439.18k    OS Memory  21028.00k
test1         real time   0.08 seconds    user cpu time  0.00 seconds    system cpu time  0.00 seconds    memory   9563.59k    OS Memory  20772.00k
test1Drop     real time   0.11 seconds    user cpu time  0.00 seconds    system cpu time  0.01 seconds    memory   9554.00k    OS Memory  20516.00k
test2a        real time   0.07 seconds    user cpu time  0.00 seconds    system cpu time  0.01 seconds    memory   9499.09k    OS Memory  20260.00k
test2aDrop    real time   0.07 seconds    user cpu time  0.01 seconds    system cpu time  0.00 seconds    memory   9500.78k    OS Memory  20260.00k
test2b        real time  21.13 seconds    user cpu time  0.06 seconds    system cpu time  0.14 seconds    memory   9966.53k    OS Memory  20776.00k
test2bDrop    real time  23.15 seconds    user cpu time  0.01 seconds    system cpu time  0.26 seconds    memory   9939.46k    OS Memory  20776.00k
test2c        real time   0.31 seconds    user cpu time  0.00 seconds    system cpu time  0.00 seconds    memory   9378.31k    OS Memory  20516.00k
test2cDrop    real time   0.08 seconds    user cpu time  0.01 seconds    system cpu time  0.00 seconds    memory   9378.50k    OS Memory  20516.00k
test3a        real time  25.20 seconds    user cpu time  0.03 seconds    system cpu time  0.31 seconds    memory   9872.37k    OS Memory  20776.00k
test3aDrop    real time  24.33 seconds    user cpu time  0.04 seconds    system cpu time  0.31 seconds    memory   9873.18k    OS Memory  20776.00k
test3b        real time   0.62 seconds    user cpu time  0.01 seconds    system cpu time  0.00 seconds    memory   9326.59k    OS Memory  20516.00k
test3bDrop    real time   0.08 seconds    user cpu time  0.00 seconds    system cpu time  0.00 seconds    memory   9326.21k    OS Memory  20516.00k
test3c        real time  27.00 seconds    user cpu time  0.06 seconds    system cpu time  0.25 seconds    memory   9752.56k    OS Memory  20776.00k
test3cDrop    real time  24.23 seconds    user cpu time  0.01 seconds    system cpu time  0.25 seconds    memory   9788.90k    OS Memory  20776.00k
test4         real time  22.64 seconds    user cpu time  0.00 seconds    system cpu time  0.31 seconds    memory   9687.06k    OS Memory  20776.00k
test4Drop     real time  21.89 seconds    user cpu time  0.03 seconds    system cpu time  0.21 seconds    memory   9686.68k    OS Memory  20776.00k

 

Quentin
Super User

Looks like you've got a solution.  I suggest you mark your answer as correct / accepted.

 

I'm surprised that memtype=data would slow things down that much, but then I've never used that option on PROC CONTENTS.  Looks like it tells SAS to pull some metadata from every dataset that exists in the library.

Tom
Super User

There is no reason to include that option when you have asked for a specific member.   You cannot have both a VIEW and a DATASET in one library that have the same member name.

 

The purpose of MEMTYPE= option would be to limit which members it checks when you use _ALL_ as the member name. 
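
To illustrate the distinction, here is a small sketch using SASHELP, which ships with SAS and contains both data sets and views. The output data set names work.varsOne and work.varsLib are just placeholders.

```sas
/* One named member: MEMTYPE = is unnecessary, so leave it off. */
proc contents data = sashelp.class
              out  = work.varsOne (keep = memName name type varNum length)
              noPrint;
run;

/* Every member of the library: here MEMTYPE = DATA does real work,
   restricting the listing to data sets and skipping views. */
proc contents data    = sashelp._all_
              memtype = data
              out     = work.varsLib (keep = memName name type varNum length)
              noPrint;
run;
```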

Kastchei
Pyrite | Level 9
Great. I figured as much. I was using a macro that someone else had written, so I copied and edited it to remove that option.
Quentin
Super User

So you've got SAS 9 running on your local PC, and you're using RSUBMIT to submit chunks of code to different cores on your local PC?  So there's no network or remote server involved at all?

Can you post a little example where you run PROC CONTENTS in your main session and then run it again in a child session and see the difference in execution times?

 

I haven't used RSUBMIT to spawn local sessions before.  Is the code something like:

 

options autosignon sascmd="C:\Program Files\SASHome\SASFoundation\9.4\sas.exe" ;
rsubmit foo wait=no;
options fullstimer ;
data foo ;
  array x{300} ;
run ;
proc contents data=foo ;
run ;
endrsubmit ;
signoff ;

?

 

When I run that, the PROC CONTENTS step completes in 0.04 seconds:

 

1    options autosignon sascmd="C:\Program Files\SASHome\SASFoundation\9.4\sas.exe" ;
2    rsubmit foo wait=no;
NOTE: Remote signon to FOO commencing (SAS Release 9.04.01M7P080520).
NOTE: Remote signon to FOO complete.
NOTE: Background remote submit to FOO in progress.
3    signoff ;
NOTE: Remote submit to FOO commencing.
1    options fullstimer ;
2    data foo ;
3      array x{300} ;
4    run ;

NOTE: The data set WORK.FOO has 1 observations and 300 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              1804.71k
      OS Memory           8660.00k
      Timestamp           11/10/2025 05:06:49 PM
      Step Count                        1  Switch Count  0


5    proc contents data=foo ;
6    run ;

NOTE: Non-portable document will be produced. The current settings of FORMCHAR use nonstandard line-drawing characters and the
      resulting output file will not render correctly unless all readers of the document have the SAS Monospace font installed.
      To make your document portable, issue the following command:
      OPTIONS FORMCHAR="|----|+|---+=|-/\<>*";

NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.04 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              3294.53k
      OS Memory           10964.00k
      Timestamp           11/10/2025 05:06:49 PM
      Step Count                        2  Switch Count  1

NOTE: The PROCEDURE CONTENTS printed pages 1-6.

NOTE: Remote submit to FOO complete.
NOTE: Remote signoff from FOO commencing.
NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
NOTE: The SAS System used:
      real time           0.94 seconds
      user cpu time       0.01 seconds
      system cpu time     0.01 seconds
      memory              10381.56k
      OS Memory           10964.00k
      Timestamp           11/10/2025 05:06:49 PM
      Step Count                        2  Switch Count  45

NOTE: Remote signoff from FOO complete.

 

I'm still struggling to imagine how PROC CONTENTS could take a full second to execute, since all it does is read dataset metadata.  Is it just PROC CONTENTS that runs surprisingly slow, or do other PROCS  run slow?  How about DATA steps?

 

I wouldn't switch to dictionary tables.  In my experience PROC CONTENTS is usually faster than dictionary tables.

Kastchei
Pyrite | Level 9

Yes, you have the syntax.  Each new locally "remote" SAS session winds up using a different CPU core on your computer.  What I was working on probably would have taken a few days to run.  Using all 20 of my cores, it took ~5 hours, which isn't 1/20th of the time, but still a lot faster.  It can now run overnight as opposed to all weekend.  You have to split your dataset up into chunks so that each core can run on a different part of the dataset.  Of course, you can't have code that requires a copy of the full dataset (e.g. no retain, lag(), etc.).

 

Here's my code.  I have 20 cores and 23 chunks per core.  That way, in case my computer crashes, I will have at least some of it complete and saved permanently already.  (The comments are for me; you probably already know the stuff they say.)

 

 

* Split text data up into separate datasets so that multiple SAS sessions can be employed to increase CPU usage.  Otherwise, SAS only uses one core out of 20, and this would take a long time to run.   *;;
* While we can only use 20 cores at once, we can still break up the data into smaller chunks and run them sequentially on each core, while the sequential series of chunks are run in parallel.  This     *;
* way, if something happens to the process, we will have some results and will not need to start from the beginning.                                                                            *;

* Determine the rows in the input data.  This only works if saved as a SAS dataset.  It does not work on the zipped view. *;;
data _null_;
    * This set statement never executes, so no records are read, but the header information about the set dataset is read in. *;
    if 0
        then set _3_NonNullWithLetters nObs = n;
    call symputx('nRows',n);
    stop;  * Stops infinite loop from never reaching the end of a file. *;
run;  * nRows = 45980013 *;
%put nRows = &nRows;

%let nObsPerChunk   = 100000;
%let nChunks        = %sysFunc(ceil(&nRows/&nObsPerChunk));
%let nCores         = 20;
%let nChunksPerCore = %sysFunc(ceil(&nChunks/&nCores));
%put nObs = &nRows nObsPerChunk = *&nObsPerChunk* nChunks = &nChunks nCores = *&nCores* nChunksPerCore = &nChunksPerCore;

* Create the chunks!  Split the data into separate datasets so that each core can process different data simultaneously. *;;
%macro chunks;
    data %do chunkA = 1 %to &nChunksPerCore;
             %do coreA = 1 %to &nCores;
                 CoreChnk.Core_&coreA._Chunk_&chunkA (compress = binary)
             %end;
         %end;;
        set Michael._3_NonNullWithLetters;

        * We want to fill up the first chunk of each core first, then the second, etc. so that they all have as equal number of chunks as possible as *;
        * opposed to the first cores-1 having the total number of chunks, and the last core having maybe as few as 1.                                 *;
        select;
            %do chunkB = 1 %to &nChunksPerCore;
                %do coreB = 1 %to &nCores;
                    %let chunkIndex = %sysEvalF(&nCores*(&chunkB - 1) + &coreB);
                    when (%sysEvalF(&nObsPerChunk*(&chunkIndex - 1)) < _n_ <= %sysEvalF(&nObsPerChunk*&chunkIndex))  output CoreChnk.Core_&coreB._Chunk_&chunkB;
                %end;
            %end;
        end;
    run;


    * Fix capitalization in dataset names and set to read-only. *;
    options noXWait noXSync;
    %local xString;
    %do coreC = 1 %to &nCores;
        %let xString = cd "%sysFunc(pathName(CoreChnk))\";
        %do chunkC = 1 %to &nChunksPerCore;
            %let xString = %sysFunc(compBl(&xString & attrib -R "Core_&coreC._Chunk_&chunkC..sas7bdat" ));
        %end;
        %sysExec &xString;
    %end;
    %sleep(20);
    %do coreC = 1 %to &nCores;
        %let xString = cd "%sysFunc(pathName(CoreChnk))\";
        %do chunkC = 1 %to &nChunksPerCore;
            %let xString = %sysFunc(compBl(&xString & rename "Core_&coreC._Chunk_&chunkC..sas7bdat" "Core_&coreC._Chunk_&chunkC..sas7bdat"));
        %end;
        %sysExec &xString;
    %end;
    %sleep(10);
    %do coreC = 1 %to &nCores;
        %let xString = cd "%sysFunc(pathName(CoreChnk))\";
        %do chunkC = 1 %to &nChunksPerCore;
            %let xString = %sysFunc(compBl(&xString & attrib +R "Core_&coreC._Chunk_&chunkC..sas7bdat"));
        %end;
        %sysExec &xString;
    %end;
    options noXWait xSync;
%mEnd chunks;

options mPrint;
%chunks;
options noMPrint;

* Macro to run all the cores and chunks. *;
%macro coreChunk;
    options sasCmd = "sas";

    * Current datetime. *;
    %let ___startDT = %sysFunc(datetime());

    * Delete the results from previous runs. *;
    proc datasets library = CoreChnk noList;
        delete
            %do core = 1 %to &nCores;
                %do chunk = 1 %to &nChunksPerCore;
                    Core_&core._Chunk_&chunk._ForReview
                    Core_&core._Chunk_&chunk._NoReview
                    Core_&core._Chunk_&chunk._ForReviewDetail
                %end;
            %end;
        ;
    quit;

    %do core = 1 %to &nCores;
        signOn core&core;

        * Pass macro variables to the remote session. *;
        %sysLPut core = &core / remote = core&core;
        %sysLPut nChunksPerCore = &nChunksPerCore / remote = core&core;

        * Pass the libName to the remote session, and submit code to the remote session. *;
        rSubmit core&core wait = no inheritLib = (CoreChnk);
            * Redirect the log to an external file *;
            proc printTo log = "K:\DAplay\Michael\ECS\MGUS Text Search\Core Chunk/Core_&core..log" new;
            run;

            options fullSTimer;

            * Must define the macros within the remote session. *;
            %include 'K:\Michael\Search all text fields everywhere - Match Macro.sas' / lRecL = 4096;
            %include 'K:\Michael\Search all text fields everywhere - Chunk Macro.sas' / lRecL = 4096;
            %include 'K:\Support Macros\squeeze_1.sas' / lRecL = 4096;

            %chunk;

            proc printTo log = log;
            run;

/*            %sysRPut ___startDT_&core = &___startDT;*/
        endRSubmit;
    %end;

    waitfor _all_;
    signoff _all_;

    * Print total duration. *;
    data _null_;
        dur = datetime() - &___startDT;
        put 30*'-' / ' TOTAL DURATION:' dur time13.2 / 30*'-';
    run;
%mEnd coreChunk;

%coreChunk;
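As a quick check of the chunk arithmetic in the code above, the following sketch recomputes the counts from the row count noted in the comment (nRows = 45980013). It only repeats the %let statements with that value hard-coded, so it can be run on its own without the source dataset.

```sas
/* Row count taken from the comment in the posted code; hard-coded here for illustration. */
%let nRows          = 45980013;
%let nObsPerChunk   = 100000;
%let nCores         = 20;

/* Same formulas as in the posted code. */
%let nChunks        = %sysFunc(ceil(&nRows/&nObsPerChunk));
%let nChunksPerCore = %sysFunc(ceil(&nChunks/&nCores));

/* Writes: nChunks = 460 nChunksPerCore = 23 */
%put nChunks = &nChunks nChunksPerCore = &nChunksPerCore;
```

With these values, 20 cores times 23 chunks gives exactly 460 output datasets, so every dataset named on the DATA statement receives rows. For other row counts the last few datasets could be created empty.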

 

 

Kurt_Bremser
Super User

It's not SAS that runs out of memory, it's the system. Which then pages out momentarily unused processes to disk, and that is noticeably slower on Windows than on other systems (Windows uses a file in the filesystem, UNIX a fixed area on a raw device). And these processes need to be reloaded from disk as soon as they need to become active again. So you never want this to happen, as it always results in a massive performance penalty.

Also, do not consider the reported number of "cores" as the baseline for setting up multi-processing. With hyperthreading-capable CPUs you get more virtual cores, but these need to be utilized by software designed for those. You need to consider number of threads, not number of processes.
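One way to see what SAS itself reports before choosing a session count is sketched below. It only prints the current settings. Whether the reported number reflects physical cores or hyperthreaded logical processors depends on the machine, so treat it as a starting point rather than a target.

```sas
/* Show the CPU count SAS is configured to use and whether threading is enabled. */
proc options option = (cpucount threads);
run;

/* The same CPU count as macro text, e.g. for use in a %let statement. */
%put CPUCOUNT = %sysFunc(getOption(cpucount));

/* Automatic macro variable with the number of processors available to SAS. */
%put SYSNCPU = &sysncpu;
```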

AhmedAl_Attar
Ammonite | Level 13

Hi @Kastchei 

Regardless of the size of your data set, most of the information provided by Proc Contents is already available in the SAS-supplied data dictionary tables/views!

 

Proc contents displays the metadata (data dictionary) of a SAS dataset, including variable names, types, lengths, and attributes like labels and formats. It is also used to view the structure of a library or a specific table, showing information such as the number of observations, variables, and when the dataset was created. This makes it a valuable tool for understanding a dataset, verifying that data was imported correctly, and performing more intelligent data processing.

 

Guess what, as soon as you assign a libname that contains your SAS data set(s), SAS behind the scenes gathers all the metadata about your SAS data sets.

Check this paper Exploring DICTIONARY Tables and Views for more details.
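A minimal sketch of the two approaches side by side, assuming a small test dataset WORK.FOO like the one used earlier in this thread (the output dataset names are hypothetical). With FULLSTIMER on, the log shows the time for each step, so anyone can compare them in their own environment.

```sas
options fullstimer;

/* Test dataset: one row, 300 numeric variables. */
data work.foo;
    array x{300};
run;

/* Approach 1: PROC CONTENTS writing the variable attributes to a dataset. */
/* In this output, TYPE is numeric (1 = numeric, 2 = character).           */
proc contents data = work.foo
              out  = work.vars_contents (keep = name type length varnum)
              noprint;
run;

/* Approach 2: the same attributes from DICTIONARY.COLUMNS.        */
/* LIBNAME and MEMNAME are stored in upper case; TYPE is 'num' or  */
/* 'char' here, so the two outputs are not directly interchangeable. */
proc sql;
    create table work.vars_dict as
    select name, type, length, varnum
    from dictionary.columns
    where libname = 'WORK' and memname = 'FOO';
quit;
```

Keeping the WHERE clause on LIBNAME and MEMNAME matters, since without it the query has to look at every assigned library.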

 

Note: Running proc contents does not require splitting your data set, regardless of how big it is.

 

Hope this helps

Kastchei
Pyrite | Level 9
Thanks, Ahmed! Yeah, I suppose I could switch to dictionary.columns to see if it runs any quicker. I also can probably tweak my autocall macro for this project to not need the proc contents in every remote session, since that information isn't changing. I could just do it once and hard code the results.
Kurt_Bremser
Super User

Also keep in mind that all PROC CONTENTS needs to do is read the first page of the dataset, where all metadata is stored. And the resulting dataset will also rarely be larger than one page (unless you try to compress it, which is counterproductive, as can be seen in your log). So anything beyond 0.1 seconds of CPU and real time points to some issue with your environment, or congestion on your server.
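If the COMPRESS system option has to stay on for the larger datasets, a sketch of switching it off only for the small PROC CONTENTS output could look like this. The output name matches the one in the posted log; the input dataset name is a placeholder.

```sas
/* COMPRESS=NO as a dataset option overrides the COMPRESS system option */
/* for this one output dataset only.                                    */
proc contents data = work.myInput   /* placeholder input dataset */
              out  = work.___varsAll (compress = no)
              noprint;
run;
```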
