Hello!
I am using RSubmit to utilize all 20 of the CPU cores on my local machine. I have a very large dataset, and I have broken it up into small chunks to run on each of the cores. For a test, I'm just running the first 100 observations (obs = 100) from each chunk. The data steps run quickly, ~1-4 seconds, across my 20 cores.
After the datastep is run, I run 6 proc contents. These procs are run on datasets with a lot of variables (~300) but not a lot of rows (~50-100) in my test.
When I run these proc contents locally, not part of the remote sessions, they run in a real time of ~1 second. When I run the same proc contents steps in the remote sessions using RSubmit, they take anywhere from 5 to 35 seconds of real time. Does anyone know of a reason why proc contents would take so long when submitted remotely? I couldn't find any information (besides AI) about multi-threaded support in Proc Contents, so I am assuming Proc Contents is not inherently multi-threaded; but if so, I'm not sure why another core running stuff would affect the run time of Proc Contents. It's not a big deal for this code. I can easily just wait to run them locally. But it is perplexing why it's taking so long.
Here is the log for the first two Proc Contents run by Core 1. The first proc contents is only 6 seconds. I think this is because not all of the other remote sessions have been fired up, only some have started. The second proc contents, however, is 24 seconds. Core 2 is similar.
NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 6.15 seconds
cpu time 0.10 seconds
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 24.25 seconds
cpu time 0.29 seconds
With Core 3, that first Proc Contents jumps up to 19 seconds.
NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 19.10 seconds
cpu time 0.21 seconds
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 21.97 seconds
cpu time 0.29 seconds
The rest of the cores stay up around these times, getting a little longer each core. Here's Core 12.
NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 24.16 seconds
cpu time 0.29 seconds
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 25.49 seconds
cpu time 0.17 seconds
And then the last Core 20.
NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 21.71 seconds
cpu time 0.18 seconds
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 29.04 seconds
cpu time 0.17 seconds
There is nothing substantially different about the different Cores. They are all just running subsets of the same original dataset. Some fluctuations seem normal among the Cores. The puzzling part is the 10-30 fold jump in real time from local submission to RSubmit "remote" submission on the same machine.
For reference, here is a local example of the same proc contents that was run in Core 1 and then Core 20. They run in less than a second. This also gives the code. You can see that both procs within a core are basically the same. If anything, the second of the pair should run faster, since it drops numeric variables.
Core 1 equivalent locally.
32317 proc contents data = corechnk.core_1_chunk_1_forreview
32318 out = ___varsAll (keep = memName name type varNum length) memtype = data noPrint;
32319 run;
NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.92 seconds
user cpu time 0.00 seconds
system cpu time 0.12 seconds
memory 1558.12k
OS Memory 48296.00k
Timestamp 11/07/2025 01:49:12 PM
Step Count 2513 Switch Count 0
32320 proc contents data = corechnk.core_1_chunk_1_forreview (drop = _numeric_)
32321 out = ___varsToSqueeze (keep = memName name type varNum length) memtype = data noPrint;
32322 run;
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.89 seconds
user cpu time 0.04 seconds
system cpu time 0.18 seconds
memory 1559.25k
OS Memory 48296.00k
Timestamp 11/07/2025 01:49:13 PM
Step Count 2514 Switch Count 0
Core 20 equivalent locally.
32323 proc contents data = corechnk.core_20_chunk_1_forreview
32324 out = ___varsAll (keep = memName name type varNum length) memtype = data noPrint;
32325 run;
NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.88 seconds
user cpu time 0.03 seconds
system cpu time 0.14 seconds
memory 1284.25k
OS Memory 47980.00k
Timestamp 11/07/2025 01:49:13 PM
Step Count 2515 Switch Count 0
32326 proc contents data = corechnk.core_20_chunk_1_forreview (drop = _numeric_)
32327 out = ___varsToSqueeze (keep = memName name type varNum length) memtype = data noPrint;
32328 run;
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.90 seconds
user cpu time 0.01 seconds
system cpu time 0.18 seconds
memory 1250.84k
OS Memory 47980.00k
Timestamp 11/07/2025 01:49:14 PM
Step Count 2516 Switch Count 0
There's something seriously amiss with your SAS setup/computer.
Just for comparison:
69 proc contents data=sashelp.baseball out=ba;
70 run;
NOTE: The data set WORK.BA has 24 observations and 41 variables.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.03 seconds
user cpu time 0.03 seconds
system cpu time 0.00 seconds
This is from SAS On Demand.
As you can see, PROC CONTENTS takes nearly no time at all there, while at your site it takes almost a full second (real time) even under "ideal" conditions, much longer than the CPU time, so you have a lot of wait states.
What computer/server do you run SAS on?
I suspect that your additional SAS processes cause paging on your computer, and with every context switch the whole process needs to be reloaded from disk cache, which would explain the excessive delay.
Thanks!
It's SAS 9.4 TS1M6 on Windows 11 Professional. PC SAS, not a server. My take is the 1 second of time is just the time for SAS to pop open the Results window. I haven't been too bothered by that in general while programming. We did just upgrade from Windows 10 to Windows 11. I don't notice much difference except when executing %sysExec commands; they used to be almost instantaneous, but now they take about a second each, which isn't really noticeable unless I'm processing a lot.
When you talk about disk cache, I assume you mean that SAS is running out of memory so writing things to disk instead? I'll have to check out my memory when I run something like this again. I know it maxes out my CPU usage (obviously), but I hadn't noticed any memory bottlenecks. My settings are
options noCenter
compress = binary
cpuCount = actual
dmsSynChk
errorCheck = strict
noFmtErr
fullSTimer
iBufSize = max
lineSize = 256
mergeNoBy = error
msgLevel = i
pageSize = max
sortDup = logical
sortValidate
sortPgm = sas
syntaxCheck
threads
varLenChk = error
noWorkTerm;
Best regards,
Michael
What you are doing on your PC suggests to me that your multiple SAS sessions are I/O bound, with the PROC CONTENTS steps waiting until your permanent storage I/O channels free up. I suspect your primary SAS session is first in the I/O queue, so it is not delayed as much. Try reducing your number of remote sessions: does that reduce the lag? Also, I hope you are using the WAIT=NO option on your RSUBMITs; otherwise they won't run in parallel.
Howdy!
Ah, that makes a lot of sense: while the other cores are writing stuff to disk, proc contents has to wait for a break to sneak in there. Thanks so much. I'll try fewer remote sessions to see at what point it starts to have an impact.
Yes, I'm using wait = no, so they are all completing together at about the same time after several hours.
Hello!
It took me a little while to get back to this test. To answer your question, when I run with fewer cores, the slowdown is less. It appears to be multiplicative. 1 core takes ~1 second. 2 cores ~2 seconds. 10 cores ~9-11 seconds. After a bit more testing, I have narrowed down the cause.
memType = data
Adding the memType = data option to the proc contents statement seems to slow down the procedure, even in a local session. In a remote session, the slowdown then gets multiplied by the number of sessions.
This seems like an odd thing to cause a slowdown, but perhaps querying to see which names in the library are datasets instead of views takes time. My solution seems to be to omit this option. Below are logs if curious.
I ran these statements both locally and remotely. All of the statements without memType run in hundredths of a second, and all those with memType run in more than a second.
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview ; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview out = test1 ; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test1Drop ; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview out = test2a (keep = memName name type varNum length) ; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test2aDrop (keep = memName name type varNum length) ; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview out = test2b memtype = data ; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test2bDrop memtype = data ; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview out = test2c noPrint; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test2cDrop noPrint; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview out = test3a (keep = memName name type varNum length) memtype = data ; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test3aDrop (keep = memName name type varNum length) memtype = data ; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview out = test3b (keep = memName name type varNum length) noPrint; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test3bDrop (keep = memName name type varNum length) noPrint; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview out = test3c memtype = data noPrint; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test3cDrop memtype = data noPrint; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview out = test4 (keep = memName name type varNum length) memtype = data noPrint; run;
proc contents data = CoreChnk.Core_1_Chunk_1_ForReview (drop = _numeric_) out = test4Drop (keep = memName name type varNum length) memtype = data noPrint; run;
Local submission tests.
test0 real time 0.06 seconds user cpu time 0.03 seconds system cpu time 0.01 seconds memory 1134.00k OS Memory 28972.00k
test1 real time 0.07 seconds user cpu time 0.03 seconds system cpu time 0.04 seconds memory 1446.71k OS Memory 28972.00k
test1Drop real time 0.05 seconds user cpu time 0.00 seconds system cpu time 0.01 seconds memory 1375.40k OS Memory 28972.00k
test2a real time 0.06 seconds user cpu time 0.06 seconds system cpu time 0.00 seconds memory 1344.12k OS Memory 28972.00k
test2aDrop real time 0.04 seconds user cpu time 0.03 seconds system cpu time 0.01 seconds memory 1339.50k OS Memory 28972.00k
test2b real time 1.06 seconds user cpu time 0.17 seconds system cpu time 0.12 seconds memory 1850.34k OS Memory 29232.00k
test2bDrop real time 1.04 seconds user cpu time 0.10 seconds system cpu time 0.21 seconds memory 1790.12k OS Memory 29232.00k
test2c real time 0.01 seconds user cpu time 0.00 seconds system cpu time 0.01 seconds memory 1253.87k OS Memory 28972.00k
test2cDrop real time 0.01 seconds user cpu time 0.01 seconds system cpu time 0.00 seconds memory 1183.03k OS Memory 28972.00k
test3a real time 1.04 seconds user cpu time 0.07 seconds system cpu time 0.21 seconds memory 1740.50k OS Memory 29232.00k
test3aDrop real time 1.03 seconds user cpu time 0.04 seconds system cpu time 0.18 seconds memory 1740.31k OS Memory 29232.00k
test3b real time 0.01 seconds user cpu time 0.00 seconds system cpu time 0.00 seconds memory 1145.43k OS Memory 28972.00k
test3bDrop real time 0.01 seconds user cpu time 0.01 seconds system cpu time 0.00 seconds memory 1147.31k OS Memory 28972.00k
test3c real time 1.02 seconds user cpu time 0.00 seconds system cpu time 0.12 seconds memory 1664.15k OS Memory 29232.00k
test3cDrop real time 1.04 seconds user cpu time 0.07 seconds system cpu time 0.12 seconds memory 1593.34k OS Memory 29232.00k
test4 real time 1.07 seconds user cpu time 0.06 seconds system cpu time 0.12 seconds memory 1542.18k OS Memory 29232.00k
test4Drop real time 1.04 seconds user cpu time 0.06 seconds system cpu time 0.12 seconds memory 1507.18k OS Memory 29232.00k
Remote submission 1 of the 20-remote-submission run.
test0 real time 0.59 seconds user cpu time 0.03 seconds system cpu time 0.00 seconds memory 11439.18k OS Memory 21028.00k
test1 real time 0.08 seconds user cpu time 0.00 seconds system cpu time 0.00 seconds memory 9563.59k OS Memory 20772.00k
test1Drop real time 0.11 seconds user cpu time 0.00 seconds system cpu time 0.01 seconds memory 9554.00k OS Memory 20516.00k
test2a real time 0.07 seconds user cpu time 0.00 seconds system cpu time 0.01 seconds memory 9499.09k OS Memory 20260.00k
test2aDrop real time 0.07 seconds user cpu time 0.01 seconds system cpu time 0.00 seconds memory 9500.78k OS Memory 20260.00k
test2b real time 21.13 seconds user cpu time 0.06 seconds system cpu time 0.14 seconds memory 9966.53k OS Memory 20776.00k
test2bDrop real time 23.15 seconds user cpu time 0.01 seconds system cpu time 0.26 seconds memory 9939.46k OS Memory 20776.00k
test2c real time 0.31 seconds user cpu time 0.00 seconds system cpu time 0.00 seconds memory 9378.31k OS Memory 20516.00k
test2cDrop real time 0.08 seconds user cpu time 0.01 seconds system cpu time 0.00 seconds memory 9378.50k OS Memory 20516.00k
test3a real time 25.20 seconds user cpu time 0.03 seconds system cpu time 0.31 seconds memory 9872.37k OS Memory 20776.00k
test3aDrop real time 24.33 seconds user cpu time 0.04 seconds system cpu time 0.31 seconds memory 9873.18k OS Memory 20776.00k
test3b real time 0.62 seconds user cpu time 0.01 seconds system cpu time 0.00 seconds memory 9326.59k OS Memory 20516.00k
test3bDrop real time 0.08 seconds user cpu time 0.00 seconds system cpu time 0.00 seconds memory 9326.21k OS Memory 20516.00k
test3c real time 27.00 seconds user cpu time 0.06 seconds system cpu time 0.25 seconds memory 9752.56k OS Memory 20776.00k
test3cDrop real time 24.23 seconds user cpu time 0.01 seconds system cpu time 0.25 seconds memory 9788.90k OS Memory 20776.00k
test4 real time 22.64 seconds user cpu time 0.00 seconds system cpu time 0.31 seconds memory 9687.06k OS Memory 20776.00k
test4Drop real time 21.89 seconds user cpu time 0.03 seconds system cpu time 0.21 seconds memory 9686.68k OS Memory 20776.00k
Looks like you've got a solution. I suggest you mark your answer as correct / accepted.
I'm surprised that memtype=data would slow things down that much, but then I've never used that option on PROC CONTENTS. Looks like it tells SAS to pull some metadata from every dataset that exists in the library.
There is no reason to include that option when you have asked for a specific member. You cannot have both a VIEW and a DATASET in one library that have the same member name.
The purpose of MEMTYPE= option would be to limit which members it checks when you use _ALL_ as the member name.
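To make the distinction concrete, here is a minimal sketch. It assumes the SASHELP library that ships with SAS as a stand-in for the CoreChnk library in this thread, and the output data set names (oneMember, allMembers) are hypothetical. The first step names a specific member and leaves MEMTYPE= off; the second uses _ALL_, where MEMTYPE=DATA is meant to restrict the listing to data sets.

```sas
/* Specific member: MEMTYPE= adds nothing here, so it is omitted. */
proc contents data = sashelp.class
              out  = work.oneMember (keep = memName name type varNum length)
              noPrint;
run;

/* Whole library: MEMTYPE=DATA limits _ALL_ to data sets. */
proc contents data    = sashelp._all_
              memType = data
              out     = work.allMembers (keep = libName memName name type varNum length)
              noPrint;
run;
```

Comparing the FULLSTIMER output of the two steps in your own library should show whether the library scan is where the time goes.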
So you've got SAS 9 running on your local PC, and you're using RSUBMIT to submit chunks of code to different cores on your local PC? So there's no network or remote server involved at all?
Can you post a little example where you run PROC CONTENTS in your main session and then run it again in a child session and see the difference in execution times?
I haven't used RSUBMIT to spawn local sessions before. Is the code something like:
options autosignon sascmd="C:\Program Files\SASHome\SASFoundation\9.4\sas.exe" ;
rsubmit foo wait=no;
options fullstimer ;
data foo ;
array x{300} ;
run ;
proc contents data=foo ;
run ;
endrsubmit ;
signoff ;
?
When I run that, the PROC CONTENTS step completes in .04 seconds:
1 options autosignon sascmd="C:\Program Files\SASHome\SASFoundation\9.4\sas.exe" ;
2 rsubmit foo wait=no;
NOTE: Remote signon to FOO commencing (SAS Release 9.04.01M7P080520).
NOTE: Remote signon to FOO complete.
NOTE: Background remote submit to FOO in progress.
3 signoff ;
NOTE: Remote submit to FOO commencing.
1 options fullstimer ;
2 data foo ;
3 array x{300} ;
4 run ;
NOTE: The data set WORK.FOO has 1 observations and 300 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 1804.71k
OS Memory 8660.00k
Timestamp 11/10/2025 05:06:49 PM
Step Count 1 Switch Count 0
5 proc contents data=foo ;
6 run ;
NOTE: Non-portable document will be produced. The current settings of FORMCHAR use nonstandard line-drawing characters and the
resulting output file will not render correctly unless all readers of the document have the SAS Monospace font installed.
To make your document portable, issue the following command:
OPTIONS FORMCHAR="|----|+|---+=|-/\<>*";
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.04 seconds
user cpu time 0.00 seconds
system cpu time 0.00 seconds
memory 3294.53k
OS Memory 10964.00k
Timestamp 11/10/2025 05:06:49 PM
Step Count 2 Switch Count 1
NOTE: The PROCEDURE CONTENTS printed pages 1-6.
NOTE: Remote submit to FOO complete.
NOTE: Remote signoff from FOO commencing.
NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
NOTE: The SAS System used:
real time 0.94 seconds
user cpu time 0.01 seconds
system cpu time 0.01 seconds
memory 10381.56k
OS Memory 10964.00k
Timestamp 11/10/2025 05:06:49 PM
Step Count 2 Switch Count 45
NOTE: Remote signoff from FOO complete.
I'm still struggling to imagine how PROC CONTENTS could take a full second to execute, since all it does is read dataset metadata. Is it just PROC CONTENTS that runs surprisingly slow, or do other PROCS run slow? How about DATA steps?
I wouldn't switch to dictionary tables. In my experience PROC CONTENTS is usually faster than dictionary tables.
Yes, you have the syntax. Each new locally "remote" SAS session winds up using a different CPU core on your computer. What I was working on probably would have taken a few days to run. Using all 20 of my cores, it took ~5 hours, which isn't 1/20th of the time, but still a lot faster. It can now run overnight as opposed to all weekend. You have to split your dataset up into chunks so that each core can run on a different part of the dataset. Of course, you can't have code that requires a copy of the full dataset (e.g. no retain, lag(), etc.).
Here's my code. I have 20 cores and 23 chunks per core. That way, in case my computer crashes, I will have at least some of it complete and saved permanently already. (The comments are for me; you probably already know the stuff they say.)
* Split text data up into separate datasets so that multiple SAS sessions can be employed to increase CPU usage. Otherwise, SAS only uses one core out of 20, and this would take a long time to run. *;
* While we can only use 20 cores at once, we can still break up the data into smaller chunks and run them sequentially on each core, while the sequential series of chunks are run in parallel. This *;
* way, if something happens to the process, we will have some results and will not need to start from the beginning. *;
* Determine the rows in the input data. This only works if saved as a SAS dataset. It does not work on the zipped view. *;;
data _null_;
* This set statement never executes, so no records are read, but the header information about the set dataset is read in. *;
if 0
then set _3_NonNullWithLetters nObs = n;
call symputx('nRows',n);
stop; * Stops infinite loop from never reaching the end of a file. *;
run; * nRows = 45980013 *;
%put nRows = &nRows;
%let nObsPerChunk = 100000;
%let nChunks = %sysFunc(ceil(&nRows/&nObsPerChunk));
%let nCores = 20;
%let nChunksPerCore = %sysFunc(ceil(&nChunks/&nCores));
%put nObs = &nRows nObsPerChunk = *&nObsPerChunk* nChunks = &nChunks nCores = *&nCores* nChunksPerCore = &nChunksPerCore;
* Create the chunks! Split the data into separate datasets so that each core can process different data simultaneously. *;
%macro chunks;
data %do chunkA = 1 %to &nChunksPerCore;
%do coreA = 1 %to &nCores;
CoreChnk.Core_&coreA._Chunk_&chunkA (compress = binary)
%end;
%end;;
set Michael._3_NonNullWithLetters;
* We want to fill up the first chunk of each core first, then the second, etc. so that they all have as equal number of chunks as possible as *;
* opposed to the first cores-1 having the total number of chunks, and the last core having maybe as few as 1. *;
select;
%do chunkB = 1 %to &nChunksPerCore;
%do coreB = 1 %to &nCores;
%let chunkIndex = %sysEvalF(&nCores*(&chunkB - 1) + &coreB);
when (%sysEvalF(&nObsPerChunk*(&chunkIndex - 1)) < _n_ <= %sysEvalF(&nObsPerChunk*&chunkIndex)) output CoreChnk.Core_&coreB._Chunk_&chunkB;
%end;
%end;
end;
run;
* Fix capitalization in dataset names and set to read-only. *;
options noXWait noXSync;
%local xString;
%do coreC = 1 %to &nCores; %let xString = cd "%sysFunc(pathName(CoreChnk))\"; %do chunkC = 1 %to &nChunksPerCore; %let xString = %sysFunc(compBl(&xString & attrib -R "Core_&coreC._Chunk_&chunkC..sas7bdat" )); %end; %sysExec &xString; %end; %sleep(20);
%do coreC = 1 %to &nCores; %let xString = cd "%sysFunc(pathName(CoreChnk))\"; %do chunkC = 1 %to &nChunksPerCore; %let xString = %sysFunc(compBl(&xString & rename "Core_&coreC._Chunk_&chunkC..sas7bdat" "Core_&coreC._Chunk_&chunkC..sas7bdat")); %end; %sysExec &xString; %end; %sleep(10);
%do coreC = 1 %to &nCores; %let xString = cd "%sysFunc(pathName(CoreChnk))\"; %do chunkC = 1 %to &nChunksPerCore; %let xString = %sysFunc(compBl(&xString & attrib +R "Core_&coreC._Chunk_&chunkC..sas7bdat")); %end; %sysExec &xString; %end;
options noXWait xSync;
%mEnd chunks;
options mPrint;
%chunks;
options noMPrint;
* Macro to run all the cores and chunks. *;
%macro coreChunk;
options sasCmd = "sas";
* Current datetime. *;
%let ___startDT = %sysFunc(datetime());
* Delete the results from previous runs. *;
proc datasets library = CoreChnk noList;
delete %do core = 1 %to &nCores;
%do chunk = 1 %to &nChunksPerCore;
Core_&core._Chunk_&chunk._ForReview
Core_&core._Chunk_&chunk._NoReview
Core_&core._Chunk_&chunk._ForReviewDetail
%end;
%end;
;
quit;
%do core = 1 %to &nCores;
signOn core&core;
* Pass macro variables to the remote session. *;
%sysLPut core = &core / remote = core&core;
%sysLPut nChunksPerCore = &nChunksPerCore / remote = core&core;
* Pass the libName to the remote session, and submit code to the remote session. *;
rSubmit core&core wait = no inheritLib = (CoreChnk);
* Redirect the log to an external file *;
proc printTo log = "K:\DAplay\Michael\ECS\MGUS Text Search\Core Chunk/Core_&core..log" new;
run;
options fullSTimer;
* Must define the macros within the remote session. *;
%include 'K:\Michael\Search all text fields everywhere - Match Macro.sas' / lRecL = 4096;
%include 'K:\Michael\Search all text fields everywhere - Chunk Macro.sas' / lRecL = 4096;
%include 'K:\Support Macros\squeeze_1.sas' / lRecL = 4096;
%chunk;
proc printTo log = log;
run;
/* %sysRPut ___startDT_&core = &___startDT;*/
endRSubmit;
%end;
waitfor _all_;
signoff _all_;
* Print total duration. *;
data _null_;
dur = datetime() - &___startDT;
put 30*'-' / ' TOTAL DURATION:' dur time13.2 / 30*'-';
run;
%mEnd coreChunk;
%coreChunk;
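The row-to-chunk mapping that %chunks generates with SELECT/WHEN can be sanity-checked with a small stand-alone DATA step. This is only a sketch: the closed-form expressions are one reading of the macro's chunkIndex = nCores*(chunk - 1) + core formula, not code from the program above, and the variable names and sample row numbers are made up for illustration.

```sas
data _null_;
   nCores       = 20;       /* same value as nCores in the post       */
   nObsPerChunk = 100000;   /* same value as nObsPerChunk in the post */

   /* A few sample row numbers (_n_ values) around chunk boundaries. */
   do rowNum = 1, 100000, 100001, 2000001;
      /* Rows 1-100000 fall in chunk index 1, 100001-200000 in index 2, and so on. */
      chunkIndex = ceil(rowNum / nObsPerChunk);
      /* Invert chunkIndex = nCores*(chunk - 1) + core. */
      core  = mod(chunkIndex - 1, nCores) + 1;
      chunk = ceil(chunkIndex / nCores);
      put rowNum= chunkIndex= core= chunk=;
   end;
run;
```

If the values in the log agree with the data set each sample row actually landed in, the round-robin fill (chunk 1 of every core first, then chunk 2, and so on) is behaving as the comments describe.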
It's not SAS that runs out of memory, it's the system, which then pages momentarily unused processes out to disk. That is noticeably slower on Windows than on other systems (Windows uses a file in the filesystem, UNIX a fixed area on a raw device). And these processes need to be reloaded from disk as soon as they need to become active again. So you never want this to happen, as it always results in a massive performance penalty.
Also, do not consider the reported number of "cores" as the baseline for setting up multi-processing. With hyperthreading-capable CPUs you get more virtual cores, but these need to be utilized by software designed for those. You need to consider number of threads, not number of processes.
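As a rough check, assuming Base SAS 9.4, the two statements below write to the log the processor count that SAS itself sees. On a hyperthreading-capable CPU this number typically reflects logical processors rather than physical cores, so treat it as an upper bound when deciding how many sessions to start.

```sas
/* Automatic macro variable: number of processors available to SAS. */
%put NOTE: SYSNCPU = &sysNCpu;

/* Current value of the CPUCOUNT= system option. */
%put NOTE: CPUCOUNT = %sysFunc(getOption(cpuCount));
```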
Hi @Kastchei
Regardless of the size of your data set, most of the information provided by Proc Contents is already stored in the SAS-supplied data dictionary tables/views!
Proc contents displays the metadata (data dictionary) of a SAS dataset, including variable names, types, lengths, and attributes like labels and formats. It is also used to view the structure of a library or a specific table, showing information such as the number of observations, variables, and when the dataset was created. This makes it a valuable tool for understanding a dataset, verifying that data was imported correctly, and performing more intelligent data processing.
Guess what, as soon as you assign a libname that contains your SAS data set(s), SAS behind the scenes gathers all the metadata about your SAS data sets.
Check the paper "Exploring DICTIONARY Tables and Views" for more details.
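For illustration, here is a minimal sketch of pulling the same five columns from DICTIONARY.COLUMNS. It assumes SASHELP.CLASS as a stand-in for one of your chunk data sets, and the output table name is hypothetical. Note that TYPE here is the character value 'num' or 'char', whereas the OUT= data set from Proc Contents stores TYPE as numeric 1 or 2.

```sas
proc sql noPrint;
   create table work.varsFromDict as
   select memName, name, type, varNum, length
   from   dictionary.columns
   /* LIBNAME and MEMNAME values are stored in upper case. */
   where  libName = 'SASHELP'
     and  memName = 'CLASS'
   order  by varNum;
quit;
```

Whether this is faster or slower than Proc Contents in a given environment is best settled by timing both with FULLSTIMER.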
Note: Running proc contents does not require splitting your data set, regardless of how big it is.
Hope this helps
Also keep in mind that all PROC CONTENTS needs to do is read the first page of the dataset, where all metadata is stored. And the resulting dataset will also rarely be larger than one page (unless you try to compress it, which is counterproductive, as can be seen in your log). So anything beyond 0.1 seconds of CPU and real time points to some issue with your environment, or congestion on your server.
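One way to avoid the counterproductive compression on these tiny outputs, sketched here with SASHELP.CLASS and a hypothetical output name, is to override the session-wide COMPRESS=BINARY setting with the COMPRESS=NO data set option on the OUT= data set:

```sas
proc contents data = sashelp.class
              out  = work.varsAll (keep     = memName name type varNum length
                                   compress = no)   /* overrides OPTIONS COMPRESS= for this data set only */
              noPrint;
run;
```

The "Compressing data set ... increased size" note should then no longer appear for this step.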