BookmarkSubscribeRSS Feed
Kastchei
Pyrite | Level 9

Hello!

 

I am using RSubmit to utilize all 20 of the CPU cores on my local machine.  I have a very large dataset, and I have broken it up into small chunks to run on each of the cores.  For a test, I'm just running the first 100 observations (obs = 100) from each chunk.  The data steps run quickly, ~1-4 seconds, across my 20 cores.

 

After the datastep is run, I run 6 proc contents.  These procs are run on datasets with a lot of variables (~300) but not a lot of rows (~50-100) in my test.

 

When I run these proc contents locally, not part of the remote sessions, they run in a real time of ~1 seconds.  When I run these proc contents procedures in the remote sessions using RSubmit, they take about anywhere from 5-35 seconds real time.  Does anyone know of a reason why proc contents would be taking so long when submitted remotely?  I couldn't find any information (besides AI) about multi-threaded support in Proc Contents, so I am assuming the Proc Contents is not inherently multi-threaded; but if so, I'm not sure why another core running stuff would affect the run time of Proc Contents.  It's not a big deal for this code.  I can easily just wait to run them locally.  But it is perplexing why it's taking so long.

 

Here is the log for the first two Proc Contents run by Core 1.  The first proc contents is only 6 seconds.  I think this is because not all of the other remote sessions have been fired up, only some have started.  The second proc contents, however, is 24 seconds.  Core 2 is similar.

 

NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           6.15 seconds
      cpu time            0.10 seconds
      
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           24.25 seconds
      cpu time            0.29 seconds

With Core 3, that first Proc Contents jumps up to 19 seconds.

 

NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           19.10 seconds
      cpu time            0.21 seconds
      
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           21.97 seconds
      cpu time            0.29 seconds

The rest of the cores stay up around these times, getting a little longer each core.  Here's Core 12.

NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           24.16 seconds
      cpu time            0.29 seconds
      
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           25.49 seconds
      cpu time            0.17 seconds

And then the last Core 20.

NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           21.71 seconds
      cpu time            0.18 seconds
      
NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           29.04 seconds
      cpu time            0.17 seconds

There is nothing substantially different about the different Cores.  They are all just running subsets of the same original dataset.  Some fluctuations seem normal among the Cores.  The puzzling part is the 10-30 fold jump in real time from local submission to RSubmit "remote" submission on the same machine.

 

For reference, here is a local example of the same proc contents that was run in Core 1 and then Core 20.  They run in less than a second.  This also gives the code.  You can see that both procs within a core are basically the same.  If anything, the second of the pair should run faster, since it drops numeric variables.

 

Core 1 equivalent locally.

32317       proc contents data = corechnk.core_1_chunk_1_forreview
32318                     out  = ___varsAll       (keep = memName name type varNum length) memtype = data noPrint;
32319       run;

NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent.
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.92 seconds
      user cpu time       0.00 seconds
      system cpu time     0.12 seconds
      memory              1558.12k
      OS Memory           48296.00k
      Timestamp           11/07/2025 01:49:12 PM
      Step Count                        2513  Switch Count  0


32320       proc contents data = corechnk.core_1_chunk_1_forreview           (drop = _numeric_)
32321                     out  = ___varsToSqueeze (keep = memName name type varNum length) memtype = data noPrint;
32322       run;

NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent.
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.89 seconds
      user cpu time       0.04 seconds
      system cpu time     0.18 seconds
      memory              1559.25k
      OS Memory           48296.00k
      Timestamp           11/07/2025 01:49:13 PM
      Step Count                        2514  Switch Count  0

Core 20 equivalent locally.

32323       proc contents data = corechnk.core_20_chunk_1_forreview
32324                     out  = ___varsAll       (keep = memName name type varNum length) memtype = data noPrint;
32325       run;

NOTE: The data set WORK.___VARSALL has 349 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSALL increased size by 100.00 percent.
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.88 seconds
      user cpu time       0.03 seconds
      system cpu time     0.14 seconds
      memory              1284.25k
      OS Memory           47980.00k
      Timestamp           11/07/2025 01:49:13 PM
      Step Count                        2515  Switch Count  0


32326       proc contents data = corechnk.core_20_chunk_1_forreview           (drop = _numeric_)
32327                     out  = ___varsToSqueeze (keep = memName name type varNum length) memtype = data noPrint;
32328       run;

NOTE: The data set WORK.___VARSTOSQUEEZE has 224 observations and 5 variables.
NOTE: Compressing data set WORK.___VARSTOSQUEEZE increased size by 100.00 percent.
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: PROCEDURE CONTENTS used (Total process time):
      real time           0.90 seconds
      user cpu time       0.01 seconds
      system cpu time     0.18 seconds
      memory              1250.84k
      OS Memory           47980.00k
      Timestamp           11/07/2025 01:49:14 PM
      Step Count                        2516  Switch Count  0

 

1 REPLY 1
Kurt_Bremser
Super User

There's something seriously amiss with your SAS setup/computer.

Just for comparison:

 69         proc contents data=sashelp.baseball out=ba;
 70         run;
 
 NOTE: The data set WORK.BA has 24 observations and 41 variables.
 NOTE:  Verwendet wurde: PROZEDUR CONTENTS - (Gesamtverarbeitungszeit):
       real time           0.03 seconds
       user cpu time       0.03 seconds
       system cpu time     0.00 seconds

This is from SAS On Demand.

As you can see, PROC CONTENTS takes nearly no time at all, while even under "ideal" conditions it almost takes a full second (real time) on your site, much longer that the CPU time, so you have a lot of wait states there.

What computer/server do you run SAS on?

I suspect that your additional SAS processes cause paging on your computer, and with every context switch the whole process needs to be reloaded from disk cache, which would explain the excessive delay.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 109 views
  • 0 likes
  • 2 in conversation