SAS Code execution is not using the CAS worker nodes

Occasional Contributor
Posts: 19

Hi,

 

I am trying to run the SAS code below in a SAS Viya CAS environment. It takes a long time because it is not using the CAS worker nodes: the log shows that the session is using 0 workers.

 

// code 

 


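/* Start a CAS session and point a libref at the casuser caslib */
cas MySession sessopts=(caslib=casuser);
libname mycas cas caslib=casuser;

/* Load SASHELP.CARS into CAS */
proc casutil;
   load data=sashelp.cars replace;
run;

/* Write each of the 428 rows 150,000 times (64.2M rows in total) */
data mycas.bigcars;
   set mycas.cars;
   do i=1 to 150000;
      output;
   end;
run;

/* Compute a score and record the thread that processed each row */
data mycas.bigcars_score;
   set mycas.bigcars;
   length myscore 8;
   myscore=0.3*Invoice/(MSRP-Invoice)
           +0.5*(EngineSize+Horsepower)/Weight + 0.2*(MPG_City+MPG_Highway);
   Thread=_threadid_;
run;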

 

//log info

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
56
57
58 cas MySession sessopts=(caslib=casuser);
NOTE: The session MYSESSION connected successfully to Cloud Analytic Services rpclab03045.exnet.sas.com using port 5570. The UUID
is cd58cd44-5125-0445-b613-019d12ffb677. The user is viyauser and the active caslib is CASUSER(viyauser).
NOTE: The SAS option SESSREF was updated with the value MYSESSION.
NOTE: The SAS macro _SESSREF_ was updated with the value MYSESSION.
NOTE: The session is using 0 workers.
NOTE: 'CASUSER(viyauser)' is now the active caslib.
NOTE: The CAS statement request to update one or more session options for session MYSESSION completed.
59 libname mycas cas caslib=casuser;
NOTE: Libref MYCAS was successfully assigned as follows:
Engine: CAS
Physical Name: cd58cd44-5125-0445-b613-019d12ffb677
60
61 proc casutil;
NOTE: The UUID 'cd58cd44-5125-0445-b613-019d12ffb677' is connected using session MYSESSION.
62
62 ! load data=sashelp.cars replace;
NOTE: SASHELP.CARS was successfully added to the "CASUSER(viyauser)" caslib as "CARS".
63 run;
 
64
 
NOTE: PROCEDURE CASUTIL used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
 
65 data mycas.bigcars;
 
66 set mycas.cars;
67 do i=1 to 150000;
68 output;
69 end;
70 run;
 
NOTE: Running DATA step in Cloud Analytic Services.
NOTE: The DATA step will run in multiple threads.
NOTE: There were 428 observations read from the table CARS in caslib CASUSER(viyauser).
NOTE: The table bigcars in caslib CASUSER(viyauser) has 64200000 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 45.17 seconds
cpu time 0.01 seconds
 
 
71
72 data mycas.bigcars_score;
73 set mycas.bigcars;
74 length myscore 8;
75 myscore=0.3*Invoice/(MSRP-Invoice)
76 +0.5*(EngineSize+Horsepower)/Weight + 0.2*(MPG_City+MPG_Highway);
77 Thread=_threadid_;
78 run;
 
NOTE: Running DATA step in Cloud Analytic Services.
NOTE: The DATA step will run in multiple threads.
NOTE: There were 64200000 observations read from the table BIGCARS in caslib CASUSER(viyauser).
NOTE: The table bigcars_score in caslib CASUSER(viyauser) has 64200000 observations and 18 variables.
NOTE: DATA statement used (Total process time):
real time 23.69 seconds
cpu time 0.00 seconds

All Replies
SAS Employee
Posts: 26

Re: SAS Code execution is not using the CAS worker nodes

Posted in reply to sivaram_veerabagu

With 64M observations, I am not surprised this is taking a long time with this setup. It appears that your CAS server was started in SMP mode, meaning that everything runs on one machine with no worker nodes. How was your CAS server started?

Some background for those who might not be aware: the CAS server establishes the distributed in-memory execution environment that is available to you. You start a CAS session as your own isolated process on that server to govern execution of your own jobs. A session can use at most the resources the server was started with, so a session on a server with no worker nodes cannot have any workers.

So please provide information on how your CAS server was started.
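
If it is unclear how the server was started, the client can at least report what the server looks like. The following is a minimal sketch, assuming the session name MySession from the original post is already connected. It uses the LISTABOUT option of the CAS statement and the builtins.listNodes action; the exact output depends on the deployment.

```sas
/* Print information about the CAS server to the SAS log. */
cas MySession listabout;

/* List the machines that make up the server and the role of each one. */
/* On an SMP server, expect a single machine and no worker entries.    */
proc cas;
   session MySession;
   builtins.listNodes;
quit;
```

If only one machine is listed, the "0 workers" note in the log is expected, and it reflects how the server was set up rather than anything in the program.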

 

Thanks.

Occasional Contributor
Posts: 19

Re: SAS Code execution is not using the CAS worker nodes

Posted in reply to BrettWujek

I am using the SAS Viya Early Preview program environment, and I do not know how the CAS server was started.

 

Thanks.

Solution
‎02-15-2017 12:14 AM
SAS Employee
Posts: 26

Re: SAS Code execution is not using the CAS worker nodes

Posted in reply to sivaram_veerabagu

Yes, our Early Preview program is set up to provide you with an SMP (symmetric multiprocessing) server, meaning everything runs on a single machine (still multi-threaded, though). For larger problems, such as working with a 64M-observation data set, you would definitely want to use an MPP (massively parallel processing) server with worker nodes. You don't have control over that in the Early Preview program. If you would like to explore this further, I can see if someone can work more closely with you on this.

As an example, I just ran your same code on an MPP server with 4 worker nodes. The times are much better: 15.67 seconds instead of 45.17 for the first DATA step, and 10.26 seconds instead of 23.69 for the scoring step.

 

197 libname mycas cas caslib=casuserhdfs;
NOTE: Libref MYCAS was successfully assigned as follows:
Engine: CAS
Physical Name: cfc4294c-f22f-094b-9ad5-36f9b0950c66
198 proc casutil;
NOTE: The UUID 'cfc4294c-f22f-094b-9ad5-36f9b0950c66' is connected using session MYSESS.
199 load data=sashelp.cars replace;
NOTE: SASHELP.CARS was successfully added to the "CASUSERHDFS(brwuje)" caslib as "CARS".
200 run;


NOTE: PROCEDURE CASUTIL used (Total process time):
real time 0.15 seconds
cpu time 0.01 seconds


201 data mycas.bigcars;
202 set mycas.cars;
203 do i=1 to 150000;
204 output;
205 end;
206 run;

NOTE: Running DATA step in Cloud Analytic Services.
NOTE: The DATA step will run in multiple threads.
NOTE: There were 428 observations read from the table CARS in caslib CASUSERHDFS(brwuje).
NOTE: The table bigcars in caslib CASUSERHDFS(brwuje) has 64200000 observations and 16
variables.
NOTE: DATA statement used (Total process time):
real time 15.67 seconds
cpu time 0.12 seconds


207 data mycas.bigcars_score;
208 set mycas.bigcars;
209 length myscore 8;
210 myscore=0.3*Invoice/(MSRP-Invoice)
211 +0.5*(EngineSize+Horsepower)/Weight + 0.2*(MPG_City+MPG_Highway);
212 Thread=_threadid_;
213 run;

NOTE: Running DATA step in Cloud Analytic Services.
NOTE: The DATA step will run in multiple threads.
NOTE: There were 64200000 observations read from the table BIGCARS in caslib
CASUSERHDFS(brwuje).
NOTE: The table bigcars_score in caslib CASUSERHDFS(brwuje) has 64200000 observations and 18
variables.
NOTE: DATA statement used (Total process time):
real time 10.26 seconds
cpu time 0.03 seconds
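
The Thread column that the scoring step fills from _threadid_ gives a rough way to see how the rows were spread across threads. The following is a sketch only, assuming the table sits in the casuser caslib as in the original post (change the caslib name for other setups, for example casuserhdfs above) and that the session is named MySession. It uses the simple.freq action to count rows per thread ID.

```sas
proc cas;
   session MySession;
   /* Count the rows handled by each thread ID stored in Thread. */
   simple.freq /
      table={caslib="casuser", name="bigcars_score"},
      inputs={"Thread"};
quit;
```

On an SMP server all of the thread IDs belong to one machine, while on an MPP server the threads are spread across the worker nodes.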

Occasional Contributor
Posts: 19

Re: SAS Code execution is not using the CAS worker nodes

Posted in reply to BrettWujek

Thanks for your explanation and the logs from the MPP server.

☑ This topic is solved.
