alepage
Barite | Level 11

Hello,

 

I would like to test the benefits of parallel processing, mainly in terms of processing time.

These are my first steps with parallel processing.

 

First, some details about my Windows environment are saved in the file Windows1.docx.

The version of SAS I am using and the available procedures are listed in the two text files.

 

My test of parallel processing is based on the following paper:

https://analytics.ncsu.edu/sesug/2013/PA-08.pdf

 

The first program generates the synthetic data.

The second program carries out many statistical tests in series.

 

However, the following program (the parallel processing) is not working properly.

As I have absolutely no experience with this approach, I am not able to troubleshoot it.

 

Moreover, there are many commands whose usage and purpose I don't understand.

The log file is included. Sorry, some lines are in French.

 

 

Here's the parallel processing program I would like to test. I have included my questions as comments.

 

libname ip "...\Test\Parallel Processing\Data";
libname op "...\Test\Parallel Processing\Statistical Data";

 

%macro parallel_process(ilib=,idsn=,olib=op,odsn=);

 

options fullstimer autosignon=yes sascmd="sas92 -nonews -threads";

 

/*What's the purpose of the option autosignon=yes? */

/*What are the purposes of sascmd=, sas92, -nonews, and -threads?*/

 

 

/*I have added a few macro variables to help troubleshoot the program inside the parallel_process macro */

%let ilib=ip;

%let idsn=random_data_2_500000;

%let olib=op;

%let odsn=pp_uni_2v_500000o;

 

 

%global num_vars thread ;

 

 

%let num_vars=%sysfunc(attrn(%sysfunc(open(&ilib..&idsn.,i)),nvars));

 

/*In the synthetic data sets, there are 21 variables (ID, a character variable; all others are numeric)*/

 

%do thread = 1 %to (&num_vars-1);

 

/*What's the purpose of signon task&thread wait=yes ?*/

signon task&thread. wait=yes;
%syslput thread = &thread;

 

/*Will the 4 macro variables below change during the loop? If not, is it necessary to put them inside the loop, and if so, why?*/


%syslput ilib = &ilib;
%syslput idsn = &idsn;
%syslput olib = &olib;
%syslput odsn = &odsn;

 

/*I am not sure that SAS EG 7.1 understands the statements below*/

/*Which statement can we use instead of rsubmit process=task&Thread. wait=no sysrputsync=yes ?*/

/*What's the purpose of sysrputsync = yes ?*/

 

rsubmit process=task&thread. wait=no sysrputsync=yes;
 

/*Do we have to assign the two libraries inside the loop when they have already been assigned outside it? If so, why?*/


libname ip "...\Test\Parallel Processing\Data";
libname op "...\Test\Parallel Processing\Statistical Data";

 

/*The same options are used outside and inside the loop. Why do we need to repeat them?*/

 

options fullstimer autosignon=yes sascmd="sas92 -nonews -threads";

 

%macro univ_parallel;
proc univariate data=&ilib..&idsn. noprint;

 var var_&thread.;
 output out=&olib..&odsn._&thread.

 /* Descriptive Statistics */
 CSS=CSS CV=CV KURTOSIS=KURTOSIS MAX=MAX MEAN=MEAN
 MIN=MIN MODE=MODE N=N NMISS=NMISS NOBS=NOBS RANGE=RANGE
 SKEWNESS=SKEWNESS STD=STD STDMEAN=STDMEAN SUM=SUM
 SUMWGT=SUMWGT USS=USS VAR=VAR

 /* Quantile Statistics */

 P1=P1 P5=P5 P10=P10 Q1=Q1 MEDIAN=MEDIAN Q3=Q3
 P90=P90 P95=P95 P99=P99 QRANGE=QRANGE

 /* Robust Statistics */

 GINI=GINI MAD=MAD QN=QN SN=SN STD_GINI=STD_GINI
 STD_MAD=STD_MAD STD_QN=STD_QN STD_QRANGE=STD_QRANGE STD_SN=STD_SN

 /* Hypothesis Testing Statistics */

 MSIGN=MSIGN NORMALTEST=NORMALTEST SIGNRANK=SIGNRANK
 PROBM=PROBM PROBN=PROBN PROBS=PROBS PROBT=PROBT ;
 run;
 %mend univ_parallel;
 %univ_parallel;

 

/*Again, SAS EG 7.1 does not understand the endrsubmit statement. Which statement can we use instead?*/

 

 endrsubmit;


 %end  /*thread = 1 %to (&num_vars-1)*/;

 

/*task&Thread. is shown in red, which tells me that something is wrong?*/

/*Is this loop correct, and what's the purpose of calling task&thread.?*/

waitfor _all_ %do thread = 1 %to (&num_vars-1);
 task&thread;
%end;
 

/*Phew! Last questions. What's the purpose of rget and signoff?*/


%do thread = 1 %to (&num_vars-1);
 rget task&thread;
%end;

%do thread = 1 %to (&num_vars-1);
 signoff task&thread;
%end;
%mend parallel_process;
%parallel_process(ilib=ip,idsn=random_data_2_500000, olib=op,odsn=pp_uni_2v_500000o);

 

Thanks in advance for your help.

alepage

13 REPLIES
tomrvincent
Rhodochrosite | Level 12
It's a lot easier to run things in parallel in EG than with the approach in that paper, which isn't written for EG.

In EG, do this:
1. Turn grid processing on in the project properties.
2. In options, put some code in 'insert custom code before submitted code' that sets your libnames and global variables (I write mine out to files in autoexec and then read them in).
3. Link the tasks (based on logic creation order) within a single flow. You can have multiple link paths so long as they can run concurrently.
4. Run the flow and debug.
alepage
Barite | Level 11

Hello,

 

As I mentioned, I am just getting started. Could you give me an example using a program rather than a project?

I need to compare the performance of one processor against multiple processors.

 

I need to use the more complicated approach because I will transfer my code to a Unix server.

 

If it works, I need to copy my code to a Unix server and then use a KSH script to execute it.

 

Could you please give me more information using a program instead of a project? Also, do I need to put my code in a special place? Is there anything particular regarding the libname, macro variables, and so on?

 

Thanks in advance for your help

alepage

 

tomrvincent
Rhodochrosite | Level 12
I thought you wanted to use EG. KSH isn't relevant within EG.
alepage
Barite | Level 11

OK.

Let's start it over.

 

Nobody can imagine how difficult it is to work here. Last week, I was asked to look into the parallel processing approach.

I have never done that before.

 

Today, I would like to test it first from Enterprise Guide and, above all, understand what I am doing.

I have found a sample on the web. I could start from there.

 

After that, if it works, I would try the code I posted above.

 

Then, if that works, I would develop code to run on a Unix server.

 

Here, we have many servers: some are Windows-based and others are Unix-based.

 

My first task is to test, on a Windows server, whether parallel processing works and whether it improves processing time.

I really need to understand what I am doing in relation to the architecture of the server.

 

 

I need to be able to query each server to find out how many CPUs it has, its available disk space, RAM, and so on.

 

At the end, I need a solution for the windows server and another one for the Unix server.

So when they make up their minds, I will be ready.

 

Here's the sample I found on the web. It almost works, but the data sets SUV and sedan are never created.

Why?

 

From the properties of the program, I have selected:

 

a) allow parallel execution on the same server

b) use grid computing.

 

Do I need to select both options? Which one is better?

Here's the sample:

 

%let rc = %sysfunc( grdsvc_enable(_all_, server= SASApp));

%put &rc;

signon grid1;

signon grid2;

proc datasets library=work noprint;

delete sedan SUV;

run;

rsubmit grid1 wait=no ;

data sedan;

set sashelp.cars;

where Type="Sedan";

run;

endrsubmit;

rsubmit grid2 wait=no ;

data SUV;

set sashelp.cars;

where Type="SUV";

run;

endrsubmit;

waitfor _ALL_ grid1 grid2;

 

 

proc print data=sedan;

run;

proc print data=SUV;

run;

 

I believe that we are licensed for grid mode (see the included log file).

 

If I select only the option to allow parallel processing on the same computer, how could I rewrite the code above (the SUV and sedan example) to make it clear that I am using parallel processing, and compare the processing time with the standard (non-parallel) mode?

 

 

 

 

 

 

SASKiwi
PROC Star

Do you have SAS/CONNECT both installed and licensed on your SAS servers? SIGNON will only work if you have this product. You can use proc setinit and proc product_status to confirm if you have SAS/CONNECT.

 

If you have SAS Grid environments then you should have SAS/CONNECT.
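
A minimal sketch of that check, assuming it is submitted in the same SAS session where SIGNON will later run. Both procedures write to the log; look for a SAS/CONNECT entry in each.

```sas
/* Lists the licensed products and their expiration dates in the log */
proc setinit;
run;

/* Lists the installed products and their versions in the log */
proc product_status;
run;
```

A product can be licensed without being installed (or the reverse), which is why it may be worth checking both outputs.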

alepage
Barite | Level 11

Here are the products available:

 

 

CPU A: Model name='' model number='' serial='+12'.

Expiration: 14JUN2019.

Grace Period: 30 days (ending 14JUL2019).

Warning Period: 30 days (ending 13AUG2019).

System birthday: 23JUL2018.

Operating System: WX64_SV .

Product expiration dates:

---Base SAS Software 14JUN2019 (CPU A)

---SAS/STAT 14JUN2019 (CPU A)

---SAS/GRAPH 14JUN2019 (CPU A)

---SAS/ETS 14JUN2019 (CPU A)

---SAS/FSP 14JUN2019 (CPU A)

---SAS/OR 14JUN2019 (CPU A)

---SAS/AF 14JUN2019 (CPU A)

---SAS/IML 14JUN2019 (CPU A)

---SAS/QC 14JUN2019 (CPU A)

---SAS/SHARE 14JUN2019 (CPU A)

---SAS/ASSIST 14JUN2019 (CPU A)

---SAS/CONNECT 14JUN2019 (CPU A)

---SAS/TOOLKIT 14JUN2019 (CPU A)

2 The SAS System 10:24 Monday, October 29, 2018

---SAS/EIS 14JUN2019 (CPU A)

---SAS/GIS 14JUN2019 (CPU A)

---SAS/SHARE*NET 14JUN2019 (CPU A)

---MDDB Server common products 14JUN2019 (CPU A)

---SAS Integration Technologies 14JUN2019 (CPU A)

---SAS/Secure 168-bit 14JUN2019 (CPU A)

---SAS/Secure Windows 14JUN2019 (CPU A)

---SAS Enterprise Guide 14JUN2019 (CPU A)

---OR OPT 14JUN2019 (CPU A)

---OR PRS 14JUN2019 (CPU A)

---OR IVS 14JUN2019 (CPU A)

---OR LSO 14JUN2019 (CPU A)

---SAS/ACCESS Interface to Oracle 14JUN2019 (CPU A)

---SAS/ACCESS Interface to Sybase 14JUN2019 (CPU A)

---SAS/ACCESS Interface to PC Files 14JUN2019 (CPU A)

---SAS/ACCESS Interface to ODBC 14JUN2019 (CPU A)

---SAS/IML Studio 14JUN2019 (CPU A)

---SAS Workspace Server for Local Access 14JUN2019 (CPU A)

---SAS Workspace Server for Enterprise Access 14JUN2019 (CPU A)

---High Performance Suite 14JUN2019 (CPU A)

---SAS Add-in for Microsoft Excel 14JUN2019 (CPU A)

---SAS Add-in for Microsoft Outlook 14JUN2019 (CPU A)

---SAS Add-in for Microsoft PowerPoint 14JUN2019 (CPU A)

---SAS Add-in for Microsoft Word 14JUN2019 (CPU A)

SASKiwi
PROC Star

Adding DOWNLOAD steps will fix your example program:

 

signon grid1;
signon grid2;
proc datasets library=work noprint;
delete sedan SUV;
run;
rsubmit grid1 wait=no ;
data sedan;
set sashelp.cars;
where Type="Sedan";
run;
proc download data = sedan;
run;
endrsubmit;
rsubmit grid2 wait=no ;
data SUV;
set sashelp.cars;
where Type="SUV";
run;
proc download data = SUV;
run;
endrsubmit;
waitfor _ALL_ grid1 grid2;
 
 
proc print data=sedan;
run;
proc print data=SUV;
run;
alepage
Barite | Level 11

Hello,

 

Could you please explain why we need to run the procedure below

 

proc download data = sedan;

run;

 

in order to see the dataset, even though we have already run these statements?

 

data sedan;

set sashelp.cars;

where Type="Sedan";

run;

SASKiwi
PROC Star

Each time you SIGNON you create a new SAS session with its own WORK library, so to copy from each of the WORK libraries to your main SAS session WORK library you need to DOWNLOAD the datasets.

 

If you want to avoid downloading then write to a permanent SAS library that is accessible by all SAS sessions as described by @tomrvincent.
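
One way to see the separate WORK libraries for yourself is to print the WORK path from both sides. This is only a sketch; it assumes the grid1 session from the example above is already signed on and idle.

```sas
/* Path of the WORK library in the client session */
%put NOTE: Client WORK is %sysfunc(pathname(work));

/* Path of the WORK library in the grid1 session.
   wait=yes makes the client wait until the remote %put has run. */
rsubmit grid1 wait=yes;
   %put NOTE: grid1 WORK is %sysfunc(pathname(work));
endrsubmit;
```

The two paths printed in the log should differ, which is why a data set created in WORK inside an RSUBMIT block is not visible to PROC PRINT in the client session.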

tomrvincent
Rhodochrosite | Level 12
If you do a libname xxx to some perm location and do xxx.sedan instead of work, you might see sedan. Same with SUV.
alepage
Barite | Level 11

Hello,

 

I made the modifications as suggested, but it did not work.

 

%let rc = %sysfunc( grdsvc_enable(_all_, server= SASApp));

libname mylib "\\...\Documents\Test\Parallel Processing\Data3";

signon grid1;

signon grid2;

 

/*

proc datasets library=work noprint;

delete sedan SUV;

run;

*/

rsubmit grid1 wait=no ;

data mylib.sedan;

set sashelp.cars;

where Type="Sedan";

run;

endrsubmit;

rsubmit grid2 wait=no ;

data mylib.SUV;

set sashelp.cars;

where Type="SUV";

run;

endrsubmit;

waitfor _ALL_ grid1 grid2;

title "Test1, using grid computing and mylib";

proc print data=mylib.sedan;

run;

proc print data=mylib.SUV;

run;

 

SASKiwi
PROC Star

Repeat the MYLIB libname in your grid1 and grid2 sessions. Remember each SAS session is separate so the LIBNAME definition must be done in all 3 sessions.
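
As a sketch of what that could look like, here is the same program with the LIBNAME repeated inside each RSUBMIT block. The path is copied as posted, and the sketch assumes the grid sessions can reach that location under the same name; the SIGNOFF statements at the end are an addition that was not in the posted program.

```sas
%let rc = %sysfunc(grdsvc_enable(_all_, server=SASApp));

/* Session 1 of 3: the client session */
libname mylib "\\...\Documents\Test\Parallel Processing\Data3";

signon grid1;
signon grid2;

rsubmit grid1 wait=no;
   /* Session 2 of 3: grid1 needs its own LIBNAME */
   libname mylib "\\...\Documents\Test\Parallel Processing\Data3";
   data mylib.sedan;
      set sashelp.cars;
      where Type="Sedan";
   run;
endrsubmit;

rsubmit grid2 wait=no;
   /* Session 3 of 3: grid2 needs its own LIBNAME */
   libname mylib "\\...\Documents\Test\Parallel Processing\Data3";
   data mylib.SUV;
      set sashelp.cars;
      where Type="SUV";
   run;
endrsubmit;

/* Block until both asynchronous tasks have finished */
waitfor _all_ grid1 grid2;

title "Test1, using grid computing and mylib";
proc print data=mylib.sedan;
run;
proc print data=mylib.SUV;
run;

/* End the two remote sessions */
signoff grid1;
signoff grid2;
```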
