Gaining Comfort with SAS Grid Computing for Analysts and Data Scientists

by SAS Employee megak8 ‎03-07-2018 05:15 PM - edited ‎04-05-2018 12:12 PM (1,360 Views)

The video, slides, examples, and questions and answers for Gaining Comfort with SAS Grid Computing for Analysts and Data Scientists are now available.

Watch the webinar

This session describes what SAS Grid Computing is and how it can boost your analytics performance. It introduces you to the key concepts, benefits, and techniques that you can use to gain the most efficiency from SAS Grid Computing.

 


 

Here are some highlighted questions and answers that were submitted. Scroll down to review the attached slides.

   

Is the SCAPROC procedure available in SAS Enterprise Guide? How can we enable SCAPROC in SAS EG?

Yes, you can run it as a procedure or invoke it via Program Editor -> Analyze -> Analyze for Grid.
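
As a rough sketch of running SCAPROC as a procedure: the program to be analyzed sits between a RECORD statement and a WRITE statement. The file paths, the resource name SASApp, and the sample steps here are placeholders chosen for illustration, not values from the webinar.

```sas
/* Start recording. The GRID option asks SCAPROC to also write a
   grid-enabled version of the program. Both paths are hypothetical. */
proc scaproc;
   record '/tmp/scaproc_record.txt'
          grid '/tmp/grid_enabled.sas'
          resource "SASApp";
run;

/* The program being analyzed (illustrative steps only). */
data work.class;
   set sashelp.class;
run;

proc means data=work.class;
run;

/* Stop recording and write the output files. */
proc scaproc;
   write;
run;
```

The grid-enabled file can then be reviewed before it is run, since the generated remote submits reflect only the dependencies SCAPROC was able to detect.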

 

Processing time - you compared sequential in Base SAS vs parallel in grid. How about apples to apples? Parallel processing in Base SAS (using MP CONNECT, remote submit within single program) vs. parallel processing in SAS Grid Computing?

The before and after illustrations in the demonstration were apples to apples. The procedural steps were first run sequentially on a 5-node SAS Grid Computing configuration, where they consumed 9+ minutes of real time. Those same steps were then respecified via SCAPROC to contain remote submits, so that they ran in parallel on the same 5-node SAS Grid Computing configuration, where they consumed 2+ minutes of real time.

 

You can think of SAS Grid Computing as an extension of SAS/CONNECT in which the Grid Control Server automates the processing. Not only can you run in parallel on the same box, you can also split the processing across multiple machines (nodes), because the job is submitted through the Grid Control Server. As you will see later in the webinar, you can analyze your SAS programs to determine whether you can break them up into sessions and run the various steps in parallel.
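
A minimal sketch of that idea, assuming a logical grid server named SASApp (the name used in a later question) and two hypothetical session names, task1 and task2. The steps inside each remote submit are placeholders.

```sas
/* Route every SAS/CONNECT signon through the grid. */
%let rc = %sysfunc(grdsvc_enable(_all_, server=SASApp));

/* Start two grid sessions; the grid decides which node hosts each one. */
signon task1;
signon task2;

/* WAIT=NO submits asynchronously, so the two steps run in parallel. */
rsubmit task1 wait=no;
   proc means data=sashelp.cars;
   run;
endrsubmit;

rsubmit task2 wait=no;
   proc freq data=sashelp.cars;
      tables origin;
   run;
endrsubmit;

/* Block until both sessions finish, then close them. */
waitfor _all_ task1 task2;
signoff _all_;
```

Without the grdsvc_enable call, the same signon and rsubmit statements would be ordinary SAS/CONNECT processing and would need their own connection options.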

 

Can we use syshostname to know which node we are executing upon?

You can monitor the job in SAS Management Console and the LSF online tools to see which server node the job is running on. Also, you can use %put %sysget(HOSTNAME); plus two grid macros, %jobinfo; and %gjobs;. The node will also show in the log from signon. You can also use %sysget to set up macro variables, or use %SYSFUNC(GRDSVC_GETNAME(TASK1));.

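
A short sketch of the in-program options, assuming grid services are already enabled and using a hypothetical session name, task1. The HOSTNAME environment variable is an assumption about the node's operating system; on a Windows node a different variable may hold the machine name.

```sas
/* Start a grid session. */
signon task1;

/* Ask the grid which node was chosen for this session (runs locally). */
%put Grid node for task1: %sysfunc(grdsvc_getname(task1));

/* Statements inside the remote submit run on the grid node itself. */
rsubmit task1;
   %put Automatic macro variable: &syshostname;
   %put Environment variable: %sysget(HOSTNAME);
endrsubmit;

signoff task1;
```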

 

Is it possible to extend SAS Grid Computing to other SAS applications like SAS CA & VA/VS?

Yes - here is a paper on this topic.

 

Can SAS Grid Computing only be utilized on AIX/Unix or can it be utilized on a Windows box? Also, previously running on a non-grid environment in Windows ran faster than our current grid in AIX environment. How can this be or should it be the case?

Both Windows and Unix are supported, and they can be combined together. See http://support.sas.com/rnd/scalability/grid/SAS_Grid_FAQ_External_31JUL2007.pdf

 

How can we find when a file is written into metadata?

You can see it in your library and folder lists in interfaces such as SAS Studio, Enterprise Guide, SAS Add-In for Microsoft Office, Enterprise Miner, etc. If you have been granted permission to see it in the SAS Management Console, you’ll also see it there.


 

I started using SAS Grid Computing only recently (past 6 months) and before that I always used PC SAS. I use EG 7.1 and SAS v9.4 and noticed that a simple program processing rows runs significantly slower in SAS Grid compared to PC SAS. Any ideas to improve speed?

Please check with your SAS administration team because they have ways to improve your SAS Grid Computing environment. You may also want to pursue a professional assessment and tune-up service.

 

How to find CPU consumption or memory usage by a program?

If you are looking for metrics by job and step, and CPU and memory by server, use:

%lsload;

Please be aware that this macro is only for SAS Grid Computing.
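
For step-level numbers inside a single program, one option that is not part of the answer above is the FULLSTIMER system option, which writes extended timing and memory statistics for each step to the SAS log. A minimal sketch, with a placeholder step:

```sas
/* Ask SAS to log extended resource statistics for every step. */
options fullstimer;

/* Any step run after this point reports CPU time and memory in the log. */
proc sort data=sashelp.cars out=work.cars_sorted;
   by make;
run;
```

The exact statistics that appear depend on the host operating system.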

 

How does SCAPROC work with a Linux - Autosys Batch environment?

SCAPROC is a Base SAS procedure, so it runs the same way regardless of the host operating system(s).

 

What if I have 6 sessions, the 1st 5 are independent, but 6th is dependent on the 1st 5 sessions?

2 options:

1) add a WAITFOR statement in the code - easiest way for analysts to control this operation, especially for testing or ad hoc scenarios.

2) use Schedule Manager in SAS Management Console to schedule jobs to run with dependencies - a more strategic approach for production operations that will likely involve SAS administrators' participation.
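
The first option might look like the following sketch. It assumes grid services are enabled for a server named SASApp; the session names task1 through task6 and the steps inside them are hypothetical. In practice the sixth session would read results that the first five wrote to a shared library.

```sas
%let rc = %sysfunc(grdsvc_enable(_all_, server=SASApp));

/* Sessions 1-5 are independent, so submit them asynchronously. */
signon task1;
rsubmit task1 wait=no; proc means data=sashelp.class;  run; endrsubmit;
signon task2;
rsubmit task2 wait=no; proc means data=sashelp.cars;   run; endrsubmit;
signon task3;
rsubmit task3 wait=no; proc means data=sashelp.heart;  run; endrsubmit;
signon task4;
rsubmit task4 wait=no; proc means data=sashelp.shoes;  run; endrsubmit;
signon task5;
rsubmit task5 wait=no; proc means data=sashelp.stocks; run; endrsubmit;

/* Hold here until all five independent sessions have finished. */
waitfor _all_ task1 task2 task3 task4 task5;

/* Session 6 depends on the first five, so it starts only now. */
signon task6;
rsubmit task6;
   proc print data=sashelp.class(obs=5);
   run;
endrsubmit;

signoff _all_;
```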

 

I have a date field with value 1.8E9, how do I convert it into MMDDYYYY in SAS?

Please explore SAS functions such as PUT and INPUT to conduct this conversion. There’s another on-demand Ask the Expert webinar on this topic: https://communities.sas.com/t5/Ask-the-Expert/Top-10-SAS-Functions/ta-p/391244
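
As one possible illustration, and assuming the value is a SAS datetime (seconds since January 1, 1960) rather than some other encoding, the date portion can be extracted with DATEPART and then formatted with PUT. The variable names are hypothetical.

```sas
data _null_;
   dt = 1.8E9;                        /* assumed to be a SAS datetime value */
   d  = datepart(dt);                 /* keep only the date portion (days)  */
   mmddyyyy = put(d, mmddyyn8.);      /* character MMDDYYYY, no separators  */
   put mmddyyyy=;
   put d= mmddyy10.;                  /* same date shown as MM/DD/YYYY      */
run;
```

If the field turns out to hold a different kind of number, such as seconds counted from another epoch, it needs to be converted to a SAS date first.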

 

Why is grdsvc_enable( _all_, server=SASApp) not always on? Is there an advantage to turning it off at the end of the program via grdsvc_enable( _all_, '' )?

It enables or disables one or all SAS sessions on a grid, so you’ll want to consider when and how many sessions to enable.
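
A hedged sketch of switching grid services on and then off again within one program, using the two calls quoted in the question. The session name gridtask and the step inside it are hypothetical.

```sas
/* Enable grid services for every session that signs on from here. */
%let rc = %sysfunc(grdsvc_enable(_all_, server=SASApp));
%put Enable return code: &rc;

signon gridtask;                 /* launched through the grid */
rsubmit gridtask;
   proc contents data=sashelp.class;
   run;
endrsubmit;
signoff gridtask;

/* Disable grid services for all sessions, as in the question. */
%let rc = %sysfunc(grdsvc_enable(_all_, ''));
%put Disable return code: &rc;

/* Any signon issued after this point is an ordinary SAS/CONNECT
   signon and needs its own connection options. */
```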

 

What's the difference between 'analyze for program' and 'analyze for grid'?

Analyze for Program will analyze your code and create a process flow of each step in the code. Analyze for Grid will assess and add the SCAPROC statements to your code for SAS Grid Computing.

 

If my program starts with two PROC SQL steps that are both pass-throughs, will SAS Grid Computing wait for each of these independent steps to complete before continuing to process the rest of the code?

When you analyze the SAS code using SCAPROC or submit by SASGSUB, the dependencies will be determined.

 

In the 1st SAS grid example in Base SAS, there are several independent procedures (DATA step, PROC FREQ, PROC MEANS, etc.), but there is only one sign on task in the example. Will this single sign on work for all the procedures running in parallel?

Yes, all those steps will run sequentially in the one grid session.

 

 

 
Comments
by New Contributor JanelleRolph
on ‎03-14-2018 12:58 PM

Thanks for sharing!
