<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Parallel processing in EG isn't working as I expected... in SAS Enterprise Guide</title>
    <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Parallel-processing-in-EG-isn-t-working-as-I-expected/m-p/417634#M26892</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/4"&gt;@ChrisHemedinger&lt;/a&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;EG has the ability to manage multiple connections to a single "logical" SAS server (example, "SASApp").&amp;nbsp; It does this behind the scenes with the built-in Data Explorer feature, which submits multiple jobs to gather descriptive stats for a single data set.&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I'm not quite sure what this means?&amp;nbsp; Would multiple connections to a single logical SAS server run synchronously or asynchronously (parallel)?&amp;nbsp; In any case, for my processing, I'm not interested in descriptive stats for a single data set, unless those descriptive stats add intelligence so that downstream EG tasks cause those tasks to run in the correct environment.&lt;/SPAN&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;And then it also does this a bit more explicitly with the "parallel processing" flag for SAS programs.&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;This appears to do what I want.&amp;nbsp; When I run multiple programs in parallel, it's clear from the "green boxes" and the start/end times that the code runs asynchronously.&amp;nbsp; I assume this is spawning multiple SAS processes behind the scenes, analogous to MP_CONNECT but without all the explicit setup and process management (eg. 
explicit WAITFOR statements).&amp;nbsp; (So it's good that EG makes this easier than setting up connect scripts (which I don't control anyway), etc.)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;Your jobs would need to be created to work with a common shared work library -- different than the built-in WORK -- to share the transient data generated among the different programs.&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;In my "real code", this is the case; the code that runs in parallel is doing explicit passthrough to SQL Server, with both source and target tables residing in SQL Server.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;My issue is that:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;1) For a program entry that has overridden the project default of parallel processing,&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;2) So it runs in the "current" workspace server,&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;3) And creates a work dataset in the current workspace server, then&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;4) &lt;STRONG&gt;I would expect a task based on that dataset, such as Filter and Sort, to be smart enough to run in the current workspace server, instead of the project default parallel processing.&amp;nbsp; IMO that would be the user-friendly thing to do; be smart enough to know that the work dataset is on the current server, so run downstream code that uses that work dataset as a source table on the current server.&amp;nbsp;&amp;nbsp;Of course, others may disagree.&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;That's why SAS Enterprise Guide is also designed to work well with SAS Grid Computing, ...&amp;nbsp;&amp;nbsp;If you're looking for efficiency, I think that's the better 
option.&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;We don't have SAS Grid Computing installed.&amp;nbsp; I assume SAS Grid Computing significantly increases license costs?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Do those SAS Grid Computing macros work even if we only have a single SAS server, i.e. can I use them to programmatically execute program entries in parallel?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;If you're looking for simplicity -- but with parallel processing -- then you'll need to probably limit your scenarios to those that don't share WORK data and don't have too many parallel branches.&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Here is the use case that I wish EG (easily) supported.&amp;nbsp; It's based around common ETL practices, of which I'm sure SAS R&amp;amp;D are aware.&amp;nbsp; For those organizations that are too cheap to have a proper scheduler, and want to use EG for this sort of processing (even if just development), this could be a useful workaround.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Project Properties:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;By default NOT set up for parallel processing.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Process Flow:&amp;nbsp; "Setup"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Set SAS options, SASAUTOS setting, allocate libraries, etc.&amp;nbsp; Whatever global setup you need for your processing.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Process Flow:&amp;nbsp; "Extract"&lt;BR /&gt;Since there are no dependencies on the extract processing, I want this entire process flow to run in parallel.&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;So I RMB the Process Flow properties (which IMO are currently pretty useless), and select the 
mythical "Run this &lt;STRONG&gt;process flow&lt;/STRONG&gt; in parallel" option.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I do my delta extract (say any rows changed in the last week) from source to staging tables.&amp;nbsp; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;In my RDBMS environment, this would be explicit passthrough to the RDBMS, so SAS is just a client submitting code.&amp;nbsp; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;In a SAS environment, this would extract SAS data from SAS source datasets to SAS staging datasets.&amp;nbsp; They would have to be permanent SAS datasets since the code is running in parallel.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I wish EG could "inject" one or more program entries and/or the process flow (i.e. "Setup") into the parallel jobs.&amp;nbsp; A bit analogous to the Autoexec process flow option, but for parallel processing.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Alternatively, use the Link functionality to have a setup program that links to all the downstream parallel program.&amp;nbsp; So you'd need the ability to have multiple links downstream from the setup program.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;A workaround is to save the "Setup" code into one or more external files, and %include those files in the parallel extract programs.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Process Flow:&amp;nbsp; "Transform"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;This will have process order dependencies, so I need this code to run synchronously.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;For example, I first prepare all the dimension tables, then prepare the fact tables with the surrogate keys / foreign keys from the dim tables.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Process Flow:&amp;nbsp; "Load"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;There are no dependencies between tables, so I want to run these programs in 
parallel.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I do a similar setup as per the "Extract" process flow.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 01 Dec 2017 01:19:48 GMT</pubDate>
    <dc:creator>ScottBass</dc:creator>
    <dc:date>2017-12-01T01:19:48Z</dc:date>
    <item>
      <title>Parallel processing in EG isn't working as I expected...</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Parallel-processing-in-EG-isn-t-working-as-I-expected/m-p/416930#M26837</link>
      <description>&lt;P&gt;...that doesn't mean my expectations are correct!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Do this in an EG project:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Process Flow A&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp;Program 1&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; data work.class;set sashelp.class;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp;Program 2&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; data work.shoes;set sashelp.shoes;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In File --&amp;gt; Project Properties --&amp;gt; Code Submission, tick Allow parallel execution on the same server&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Run the project.&amp;nbsp; Because these datasets are so small, it may not be evident that the two executions ran in parallel.&lt;/P&gt;&lt;P&gt;But the 1400+ lines of clutter in the log should give some indication (would option nosource be good here?)&lt;/P&gt;&lt;P&gt;Also, EG doesn't try to display the datasets in a dataviewer window, another indicator that it ran on a separate workspace server.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is a trivial example; in my "real code" I'm running about 30 program entries to populate permanent SQL Server tables, where there are no dependencies between the program entries.&amp;nbsp; The parallel processing really improves processing time.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But I then want to run additional, "normal" processing after the parallel processing.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So, in the same EG project:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;New Process Flow&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Process Flow B&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp;Program 3&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; data work.stocks;set sashelp.stocks;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In the Program properties --&amp;gt; Code Submission, tick the 
"Customize code submission options", and leave both checkboxes unticked.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So this program entry runs on the "default" workspace server.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If you run this, there is no clutter in the log, and work.stocks displays in the dataviewer window.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Now from the dataviewer window, Filter and Sort, All Variables, Filter Stock = IBM&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This code fails.&amp;nbsp; It appears to be running on a new server in accordance with the project properties, so work.stocks does not exist in that server.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Questions:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1) Should I expect Filter and Sort to obey the program properties, since the source dataset was created on the default workspace server, not via parallel processing?&lt;/P&gt;&lt;P&gt;2)&amp;nbsp;What is the best way to configure Process Flow A to run all contained programs in parallel, but all other Process Flows sequentially?&amp;nbsp; (Please don't say I have to RMB the 30 program entries in Process Flow A and configure for parallel execution).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'll wait for feedback on my expectations before I create a SASWare Ballot entry to extend parallel processing configuration to the Process Flow level, in addition to the Project and individual program entry levels.&lt;/P&gt;</description>
      <pubDate>Wed, 29 Nov 2017 01:42:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Parallel-processing-in-EG-isn-t-working-as-I-expected/m-p/416930#M26837</guid>
      <dc:creator>ScottBass</dc:creator>
      <dc:date>2017-11-29T01:42:39Z</dc:date>
    </item>
    <item>
      <title>Re: Parallel processing in EG isn't working as I expected...</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Parallel-processing-in-EG-isn-t-working-as-I-expected/m-p/417033#M26848</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/15043"&gt;@ScottBass&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;EG has the ability to manage multiple connections to a single "logical" SAS server (example, "SASApp").&amp;nbsp; It does this behind the scenes with the built-in Data Explorer feature, which submits multiple jobs to gather descriptive stats for a single data set.&amp;nbsp; And then it also does this a bit more explicitly with the "parallel processing" flag for SAS programs.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, there are pitfalls. WORK data sets are one issue, since each SAS workspace has its own WORK/temp space.&amp;nbsp; Your jobs would need to be created to work with a common shared work library -- different than the built-in WORK -- to share the transient data generated among the different programs.&lt;/P&gt;
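&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A minimal sketch of the shared-library idea, offered as an illustration rather than a recommended setup.&amp;nbsp; The libref SHARED and the path /data/eg_shared are hypothetical; substitute a directory that every workspace server session can read and write.&amp;nbsp; Each parallel program would need to issue its own LIBNAME statement, since library assignments are per session.&lt;/P&gt;
&lt;PRE&gt;/* Hypothetical libref and path: point this at storage visible to all sessions */
libname shared "/data/eg_shared";

/* A program running in one workspace writes to SHARED instead of WORK */
data shared.class;
  set sashelp.class;
run;

/* A program in another workspace (with the same LIBNAME) can then read it */
proc print data=shared.class;
run;&lt;/PRE&gt;
&lt;P&gt;Unlike WORK, a library like this is not cleaned up when the session ends, so the transient tables would have to be deleted explicitly.&lt;/P&gt;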
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Another "gotcha" is something your admin might notice before you do: multiple SAS workspaces in use by a single EG session.&amp;nbsp; If entire teams work this way, that could be a lot of workspaces -- the management of which the admin can't control centrally.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That's why SAS Enterprise Guide is also designed to work well with SAS Grid Computing, which provides the benefits of parallel processing in a way that a SAS admin can manage as well.&amp;nbsp; If you use SAS Grid functions/macros within your SAS program, you can tell the SAS Grid which sections of code can run at the same time.&amp;nbsp; SAS Enterprise Guide even provides a code analyzer that can annotate your program with these directives, making it easier to build a job that leverages the grid for parallel processing.&lt;/P&gt;
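&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As a rough sketch of what grid-enabled code can look like, assuming SAS Grid Manager and SAS/CONNECT are licensed and that the logical server is named SASApp (an assumption carried over from the example name above).&amp;nbsp; The task names task1 and task2 are hypothetical, and the exact options will depend on the site configuration.&lt;/P&gt;
&lt;PRE&gt;/* Assumption: grid services are available; SASApp is the logical server name */
%let rc = %sysfunc(grdsvc_enable(_all_, server=SASApp));

/* Start two remote sessions and submit independent steps without waiting */
signon task1;
rsubmit task1 wait=no;
  data work.class; set sashelp.class; run;
endrsubmit;

signon task2;
rsubmit task2 wait=no;
  data work.shoes; set sashelp.shoes; run;
endrsubmit;

/* Block until both remote tasks finish, then end the sessions */
waitfor _all_ task1 task2;
signoff _all_;&lt;/PRE&gt;
&lt;P&gt;Each remote session still has its own WORK library, so the data sets above exist only inside task1 and task2; results that need to outlive the sessions would go to a permanent library.&lt;/P&gt;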
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you're looking for efficiency, I think that's the better option.&amp;nbsp; If you're looking for simplicity -- but with parallel processing -- then you'll need to probably limit your scenarios to those that don't share WORK data and don't have too many parallel branches.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;BTW, before parallel processing was an option in EG, many users simply opened multiple EG sessions and worked that way.&amp;nbsp; Too many of those can get confusing, but that method still works too.&lt;/P&gt;</description>
      <pubDate>Wed, 29 Nov 2017 12:54:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Parallel-processing-in-EG-isn-t-working-as-I-expected/m-p/417033#M26848</guid>
      <dc:creator>ChrisHemedinger</dc:creator>
      <dc:date>2017-11-29T12:54:14Z</dc:date>
    </item>
    <item>
      <title>Re: Parallel processing in EG isn't working as I expected...</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Parallel-processing-in-EG-isn-t-working-as-I-expected/m-p/417634#M26892</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/4"&gt;@ChrisHemedinger&lt;/a&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;EG has the ability to manage multiple connections to a single "logical" SAS server (example, "SASApp").&amp;nbsp; It does this behind the scenes with the built-in Data Explorer feature, which submits multiple jobs to gather descriptive stats for a single data set.&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I'm not quite sure what this means?&amp;nbsp; Would multiple connections to a single logical SAS server run synchronously or asynchronously (parallel)?&amp;nbsp; In any case, for my processing, I'm not interested in descriptive stats for a single data set, unless those descriptive stats add intelligence so that downstream EG tasks cause those tasks to run in the correct environment.&lt;/SPAN&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;And then it also does this a bit more explicitly with the "parallel processing" flag for SAS programs.&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;This appears to do what I want.&amp;nbsp; When I run multiple programs in parallel, it's clear from the "green boxes" and the start/end times that the code runs asynchronously.&amp;nbsp; I assume this is spawning multiple SAS processes behind the scenes, analogous to MP_CONNECT but without all the explicit setup and process management (eg. 
explicit WAITFOR statements).&amp;nbsp; (So it's good that EG makes this easier than setting up connect scripts (which I don't control anyway), etc.)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;Your jobs would need to be created to work with a common shared work library -- different than the built-in WORK -- to share the transient data generated among the different programs.&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;In my "real code", this is the case; the code that runs in parallel is doing explicit passthrough to SQL Server, with both source and target tables residing in SQL Server.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;My issue is that:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;1) For a program entry that has overridden the project default of parallel processing,&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;2) So it runs in the "current" workspace server,&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;3) And creates a work dataset in the current workspace server, then&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;4) &lt;STRONG&gt;I would expect a task based on that dataset, such as Filter and Sort, to be smart enough to run in the current workspace server, instead of the project default parallel processing.&amp;nbsp; IMO that would be the user-friendly thing to do; be smart enough to know that the work dataset is on the current server, so run downstream code that uses that work dataset as a source table on the current server.&amp;nbsp;&amp;nbsp;Of course, others may disagree.&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;That's why SAS Enterprise Guide is also designed to work well with SAS Grid Computing, ...&amp;nbsp;&amp;nbsp;If you're looking for efficiency, I think that's the better 
option.&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;We don't have SAS Grid Computing installed.&amp;nbsp; I assume SAS Grid Computing significantly increases license costs?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Do those SAS Grid Computing macros work even if we only have a single SAS server, i.e. can I use them to programmatically execute program entries in parallel?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;&lt;SPAN&gt;If you're looking for simplicity -- but with parallel processing -- then you'll need to probably limit your scenarios to those that don't share WORK data and don't have too many parallel branches.&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Here is the use case that I wish EG (easily) supported.&amp;nbsp; It's based around common ETL practices, of which I'm sure SAS R&amp;amp;D are aware.&amp;nbsp; For those organizations that are too cheap to have a proper scheduler, and want to use EG for this sort of processing (even if just development), this could be a useful workaround.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Project Properties:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;By default NOT set up for parallel processing.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Process Flow:&amp;nbsp; "Setup"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Set SAS options, SASAUTOS setting, allocate libraries, etc.&amp;nbsp; Whatever global setup you need for your processing.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Process Flow:&amp;nbsp; "Extract"&lt;BR /&gt;Since there are no dependencies on the extract processing, I want this entire process flow to run in parallel.&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;So I RMB the Process Flow properties (which IMO are currently pretty useless), and select the 
mythical "Run this &lt;STRONG&gt;process flow&lt;/STRONG&gt; in parallel" option.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I do my delta extract (say any rows changed in the last week) from source to staging tables.&amp;nbsp; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;In my RDBMS environment, this would be explicit passthrough to the RDBMS, so SAS is just a client submitting code.&amp;nbsp; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;In a SAS environment, this would extract SAS data from SAS source datasets to SAS staging datasets.&amp;nbsp; They would have to be permanent SAS datasets since the code is running in parallel.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I wish EG could "inject" one or more program entries and/or the process flow (i.e. "Setup") into the parallel jobs.&amp;nbsp; A bit analogous to the Autoexec process flow option, but for parallel processing.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Alternatively, use the Link functionality to have a setup program that links to all the downstream parallel program.&amp;nbsp; So you'd need the ability to have multiple links downstream from the setup program.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;A workaround is to save the "Setup" code into one or more external files, and %include those files in the parallel extract programs.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Process Flow:&amp;nbsp; "Transform"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;This will have process order dependencies, so I need this code to run synchronously.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;For example, I first prepare all the dimension tables, then prepare the fact tables with the surrogate keys / foreign keys from the dim tables.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Process Flow:&amp;nbsp; "Load"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;There are no dependencies between tables, so I want to run these programs in 
parallel.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I do a similar setup as per the "Extract" process flow.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 01 Dec 2017 01:19:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Parallel-processing-in-EG-isn-t-working-as-I-expected/m-p/417634#M26892</guid>
      <dc:creator>ScottBass</dc:creator>
      <dc:date>2017-12-01T01:19:48Z</dc:date>
    </item>
  </channel>
</rss>

