BookmarkSubscribeRSS Feed
David_Billa
Rhodochrosite | Level 12

I want to add one User written code transformation as node 1 in all the DI jobs under one particular folder in DI Studio jobs. I don't want to touch all the DI jobs for adding one transformation as I have to redeploy all and test all the jobs.

 

Is there a way to make this approach easy by adding user written code transformation in one place and it should execute it for all the other jobs in same folder? 

 

User written code transformation contains few macro variables. Appreciate if some of you could shed some light on this.

13 REPLIES 13
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @David_Billa 

 

It would be very difficult to add a node to all jobs, I would say impossible. It could be done by adding the code as job-precode with data step metadata functions .

 

A usable approach would be to manually add the code  to a single job and use the metadata browser in Display Manager to map the associations and attributes, Then write a metadata extract to get ID's of all jobs in the folder, and write code to build similar structures for all jobs. But it would be a lot of work and highly dangerous, if you are not familiar with updating metadata programmatically. So unless you have hundreds of jobs, I wouldn't recommend it.

David_Billa
Rhodochrosite | Level 12

Job pre-code with metadata data functions don't require redeployment? 

Patrick
Opal | Level 21

@David_Billa 

Adding a node or pre-code to a DIS job however done is a change to the job and would required redeployment and regression testing.

 

If it's just about defining and populating a few macro variables then you could also do this in the autoexec (the usermods version) - but this then applies to all jobs using this server context.

 

Alternatively the system option  INITSTMT allows you to run SAS code right before the DIS deployed code executes. You would need to define a new SAS Data Step Batch Server in SAS Metadata and then assign all jobs in your folder to this new server. I believe you still would have to redeploy the jobs so that the new batch command line gets written to the deployment objects.

 

For the autoexec approach

If your jobs follow some naming convention or get stored in a separate deployment folder then you could also add some macro logic to the autoexec to only define the macro variables for this set of jobs (based on what gets passed to sysin).

 

Unless set in the .cfg the initstmt option has the disadvantage that the macro variables only get set when the jobs run in batch but not when executed out of DIS.

 

I normally define some INIT custom transformation which calls an autoexec macro and use this as first node in DIS jobs. Any later change then only requires a change to the macro definition without the need to change/redeploy any of the DIS jobs (and that's why the custom transformation only calls the macro but the code is kept outside in an autocall macro).

....you could have this of course also as job pre-code but I prefer to have all my code visible in nodes.

I don't really like using such an INIT node but it's often too hard to get permission to change the autoexec while the INIT node and autocall macro is always fully under my control.

 

 

 

  

David_Billa
Rhodochrosite | Level 12

Yes, I don't have permission to update the autoexec file. If you were in my situation, how will you tackle it? Which method you will prefer?

Patrick
Opal | Level 21

 


@David_Billa wrote:

Yes, I don't have permission to update the autoexec file. If you were in my situation, how will you tackle it? Which method you will prefer?


That depends on the details.

What's the purpose of creating these macro variables? Could they in any way impact on the current code and how it executes?

 

As a general approach I'd got for the INIT custom transformation and then limit regression testing to the minimum.

Unless there are hundreds of jobs requiring change I would add the transformation manually to the DIS jobs.

IF it's hundreds of jobs then I'd consider a coded change (writing to SAS Metadata) and then I'd likely go for adding a macro call to DIS job pre-code (as that's easier to code and a smaller change).

 

I've used the INITSTMT approach exactly once in a real project (TERMSTMT actually) where after we've already built and tested everything "someone" added a requirement for collection of job logging using already built macros. So here I've just added the macro call to the batch command using TERMSTMT.

 

David_Billa
Rhodochrosite | Level 12

Could you please provide one example using INIT custom transformation and INITSTMT approach? 

 

Macro variables in node 1 to denote about source data and to decide the library for target dataset. Values for the macro variables will come from scheduler tool. 

Patrick
Opal | Level 21

@David_Billa wrote:

Macro variables in node 1 to denote about source data and to decide the library for target dataset. Values for the macro variables will come from scheduler tool. 


Hmm... Are you getting macro variables or environment variables?

Do the values give you librefs or path-names used in a library statement? And if path-names does that mean you're defining the libnames in code or do you have metadata library definitions using macro variables as path-names?

 

Just guessing what you're dealing with but what I've done in the past when I needed some macro variables set during DIS development which then later on got set upstream and I had to ensure not to overwrite them (if set): I've implemented an autocall macro as below.

%macro init();
  %if not %symexist(mymvar) %then
    %do;
      %global myvar;
      %if %nrbquote(&myvar)=%nrbquote() %then
        %do;
          %let myvar=default value used during development;
        %end;
    %end;
%mend;

I then call the autocall macro in my DIS job (either as job pre-code or via a custom transformation:

%init();

With a reusable custom transformation above is the only bit of code you add there.

 

If upstream provides not macro variables but environment variables then just use the sysexist() and sysget() functions.

David_Billa
Rhodochrosite | Level 12

Thanks for your help. To answer your question, it's a macro variable.

 

One question which I have is where should I define the macro definition portion(%macro... %mend) ? In a pre-code? 

Patrick
Opal | Level 21

I would define it in a SAS Macro Autocall library as a .sas file (.sas file-name must be the same as the macro definition it contains; file-name all in lowercase). 

The valid folder locations are stored in option SASAUTOS. Folder .../<application server>/SASEnvironment/SASMacro is normally part of the SAS Autocall facility so store the .sas file there.

 

In the DIS job only have the call of the macro (pre-code or custom transformation). This allows you to amend the macro definition outside of the DIS jobs but whatever you change gets then immediately active in all the DIS jobs without the need to redeploy them.

Patrick
Opal | Level 21

And just as a variation of the above: You can also store the parameter definitions in a control table. This way you also don't need to change the macro code but you just update control data. This is often much easier to promote than a code change.

I'd implement the code creating the parameter table as an adhoc job in DIS with a single user written code node.

/* control library defined in SAS metadata */
libname ctrl "%sysfunc(pathname(work))";

/* parameter table; sample structure */
data ctrl.param;
  infile datalines truncover dlm='|';
  input name :$32. value :$100.;
  datalines;
mvar1|value1
mvar2|value2
mvar3|value3
;

/* macro definition processing parameter table 
   - store code in folder which is part of SAS Autocall Facility
   - .sas file with same name as macro definition and all in lowercase
     -> init.sas
*/
%macro init(paramTbl=ctrl.param);
  data _null_;
    set &paramTbl;
    if not symexist(strip(name)) then
      call symputx(name,value,'g');
    else if missing(symget(strip(name))) then
      call symputx(name,value,'g');
  run;
%mend;


/* call of macro in DIS code: %init()
   - as pre-code or via custom transformation
*/
/*%init()*/


/* test the macro call */
%let mvar1=preset value;
%let mvar2=;
%init()
%put &=mvar1;
%put &=mvar2;
%put &=mvar3;

 

65         %put &=mvar1;
MVAR1=preset value
66         %put &=mvar2;
MVAR2=value2
67         %put &=mvar3;
MVAR3=value3

 

David_Billa
Rhodochrosite | Level 12

It is not required to redeploy the job if we update pre-code in DIS job? 

Patrick
Opal | Level 21

@David_Billa wrote:

It is not required to redeploy the job if we update pre-code in DIS job? 


Yes, it is. But given that you can't change the autoexec or .cfg I can't think of any way to add such logic without the need of re-deploying the jobs. The only thing I can offer is an approach where you won't have to do this more than once even if you have to add more parameters later on - and a way of doing things in the future from start in a way so this doesn't happen to you again.

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @David_Billa 

 

The only way to avoid redeployment is to have the code executed before the job starts, but is requires changes to executing environment. It can be done in a macro specified as init-statement to the SAS command or in your batch server usermods-autoexec-file. In either case it will affect all jobs, not only the jobs under the particular folder, unless you define a new batch server, and in this case you must also find a way to have these particular jobs executed by the alternative batch server. That's quite a task for your SAS administrator and makes the whole setup more complicated to maintain, so I guess it would be difficult to get through with.

 

That leaves you with changing the jobs. Though it can be done programmatically, I think it is simpler and quicker to do it manually. The existence of precode is not obvious when a job is opened, so from a maintenance perspective, the "nice" way is to put a macro in your autocall library as @Patrick  suggests, create a user written transformation to call the macro, and manually add the transformation to all relevant jobs.

 

And the jobs must be redeployed. I recently had to redeploy more than 3000 jobs, but the SAS installation comes with a script to redeploy in batch. It is located in <your-SASHOME-path>/SASDataIntegrationStudioServerJARs/4.8/DeployJobs.sh. I used this call:

 

sh "<your-SASHOME-path>/SASDataIntegrationStudioServerJARs/4.8/DeployJobs.sh" -host "hhhh" -port "1234" -user "xxxx" -password "yyyy" -deploytype "redeploy" -objects "jobname1/jobname2/jobname3" -sourcedir "<your-batchjob-folder>" -deploymentdir "<your-batchjob-folder>" -metarepository "Foundation" -appservername "DIApp" -log "<logfile>"

 

I had all job names in a data set and used a program to write a new script with a long series of calls, each with max. 10 jobs in each call, because it didn't work properly with too many jobs in the same call. It took a couple of hours, but worked as expected.

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 13 replies
  • 1907 views
  • 5 likes
  • 3 in conversation