BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jwhite
Quartz | Level 8

I've got a question on creating a 'New Transformation' within DI Studio.

 

I'm writing a bit of code that will do an analysis on two source columns to produce a value on a new target column. For example:

 

proc sql;
    update &SYSLAST
        set column_flag =
            case when column_a - column_b = 0 then 'Y'
                 else 'N'
       end;
quit;
data &_OUTPUT;
    set &_INPUT;
run;

I've created 'column_flag' as a target column in the mappings, but when running the node I get an error saying that 'column_flag' isn't in the source work fil. Now, I can go back and put it as a source column, but it wouldn't make sense as I'd like this column to only exist after this transformation does its analysis.

 

Am I missing somethig in the 'New Transformation' options that release the requirement of the target column to also be in the source column?

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
UCFAngel
Obsidian | Level 7

You could use an ALTER TABLE statement in your PROC SQL and ADD the new column to the &SYSLAST table before you update it.

 

PROC SQL;

     ALTER TABLE &SYSLAST

          ADD COLUMN_FLAG CHAR(1) FORMAT=$1.;

    UPDATE &SYSLAST

         . . . your update code . . .

QUIT;

View solution in original post

8 REPLIES 8
LinusH
Tourmaline | Level 20
Typically you don't update source tables on transformations. You let the data flow from sources to targets (of course there are exceptions to this rule).
Not knowing the purpose of the whole job, but I can't see why this couldn't be done I'm the standard transformation Extract. Then you would omit the inconsistency of source vs target columns in your current approach.
Data never sleeps
jwhite
Quartz | Level 8

Originally, this was done in an extract where, indeed, it worked fine, but we need to now have this logic placed in many jobs going forward. The idea was to create a new transformation with this logic that we could simply incorporate into any job. Obvious benefit being if the logic ever changes, we only change it in one place. Plus, we prevent potential developer inconsistencies on this piece.

 

Maybe a new transformation isn't the way to go. Is there a parameter or some variable we could build and incorporate in the expressions? Another approach?

 

Basically, we just want to do some simple calcualtion on a year and set a flag if it is the most recent year or not. Should be simple, but don't want to repeat hardcoded logic all over the warehouse.

UCFAngel
Obsidian | Level 7

You could use an ALTER TABLE statement in your PROC SQL and ADD the new column to the &SYSLAST table before you update it.

 

PROC SQL;

     ALTER TABLE &SYSLAST

          ADD COLUMN_FLAG CHAR(1) FORMAT=$1.;

    UPDATE &SYSLAST

         . . . your update code . . .

QUIT;

LinusH
Tourmaline | Level 20
I think that we could a nice long chat about the principle of doing such column transformation in many jobs. But answering your question, I think a generated transformation is one way to go.
On thing that you can't do (but it would be very nice if you could) is to enforce output columns from the transformation. So you need to have this as parameters and instruct your developers to create a flag output column from the transformation, and chose that as parameter, along what columns to use in calculation (column_a and column_b in your example).

Another option is to use an in line macro. One drawback with this approach is that you don't see the transformation in Metadata or in the deployed code.
Data never sleeps
jwhite
Quartz | Level 8

Sorry for the delayed response here. I was out on vacation for my wife's 40th.

 

Anyway, the Alter Table statement does the trick, however I've got one other question.

 

In order to do this, not only do I need to use the Alter Table statement, but I also need to actually create the column in the target side of the mapping. So, beyond using '

ALTER TABLE &SYSLAST

          ADD COLUMN_FLAG CHAR(1) FORMAT=$1.;'

I need to physically create a column called 'COLUMN_FLAG' and set the character (1) in order for this to work.

 

The question is:

1a) Is there a way for this code to work within the node that doesn't also require the developer to create the same column in the target?

1b) Even if there is...I would think that would mean that if a developer were to look at the columns in the mapping they could make the mistake of not being aware that this column is there since it's only generated in the code.

 

It may be a mute point though as the nature of DI Studio may require that column to be in the mapping in order for it to be used, thus we have to create it. Well, that's been my experience so far.

 

Thanks!

 

 

UCFAngel
Obsidian | Level 7

I am not aware of a way to create and then update the column in code without the column existing in the Mappings for the Target. I have created the column in the Target but it isn't mapped to any existing field in the Source. For subsequent DI jobs that needed this column in the Target I am able to import it from my warehouse table that contains the field.

 

In my process I am adding and updating the field in a User Written Code node that is in the middle of the DI process. In order for the column to be loaded into my final Target table I did have to Alter that table to add the column and then Update Metadata so that it would appear in the final Target mapping in the Loader object.

 

The benefit to using the ALTER TABLE statement with my WORK table is that the values have to be updated using a specific algorithm which I can use in the UPDATE statement so that when the process gets to the Loader node the field has the appropriate values.

jwhite
Quartz | Level 8

Thanks, USFAngel, that's the same behavior I'm seeing as well.

Riteshdell
Quartz | Level 8

Before writing the user written code, Just in mapping , create a column, name column_flag, and connect your both source tables in user written and map both columns in column_flag.
Then in column_flag expression, you can write this expression:

 

case when column_a - column_b = 0 then 'Y'
                 else 'N'
       end;

 Hope, this will work out....

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 2199 views
  • 0 likes
  • 4 in conversation