SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

'New Transformation' Question

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 89
Accepted Solution

'New Transformation' Question

I've got a question on creating a 'New Transformation' within DI Studio.

 

I'm writing a bit of code that will do an analysis on two source columns to produce a value on a new target column. For example:

 

proc sql;
    update &SYSLAST
        set column_flag =
            case when column_a - column_b = 0 then 'Y'
                 else 'N'
       end;
quit;
data &_OUTPUT;
    set &_INPUT;
run;

I've created 'column_flag' as a target column in the mappings, but when running the node I get an error saying that 'column_flag' isn't in the source work fil. Now, I can go back and put it as a source column, but it wouldn't make sense as I'd like this column to only exist after this transformation does its analysis.

 

Am I missing somethig in the 'New Transformation' options that release the requirement of the target column to also be in the source column?

 

Thanks!


Accepted Solutions
Solution
‎05-23-2016 01:13 PM
Contributor
Posts: 53

Re: 'New Transformation' Question

You could use an ALTER TABLE statement in your PROC SQL and ADD the new column to the &SYSLAST table before you update it.

 

PROC SQL;

     ALTER TABLE &SYSLAST

          ADD COLUMN_FLAG CHAR(1) FORMAT=$1.;

    UPDATE &SYSLAST

         . . . your update code . . .

QUIT;

View solution in original post


All Replies
Super User
Posts: 5,426

Re: 'New Transformation' Question

Typically you don't update source tables on transformations. You let the data flow from sources to targets (of course there are exceptions to this rule).
Not knowing the purpose of the whole job, but I can't see why this couldn't be done I'm the standard transformation Extract. Then you would omit the inconsistency of source vs target columns in your current approach.
Data never sleeps
Frequent Contributor
Posts: 89

Re: 'New Transformation' Question

Originally, this was done in an extract where, indeed, it worked fine, but we need to now have this logic placed in many jobs going forward. The idea was to create a new transformation with this logic that we could simply incorporate into any job. Obvious benefit being if the logic ever changes, we only change it in one place. Plus, we prevent potential developer inconsistencies on this piece.

 

Maybe a new transformation isn't the way to go. Is there a parameter or some variable we could build and incorporate in the expressions? Another approach?

 

Basically, we just want to do some simple calcualtion on a year and set a flag if it is the most recent year or not. Should be simple, but don't want to repeat hardcoded logic all over the warehouse.

Solution
‎05-23-2016 01:13 PM
Contributor
Posts: 53

Re: 'New Transformation' Question

You could use an ALTER TABLE statement in your PROC SQL and ADD the new column to the &SYSLAST table before you update it.

 

PROC SQL;

     ALTER TABLE &SYSLAST

          ADD COLUMN_FLAG CHAR(1) FORMAT=$1.;

    UPDATE &SYSLAST

         . . . your update code . . .

QUIT;

Super User
Posts: 5,426

Re: 'New Transformation' Question

I think that we could a nice long chat about the principle of doing such column transformation in many jobs. But answering your question, I think a generated transformation is one way to go.
On thing that you can't do (but it would be very nice if you could) is to enforce output columns from the transformation. So you need to have this as parameters and instruct your developers to create a flag output column from the transformation, and chose that as parameter, along what columns to use in calculation (column_a and column_b in your example).

Another option is to use an in line macro. One drawback with this approach is that you don't see the transformation in Metadata or in the deployed code.
Data never sleeps
Frequent Contributor
Posts: 89

Re: 'New Transformation' Question

Sorry for the delayed response here. I was out on vacation for my wife's 40th.

 

Anyway, the Alter Table statement does the trick, however I've got one other question.

 

In order to do this, not only do I need to use the Alter Table statement, but I also need to actually create the column in the target side of the mapping. So, beyond using '

ALTER TABLE &SYSLAST

          ADD COLUMN_FLAG CHAR(1) FORMAT=$1.;'

I need to physically create a column called 'COLUMN_FLAG' and set the character (1) in order for this to work.

 

The question is:

1a) Is there a way for this code to work within the node that doesn't also require the developer to create the same column in the target?

1b) Even if there is...I would think that would mean that if a developer were to look at the columns in the mapping they could make the mistake of not being aware that this column is there since it's only generated in the code.

 

It may be a mute point though as the nature of DI Studio may require that column to be in the mapping in order for it to be used, thus we have to create it. Well, that's been my experience so far.

 

Thanks!

 

 

Contributor
Posts: 53

Re: 'New Transformation' Question

I am not aware of a way to create and then update the column in code without the column existing in the Mappings for the Target. I have created the column in the Target but it isn't mapped to any existing field in the Source. For subsequent DI jobs that needed this column in the Target I am able to import it from my warehouse table that contains the field.

 

In my process I am adding and updating the field in a User Written Code node that is in the middle of the DI process. In order for the column to be loaded into my final Target table I did have to Alter that table to add the column and then Update Metadata so that it would appear in the final Target mapping in the Loader object.

 

The benefit to using the ALTER TABLE statement with my WORK table is that the values have to be updated using a specific algorithm which I can use in the UPDATE statement so that when the process gets to the Loader node the field has the appropriate values.

Frequent Contributor
Posts: 89

Re: 'New Transformation' Question

Thanks, USFAngel, that's the same behavior I'm seeing as well.

Contributor
Posts: 68

Re: 'New Transformation' Question

Before writing the user written code, Just in mapping , create a column, name column_flag, and connect your both source tables in user written and map both columns in column_flag.
Then in column_flag expression, you can write this expression:

 

case when column_a - column_b = 0 then 'Y'
                 else 'N'
       end;

 Hope, this will work out....

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 1299 views
  • 0 likes
  • 4 in conversation