Hi All,
Sorry for the late post this week.
Version 9.2 introduced several new features for the User Written Code Transformation and Generated Transformations. Among the highlights are:
User Written Code Transformations may have 0:M inputs and outputs. Prior to version 9.2, the UWC transformation always had 1 input and 1 output (whether you wanted them or not;)). As of version 9.2, you can specify as many or as few inputs and outputs as you wish for the UWC transformation.
A new prompting system was added to Generated Transformations. This allows you to build cascading prompts and to retreive your prompt values from a SAS table or a database, among many flexible features.
Information on the contents of the Mapping tab are passed to both the User Written Code transformation and Generated Transformations.
This last item is the focus of this article.
In previous versions of DI Studio, both the UWC transformation and Generated Transformations had a mapping tab as they do now. Prior to 9.2, the DI Studio developer had no way to access the information in the mapping tab, which the exception of the field list, which was surfaced in the "&keep" macro variable.
The presence of the mapping tab for the UWC transformation and Generated Transformations caused confusion at many customer sites I visted. I frequently had to explain to the customer that my custom code didn't know or care about the Mapping tab. "Oh, except that you have to make sure you create the right output fields!. And don't worry about about those mapping lines here - I can't see them in the custom code!" With every new project and each new developer on the same project, the pattern repeated.
Fortunately SAS has heard from our users and delivered the ability to see the properties in the Mapping tab from user written code! This means that users of Generated Transformations you create can put field mappings and expressions in the Mapping tab just like they do with other transformations and the Mapping tab is no longer just for show;). The same is true for the UWC Transformation.
If you look at the default code for a UWC transform or a Generated Transform, you'll see something like the following (ellipses added for brevity):
[pre]
%let _INPUT_count = 2;
%let _INPUT1 = sasstg.PRODUCTS;
%let _INPUT1_connect = ;
%let _INPUT1_engine = BASE;
%let _INPUT1_memtype = DATA;
%let _INPUT1_options = %nrquote();
%let _INPUT1_alter = %nrquote();
%let _INPUT1_path = /User Folders/tistea/My Folder/DI_Forum_Posts/Expressions_To_UGT/PRODUCTS(Table);
%let _INPUT1_type = 1;
%let _INPUT1_label = %nrquote();
...
%let _OUTPUT_count = 1;
%let _OUTPUT1 = sasstg.PROD_SALE;
%let _OUTPUT1_connect = ;
%let _OUTPUT1_engine = BASE;
%let _OUTPUT1_memtype = DATA;
%let _OUTPUT1_options = %nrquote();
%let _OUTPUT1_alter = %nrquote();
%let _OUTPUT1_path = /User Folders/tistea/My Folder/DI_Forum_Posts
...
/Expressions_To_UGT/PROD_SALE(Table);
%let _OUTPUT1_type = 1;
%let _OUTPUT1_label = %nrquote();
/* List of target columns to keep */
%let _OUTPUT1_keep = PROD_SUBCATEGORY PROD_CATEGORY QUANTITY_SOLD AMOUNT_SOLD;
%let _OUTPUT1_col_count = 4;
%let _OUTPUT1_col0_name = PROD_SUBCATEGORY;
%let _OUTPUT1_col0_table = sasstg.PROD_SALE;
%let _OUTPUT1_col0_length = 50;
%let _OUTPUT1_col0_type = $;
%let _OUTPUT1_col0_format = $50.;
%let _OUTPUT1_col0_informat = $50.;
%let _OUTPUT1_col0_label = %nrquote(PROD_SUBCATEGORY);
%let _OUTPUT1_col0_input0 = PROD_SUBCATEGORY;
%let _OUTPUT1_col0_exp = UPCASE(PROD_SUBCATEGORY);
%let _OUTPUT1_col0_input = PROD_SUBCATEGORY;
%let _OUTPUT1_col0_input_count = 1;
%let _OUTPUT1_col1_name = PROD_CATEGORY;
%let _OUTPUT1_col1_table = sasstg.PROD_SALE;
%let _OUTPUT1_col1_length = 50;
%let _OUTPUT1_col1_type = $;
%let _OUTPUT1_col1_format = $50.;
%let _OUTPUT1_col1_informat = $50.;
%let _OUTPUT1_col1_label = %nrquote(PROD_CATEGORY);
%let _OUTPUT1_col1_input0 = PROD_CATEGORY;
%let _OUTPUT1_col1_exp = ;
%let _OUTPUT1_col1_input = PROD_CATEGORY;
%let _OUTPUT1_col1_input_count = 1;
...
[/pre]
That's a lot of code and it contains several different types of information:
The number of input tables for the transformation.
The number of output tables for the transformation.
The two level (library.table) name for each input and output table.
The number of output columns for each output table.
The type, length, informat, format and label for each output column.
Mapping information for each column. If the mapping is simply a column name, the mapped column name is available in the _OUTPUT(X)_col(y)_input macro variable. If an expression was specified, the full expression is available in the _OUTPUT(X)_col(y)_exp macro variable.
Using this information, one can completely parameterize the creation of columns in output tables and the basic mappings and expressions supplied in the Mapping tab. The question is, how does one use all of these macro variables to parameterize column creation and mapping? The answer, is of course, macro code. But not just any macro code. We're talking about the fun kind with 3, perhaps 4 variable dereferences like &&&my_var_&i. Fun stuff! Fortunately, I've decided to write a macro to handle this for Data Step based transformations and have attached it to this article.
Here is the macro definition:
%macro dataStepStmts(outputNbr,mode);
The macro can be used the generate LENGTH, INFORMAT, FORMAT, LABEL or mapping statements. The first parameter indicates which output table (numbered 1- ) the statements are being generated for. The second parameter - mode - indicates the type of statements to create and can take on the following values:
LENGTH
INFORMAT
FORMAT
LABEL
MAP
I hope the result of each possible value for the "mode" parameter is self explanatory.
To create a complete table definition and perform mappings with this macro, one must call it several times with several different modes. The DATA Step below shows an example usage:
[pre]
data &_output1;
/* Create length statements */
%dataStepStmts(1,LENGTH);
/* Create informat statements */
%dataStepStmts(1,INFORMAT);
/* Create format statements */
%dataStepStmts(1,FORMAT);
/* Create label statements */
%dataStepStmts(1,LABEL);
merge &_input1(in=in1)
&_input2(in=in2);
by PROD_ID;
if in1 and in2;
/* Process mappings */
%dataStepStmts(1,MAP);
run;
[/pre]
The program above is a simple merge of two tables, where records should only be outputted if there is a join between the tables. Notice that the %dataStepStmts macro is invoked several times, one each for LENGTH, INFORMAT, FORMAT, LABEL and column mapping. "Normal" code is interspersed with the macro code to implement the business logic of the transformation.
To use this macro, simply place the file in the "SASEnvironment\SASMacro" folder for your application server context on the same machine as teh Workspace Server so that the macro is loaded as an autocall macro. On my Windows machine, I placed the file at "C:\SAS\Config\Lev1\SASApp\SASEnvironment\SASMacro", but this is installation dependent.
So, there you have it - a useful Mapping tab for User Written Code transformations and Generated Transformations. Now I can finally quit explaining which parts of the Mapping tab you should use and which to ignore - just use all of it - whew!
Feel free to use the attached macro for your projects. I plan to post a similar macro for PROC SQL based custom code transformations next week.
Thanks,
Tim Stearn
... View more