BookmarkSubscribeRSS Feed
Hi All,



Sorry for the late post this week.



Version 9.2 introduced several new features for the User Written Code Transformation and Generated Transformations. Among the highlights are:

  1. User Written Code Transformations may have 0:M inputs and outputs. Prior to version 9.2, the UWC transformation always had 1 input and 1 output (whether you wanted them or not;)). As of version 9.2, you can specify as many or as few inputs and outputs as you wish for the UWC transformation.
  2. A new prompting system was added to Generated Transformations. This allows you to build cascading prompts and to retreive your prompt values from a SAS table or a database, among many flexible features.
  3. Information on the contents of the Mapping tab are passed to both the User Written Code transformation and Generated Transformations.


This last item is the focus of this article.



In previous versions of DI Studio, both the UWC transformation and Generated Transformations had a mapping tab as they do now. Prior to 9.2, the DI Studio developer had no way to access the information in the mapping tab, which the exception of the field list, which was surfaced in the "&keep" macro variable.



The presence of the mapping tab for the UWC transformation and Generated Transformations caused confusion at many customer sites I visted. I frequently had to explain to the customer that my custom code didn't know or care about the Mapping tab. "Oh, except that you have to make sure you create the right output fields!. And don't worry about about those mapping lines here - I can't see them in the custom code!" With every new project and each new developer on the same project, the pattern repeated.



Fortunately SAS has heard from our users and delivered the ability to see the properties in the Mapping tab from user written code! This means that users of Generated Transformations you create can put field mappings and expressions in the Mapping tab just like they do with other transformations and the Mapping tab is no longer just for show;). The same is true for the UWC Transformation.



If you look at the default code for a UWC transform or a Generated Transform, you'll see something like the following (ellipses added for brevity):
[pre]
%let _INPUT_count = 2;
%let _INPUT1 = sasstg.PRODUCTS;
%let _INPUT1_connect = ;
%let _INPUT1_engine = BASE;
%let _INPUT1_memtype = DATA;
%let _INPUT1_options = %nrquote();
%let _INPUT1_alter = %nrquote();
%let _INPUT1_path = /User Folders/tistea/My Folder/DI_Forum_Posts/Expressions_To_UGT/PRODUCTS(Table);
%let _INPUT1_type = 1;
%let _INPUT1_label = %nrquote();
...
%let _OUTPUT_count = 1;
%let _OUTPUT1 = sasstg.PROD_SALE;
%let _OUTPUT1_connect = ;
%let _OUTPUT1_engine = BASE;
%let _OUTPUT1_memtype = DATA;
%let _OUTPUT1_options = %nrquote();
%let _OUTPUT1_alter = %nrquote();
%let _OUTPUT1_path = /User Folders/tistea/My Folder/DI_Forum_Posts
...
/Expressions_To_UGT/PROD_SALE(Table);
%let _OUTPUT1_type = 1;
%let _OUTPUT1_label = %nrquote();
/* List of target columns to keep */
%let _OUTPUT1_keep = PROD_SUBCATEGORY PROD_CATEGORY QUANTITY_SOLD AMOUNT_SOLD;
%let _OUTPUT1_col_count = 4;
%let _OUTPUT1_col0_name = PROD_SUBCATEGORY;
%let _OUTPUT1_col0_table = sasstg.PROD_SALE;
%let _OUTPUT1_col0_length = 50;
%let _OUTPUT1_col0_type = $;
%let _OUTPUT1_col0_format = $50.;
%let _OUTPUT1_col0_informat = $50.;
%let _OUTPUT1_col0_label = %nrquote(PROD_SUBCATEGORY);
%let _OUTPUT1_col0_input0 = PROD_SUBCATEGORY;
%let _OUTPUT1_col0_exp = UPCASE(PROD_SUBCATEGORY);
%let _OUTPUT1_col0_input = PROD_SUBCATEGORY;
%let _OUTPUT1_col0_input_count = 1;
%let _OUTPUT1_col1_name = PROD_CATEGORY;
%let _OUTPUT1_col1_table = sasstg.PROD_SALE;
%let _OUTPUT1_col1_length = 50;
%let _OUTPUT1_col1_type = $;
%let _OUTPUT1_col1_format = $50.;
%let _OUTPUT1_col1_informat = $50.;
%let _OUTPUT1_col1_label = %nrquote(PROD_CATEGORY);
%let _OUTPUT1_col1_input0 = PROD_CATEGORY;
%let _OUTPUT1_col1_exp = ;
%let _OUTPUT1_col1_input = PROD_CATEGORY;
%let _OUTPUT1_col1_input_count = 1;
...
[/pre]

That's a lot of code and it contains several different types of information:

  1. The number of input tables for the transformation.
  2. The number of output tables for the transformation.
  3. The two level (library.table) name for each input and output table.
  4. The number of output columns for each output table.
  5. The type, length, informat, format and label for each output column.
  6. Mapping information for each column. If the mapping is simply a column name, the mapped column name is available in the _OUTPUT(X)_col(y)_input macro variable. If an expression was specified, the full expression is available in the _OUTPUT(X)_col(y)_exp macro variable.


Using this information, one can completely parameterize the creation of columns in output tables and the basic mappings and expressions supplied in the Mapping tab. The question is, how does one use all of these macro variables to parameterize column creation and mapping? The answer, is of course, macro code. But not just any macro code. We're talking about the fun kind with 3, perhaps 4 variable dereferences like &&&my_var_&i. Fun stuff! Fortunately, I've decided to write a macro to handle this for Data Step based transformations and have attached it to this article.



Here is the macro definition:

%macro dataStepStmts(outputNbr,mode);




The macro can be used the generate LENGTH, INFORMAT, FORMAT, LABEL or mapping statements. The first parameter indicates which output table (numbered 1-) the statements are being generated for. The second parameter - mode - indicates the type of statements to create and can take on the following values:

  • LENGTH
  • INFORMAT
  • FORMAT
  • LABEL
  • MAP

I hope the result of each possible value for the "mode" parameter is self explanatory.



To create a complete table definition and perform mappings with this macro, one must call it several times with several different modes. The DATA Step below shows an example usage:

[pre]
data &_output1;
/* Create length statements */
%dataStepStmts(1,LENGTH);
/* Create informat statements */
%dataStepStmts(1,INFORMAT);
/* Create format statements */
%dataStepStmts(1,FORMAT);
/* Create label statements */
%dataStepStmts(1,LABEL);
merge &_input1(in=in1)
&_input2(in=in2);
by PROD_ID;
if in1 and in2;
/* Process mappings */
%dataStepStmts(1,MAP);
run;
[/pre]



The program above is a simple merge of two tables, where records should only be outputted if there is a join between the tables. Notice that the %dataStepStmts macro is invoked several times, one each for LENGTH, INFORMAT, FORMAT, LABEL and column mapping. "Normal" code is interspersed with the macro code to implement the business logic of the transformation.



To use this macro, simply place the file in the "SASEnvironment\SASMacro" folder for your application server context on the same machine as teh Workspace Server so that the macro is loaded as an autocall macro. On my Windows machine, I placed the file at "C:\SAS\Config\Lev1\SASApp\SASEnvironment\SASMacro", but this is installation dependent.



So, there you have it - a useful Mapping tab for User Written Code transformations and Generated Transformations. Now I can finally quit explaining which parts of the Mapping tab you should use and which to ignore - just use all of it - whew!



Feel free to use the attached macro for your projects. I plan to post a similar macro for PROC SQL based custom code transformations next week.



Thanks,



Tim Stearn
5 REPLIES 5
jklaverstijn
Rhodochrosite | Level 12
Hi Tim,

Thanks for his excellent explanation. These macro variables are extremely flexible and we use them a lot.

There is one thing that keeps me pondering and has kept me from fully utilising the column mapping metadata in my transformations: suppose I have more then one input table and columns names are not unique. In that case a construct like

%let _OUTPUT1_col0_input = PROD_SUBCATEGORY;

is ambigious as I cannot tell what table provided this column. I have looked in vain for something like

%let _OUTPUT1_col0_input0_table = 1;

Do you have any suggestions how can I deal with that?

Regards Jan.
LinusH
Tourmaline | Level 20

I'm not sure on how you need to use this.

Depending on your issue, but maybe using _output_col0_exp might help in some situations?

I have also felt the need to have more help with the input definitions, but I have more looked for something corresponding to all that is set up for output.

Data never sleeps
jklaverstijn
Rhodochrosite | Level 12

Hi Linus,

Using _output_col0_exp actually shows my problem. It contains an unqualified reference to the mapped column so the originating table cannot be established. Bottom line is I think there is no straightforward solution.

Anyways I am still looking for the macro code. I once downloaded it but lost the file and now the link is broken. Anyone?

Regards Jan

LinusH
Tourmaline | Level 20

I guess you are looking for a way to see the column mapping as macro variables?

This could be very handy, but the question is how to use it right off? The method of choosing input data is very different between SQL and the  data step, for instance, so it would still require some additional macro programming.

Data never sleeps
jklaverstijn
Rhodochrosite | Level 12

Hi Tim,

I would like to download the file but the link seems broken. I get some sort of server error. Can you help?

TIA, Jan.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 4830 views
  • 0 likes
  • 3 in conversation