We’re smarter together. Learn from this collection of community knowledge and add your expertise.

DataFlux Expression Engine Language (EEL) for SAS Programmers, Part 2

by SAS Employee StephenFoerster on ‎03-22-2017 03:05 PM (772 Views)

When we last looked at this topic, we looked at how we can leverage our SAS DATA Step programming skills to build DataFlux EEL code. Last time we looked at basic field manipulation as well as creating multiple output rows from single input rows.

 

Let’s continue down this path and look at combining data from multiple input rows into a single output row. There are probably more exotic ways to do accomplish this in DATA Step but, in general, to combine information from multiple input rows onto a single output row, we use the RETAIN statement and/or the to the LAG / LAGx function. So, what are the corresponding statements in EEL? Well, there aren’t corresponding statements per se. Let’s just say that EEL works a little differently and look at an example.

 

Say we wanted to sum the Sales column by the Region column on the SASHELP.SHOES dataset. With Data Step this process would look something like this:

data Region_Sales;
set sashelp.shoes;
keep Region Sales_Total;
by region;
retain Sales_Total;
if first.region then Sales_Total = 0;
Sales_Total = Sales_Total + Sales;
if last.region then Output;

run;

 

The new field, Sales_Total, is retained and used to sum the sales data across the records. The FIRST. and LAST. field modifiers are used to reset the sum for each new Region and output the Region Sales totals.

 

So, as we said last time, what would this look like in EEL? Well, again it looks pretty similar but the pieces are spread out among many different locations. Let’s hunt them all down....

 

Pre-Processing Expression Tab

real Sales_Total

 

Expression Tab

Sales_Total = Sales_Total + Sales False

 

Group Fields Tab

Region

 

Group Pre-Expression Tab

Sales_Total = 0

 

Group Post-Expression Tab

Pushrow()

 

As we discussed in the previous article, the new variable, Sales_Total, is defined in the Pre-Processing Expression Tab. The logic to sum the sales data is done in the Expression Tab. Remember that we need to add the statement, False, to stop DataFlux from outputting one record for every input record. (We only want the summary records).

 

Now, let's look at the new tabs....

 

The Group Field Tab performs the same function as the BY statement in the DATA STEP -- Telling DataFlux to group the incoming records by Region.

 

The Group Pre-Processing Tab performs the same function as the FIRST. logic in the DATA STEP -- Resetting the Sales_Total to 0 for each Region.

 

The Group Post-Processing Tab performs the same function as the LAST. logic in the DATA STEP -- Outputting the total record for each Region.

 

So there it is. The same stuff but just moved all over the place.

 

Happy coding!

 

Contributors
Your turn
Sign In!

Want to write an article? Sign in with your profile.


Looking for the Ask the Expert series? Find it in its new home: communities.sas.com/askexpert.