When we last looked at this topic, we looked at how we can leverage our SAS DATA Step programming skills to build DataFlux EEL code. Last time we looked at basic field manipulation as well as creating multiple output rows from single input rows.
Let’s continue down this path and look at combining data from multiple input rows into a single output row. There are probably more exotic ways to do accomplish this in DATA Step but, in general, to combine information from multiple input rows onto a single output row, we use the RETAIN statement and/or the to the LAG / LAGx function. So, what are the corresponding statements in EEL? Well, there aren’t corresponding statements per se. Let’s just say that EEL works a little differently and look at an example.
Say we wanted to sum the Sales column by the Region column on the SASHELP.SHOES dataset. With Data Step this process would look something like this:
data Region_Sales;
set sashelp.shoes;
keep Region Sales_Total;
by region;
retain Sales_Total;
if first.region then Sales_Total = 0;
Sales_Total = Sales_Total + Sales;
if last.region then Output;
run;
The new field, Sales_Total, is retained and used to sum the sales data across the records. The FIRST. and LAST. field modifiers are used to reset the sum for each new Region and output the Region Sales totals.
So, as we said last time, what would this look like in EEL? Well, again it looks pretty similar but the pieces are spread out among many different locations. Let’s hunt them all down....
real Sales_Total
Sales_Total = Sales_Total + Sales False
Region
Sales_Total = 0
Pushrow()
As we discussed in the previous article, the new variable, Sales_Total, is defined in the Pre-Processing Expression Tab. The logic to sum the sales data is done in the Expression Tab. Remember that we need to add the statement, False, to stop DataFlux from outputting one record for every input record. (We only want the summary records).
Now, let's look at the new tabs....
The Group Field Tab performs the same function as the BY statement in the DATA STEP -- Telling DataFlux to group the incoming records by Region.
The Group Pre-Processing Tab performs the same function as the FIRST. logic in the DATA STEP -- Resetting the Sales_Total to 0 for each Region.
The Group Post-Processing Tab performs the same function as the LAST. logic in the DATA STEP -- Outputting the total record for each Region.
So there it is. The same stuff but just moved all over the place.
Happy coding!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.