One of my current tasks is to convert a hand-written EG project into a manageable DI solution: one that can be scheduled and maintained and so on.
I came across a weird bit of code:
proc sort data=source out=target;
by acct_date descending prev_balance;
by account_id posting_dt Transfer_Order descending prev_balance;
run;
Oh - that looks a bit weird. I wonder what it does - does it use the first by statement, the second, or concatenate them together?
With a bit of experimentation, I found that it only uses the last one it sees. With the data contents being what they were (don't ask), it made little difference which statement was used, but the last one was preferable. I suspect that when the code was written, the second line was put in and the first line left there; it compiled and executed and didn't produce an error, so hey-ho.
So then I thought I should experiment further. What I found (not exhaustively tested) is that procedures don't complain about multiple by statements and simply use the last one, whereas data steps produce:
ERROR 221-185: More than one BY statement specified for a SET, MERGE, UPDATE, or MODIFY statement.
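To reproduce the two behaviours side by side, something like the following sketch could be used. The data set names (work.demo, work.sorted, work.check) and the three test rows are made up for illustration; only the variable names are borrowed from the original code. The comments describe what I observed, not documented behaviour.

```sas
/* Hypothetical test data, small enough to check by eye */
data work.demo;
  input account_id posting_dt prev_balance;
  datalines;
2 3 10
1 5 30
1 4 20
;
run;

/* Two BY statements in one procedure: no error was raised in my tests, */
/* and only the second BY appeared to be used                           */
proc sort data=work.demo out=work.sorted;
  by posting_dt;
  by account_id descending prev_balance;
run;

/* The "Sorted by" information shows which BY statement was applied */
proc contents data=work.sorted;
run;

/* Two BY statements for a single SET: this is where I got ERROR 221-185 */
data work.check;
  set work.sorted;
  by account_id;
  by account_id descending prev_balance;
run;
```

If the behaviour is the same on your release, the sort information reported by PROC CONTENTS should list account_id and descending prev_balance rather than posting_dt, and the final data step should stop with the error above.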
So there you go. Novel, of no use whatsoever (I mean, why would you?!), but it afforded a little amusement/bemusement in the office today. It's interesting to infer what's going on under the parser's hood.
And back to work.
IMO, such a misuse of the by statement should at least leave a NOTE in the log. I'd even like to get a WARNING, as it might point to a programming error.
I believe most procedures will use the LAST instance of any statement that should only occur once, such as BY.
I do agree with @Kurt_Bremser that a note or warning is appropriate.
For example a recent post:
https://communities.sas.com/t5/SAS-GRAPH-and-ODS-Graphics/SGPLOT/m-p/391856
2 yaxis statements in Proc SGPLOT but only the last one was applied.
I see this as possibly someone being marginally careless and not commenting out one statement to get the other result.
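A minimal sketch of that SGPLOT situation, assuming the sashelp.class sample data set that ships with SAS; the axis labels are made up for illustration, and the comments reflect what was reported in that post rather than documented behaviour:

```sas
/* Two YAXIS statements in one PROC SGPLOT step */
proc sgplot data=sashelp.class;
  scatter x=height y=weight;
  yaxis label="First label";   /* per the linked post, this one is ignored */
  yaxis label="Second label";  /* per the linked post, this one is applied */
run;
```

Commenting out one of the two YAXIS statements makes the intent explicit and avoids relying on this behaviour.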
This is an obvious example of the difference between compiled and executed statements.
The 2nd statement 'replaces' the first during the setup of the procedure.
A friendly tech support person once replied to my similar question about
options in the configuration file:
"options are 'loaded', last one is what is used".
Same action here: statements are 'stacked' into a set of instructions
that are first compiled, then executed.
Ron Fehd which came first? maven