Hello, I'm having trouble understanding when the order of statements in a DATA step matters. Please verify, correct, or add to my understanding below.
WHERE: I'm not sure about this one. It seems we can't use WHERE on newly created variables, so in that sense does placement not matter, because variables get flagged and dropped during execution? Or does the order make the processing more efficient?
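To make the WHERE case concrete, here is a minimal sketch of what I mean. The output data set name and the bmi variable are made up for illustration; sashelp.class is the sample table that ships with SAS.

```sas
/* Hypothetical example: work.where_demo and bmi are illustrative names */
data work.where_demo;
    set sashelp.class;
    bmi = weight / height**2 * 703;   /* new variable created in this step */
    where age > 12;                   /* accepted: age exists in the input data set */
    /* where bmi > 18; */             /* rejected if enabled: bmi is not in sashelp.class */
run;
```

As far as I can tell, the WHERE statement behaves the same whether I put it before or after the assignment.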
FORMAT: It does not matter where the statement appears, because it doesn't alter the data itself, only the way it's presented? But what if we start summarizing or printing reports on formatted data? Whether the format is as simple as rounding or adding a dollar sign, or as transformational as grouping ranges of values into categorical groups, does placement matter then? Or is this why DATA steps are kept separate from PROC steps?
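Here is a rough sketch of the kind of "transformational" format I have in mind. The format name agegrp and its ranges are hypothetical, chosen only for illustration.

```sas
/* Hypothetical format: the name agegrp and the cut point are illustrative */
proc format;
    value agegrp low-12  = 'Child'
                 13-high = 'Teen';
run;

proc freq data=sashelp.class;
    tables age;
    format age agegrp.;   /* stored ages are unchanged; counts are grouped by the formatted value */
run;
```

The stored values of age stay numeric here, but the report shows the two groups rather than the individual ages.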
DROP/KEEP: The order does not matter, because variables are simply flagged and only dropped when the output data set is written. But if we use these as data set options on a SET statement, then it could have an effect.
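A small sketch of the difference I mean between the statement and the data set option. The output data set names and height_m are made up for illustration.

```sas
/* Hypothetical example: work.drop_stmt, work.drop_opt and height_m are illustrative names */
data work.drop_stmt;
    set sashelp.class;
    drop height;                      /* statement: height is still available during the step */
    height_m = height * 0.0254;       /* computed normally; height is only left out of the output */
run;

data work.drop_opt;
    set sashelp.class(drop=height);   /* option on SET: height is never read in */
    height_m = height * 0.0254;       /* height is a new, uninitialized variable here, so the result is missing */
run;
```

In the first step the position of the DROP statement should not change the result; in the second step the option changes what the rest of the step can see.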
LENGTH: I get that we must assign the variable's length before the variable is first encountered, to prevent later values from being truncated. What I don't understand is why, when the LENGTH statement comes after the assignment, the step still runs in some cases but not in others. For example, if a LENGTH statement is added after a character variable has been assigned, a numeric specification produces an error, while a character specification produces no error but, of course, has no effect.
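A minimal sketch of the two LENGTH cases I'm describing. The data set and variable names are made up for illustration.

```sas
/* Hypothetical example: work.len_char, work.len_num and name are illustrative names */
data work.len_char;
    name = 'Al';           /* first reference: name becomes character with length 2 */
    length name $ 10;      /* character specification arrives too late: the step runs, length stays 2 */
    name = 'Alexander';    /* value is truncated to the existing length */
run;

data work.len_num;
    name = 'Al';           /* name is already character */
    length name 8;         /* numeric specification on a character variable: the step fails */
run;
```

Moving the LENGTH statement above the first assignment in the first step is what gives name the intended length of 10.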
Any other tips? Thank you.