In a data step, I have come across several different places to put, for instance, a keep statement. An example, in pseudo code:
data MyTry(keep=ColA) ; *Alternative 1;
set SomeData(keep = ColA); *Alternative 2;
keep ColA; *Alternative 3;
run;
Putting the keep as in Alternative 2 often gives a more efficient program, because less data is read in.
But what is the difference between Alternative 1 and Alternative 3? What are the pros and cons of each alternative?
In my experience Alternative 1 is very rare (I have not seen it often).
Thanks.
#1 and #3 are very close, as they determine what is output, without affecting the presence of variables within the step.
#1 lets you be selective (if you have more than one dataset in the DATA statement, you can control the variables for each one individually).
#2 filters what goes into the data step, so the other variables contained in the input dataset will not be present during the step.
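To make the difference concrete, here is a minimal sketch. It assumes the SASHELP.CLASS sample table (variables Name, Sex, Age, Height, Weight); the output dataset names and the BMI variable are made up for illustration.

```sas
/* #1 (same effect as #3 here): Height and Weight are read and can be used */
/* during the step; only Name and BMI are written to the output dataset.   */
data bmi_out(keep=Name BMI);
  set sashelp.class;
  BMI = 703 * Weight / Height**2;   /* inches and pounds */
run;

/* #2: only Name is read. Height and Weight never enter the step, so they  */
/* become new, uninitialized variables and BMI is missing on every row.    */
data bmi_in;
  set sashelp.class(keep=Name);
  BMI = 703 * Weight / Height**2;
run;

/* #1 with several output datasets: each one gets its own variable list.   */
data males(keep=Name Age) females(keep=Name Height);
  set sashelp.class;
  if Sex = 'M' then output males;
  else output females;
run;
```

The second step should run without an error, but the log should note that Weight and Height are uninitialized, which is the usual sign that a keep= on the input dataset was too restrictive.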
PS: this
keep ColA; *Alternative 3;
is a statement; the others are dataset options.
For practical purposes, you know all you really need to know.
Complications arise if you use both tools. For example, would these two programs get the same results or different results?
data want (drop=name);
set have;
keep name;
run;
data want (keep=name);
set have;
drop name;
run;
The complications get compounded if you use RENAME in one spot, but KEEP or DROP in the other spot. Should KEEP (or DROP) refer to the original name or to the new name?
There are rules about that, but rather than memorizing them, just avoid the situation.
The one rule I remember is that dataset options are processed in alphabetical order:
DROP/KEEP
RENAME
WHERE
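As a small sketch of that ordering on an input dataset (assuming the SASHELP.CLASS sample table; the new name Years and the dataset name teens are made up for illustration):

```sas
/* keep= is applied first, so it uses the original name (Age).       */
/* rename= is applied next, and where= last, so where= has to use    */
/* the new name (Years).                                             */
data teens;
  set sashelp.class(keep=Name Age rename=(Age=Years) where=(Years > 12));
run;
```

Writing the options in a different order inside the parentheses should not change this, since the processing order does not depend on the order in which they are typed.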