In base SAS, I occasionally need to create multiple datasets where the value of a particular variable determines (in part or in whole) the output dataset. For example:
data class_m class_f;
set sashelp.class; /* assumed input; any dataset with a SEX variable works */
if sex='M' then output class_m;
else if sex='F' then output class_f;
run;
While the above example is trivial, a larger number of datasets leads to messy code. I propose an output statement or call routine that takes an expression for the output dataset name, such as cats("Class_",sex), rather than a hard-coded name.
This is similar to the IML output statement.
When I brought this up on SAS-L, some specific suggestions were made for dealing with the obvious risk that a dataset does not exist. (I'm not proposing to let the user name datasets that are not specified on the data statement; that would be nice, but it would undoubtedly be much more complicated.)
One option is to make it a call routine with an optional second parameter naming an "other" dataset, which receives any rows whose first-parameter value does not resolve to a valid dataset.
This "other" dataset must also be specified on the data statement.
data class_m class_f class_other;
Then the data step would output to cats("Class_",sex) if that dataset exists on the data statement. If it does not exist, and the second argument is present and names a valid dataset from the data statement, the row is output to that dataset instead. If neither is available, the step may error or give a warning (perhaps with an option similar to DKROCOND/DKRICOND to specify which).
You could also use "_NULL_" to silently ignore rows that do not have a valid dataset in the expression, and/or have an optional third argument that specifies what to do (error, warn, or silently ignore).
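Until something like this exists, the fallback behaviour described above can be approximated with current syntax. The following is only a rough sketch: it assumes sashelp.class as the input, and it hard-codes each branch, which is exactly the part the proposal would remove.

```sas
/* Sketch: emulate "output to the matching dataset, else to OTHER". */
/* Assumes sashelp.class as input; dataset names are illustrative.  */
data class_m class_f class_other;
  set sashelp.class;
  select (upcase(sex));
    when ('M') output class_m;      /* value maps to a dataset on the data statement */
    when ('F') output class_f;
    otherwise  output class_other;  /* any other value goes to the "other" dataset */
  end;
run;
```

Replacing the OTHERWISE branch with a null statement (`otherwise;`) would correspond to the "_NULL_" idea of silently dropping unmatched rows.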
Not sure I understand why you would want to create lots of datasets. BY-group processing is very fast in SAS, and categorical processing in SQL is also very good. So why split the data up and then run optimized functions on a large number of datasets? Just run the optimized functions once on one large dataset with BY-group processing.
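As a small illustration of the BY-group alternative, here is a sketch that keeps everything in one dataset and lets the procedure handle the groups. It assumes sashelp.class as the data and picks height and weight purely as example analysis variables.

```sas
/* Sketch: one dataset, processed per group, instead of one dataset per group. */
/* Assumes sashelp.class; the analysis variables are only examples.            */
proc sort data=sashelp.class out=class_sorted;
  by sex;
run;

proc means data=class_sorted mean min max;
  by sex;              /* statistics are produced separately for each value of SEX */
  var height weight;
run;
```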
I agree with RW9:
There has to be a large advantage in order to consider doing this.
This is a FAQ on SAS-L, and some answers have been harvested.
This page has several solutions, including the obvious: hashing.
Split Data into Subsets - sasCommunity
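For readers who have not seen the hashing approach, a minimal sketch follows. It is not necessarily the version on the linked page. It assumes sashelp.class as input, and the counter key _n is a hypothetical name used only to keep the hash keys unique. The point is that the OUTPUT method of the hash object accepts a character expression for the dataset name, so the names do not have to be listed on the data statement.

```sas
/* Sketch: split a dataset by SEX using the hash object OUTPUT method. */
/* Assumes sashelp.class; _n is an illustrative counter key.           */
proc sort data=sashelp.class out=class_sorted;
  by sex;
run;

data _null_;
  /* a new hash instance is created on each pass, i.e. once per BY group */
  declare hash h(ordered: 'a');
  h.defineKey('_n');
  h.defineData('name', 'sex', 'age', 'height', 'weight');
  h.defineDone();
  /* load every row of the current BY group into the hash */
  do _n = 1 by 1 until (last.sex);
    set class_sorted;
    by sex;
    h.add();
  end;
  /* dataset name is built at run time: class_M, class_F, ... */
  h.output(dataset: cats('class_', sex));
  h.delete();  /* release the memory held by this group's hash */
run;
```

Each BY group has to fit in memory for this to work, which is one reason the BY-group processing argument above still applies for large data.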
Ron Fehd list processing maven