I want to output multiple datasets from a single data step, but I want the "if" statement that determines which data set to which I output a given record to be determined from the value of a variable.
For example, I want to process data and loop through a set of MARKET_IDs, calculating various information for that market, and then output the data to a data set for each market.
I know how to do this using if-then-else constructs, but that takes a lot of processing time given that I have 9 million+ customer records that I want to output to 1500+ market data sets. I currently have this written using macro code, but here is a simplified version of what I am doing:
data market1 market2 market3;
set customer_data;
do market_id=1 to 3;
*do a bunch of calculations here;
if market_id=1 then output market1;
else if market_id=2 then output market2;
else if market_id=3 then output market3;
end;
run;
The problem is that with 1500+ output datasets, a given record has to go through on average 750 if-then checks before it finds its home. I would like something like this (which I know does not work because call symput only populates the variable at the end of the data step, not in the middle).
data market1 market2 market3;
set customer_data;
do market_id=1 to 3;
*do a bunch of calculations here;
call symput ('mkt_id',market_id);
output market&mkt_id.;
end;
run;
Is there any way to do this? It would be sort of like the indirect() function in Excel - referring to a dataset based on an indirect reference. I have tried call execute, macros, etc., and no luck.
... View more