07-05-2017 09:09 AM
I want to output different data sets within a macro, depending on the number of categories of the categorical variable in use. For example, say I have the variable income with 4 categories: [-, 50 000), (50 000, 80 000], (80 000, 120 000], (120 000, +].
The code I have for this is the following:
data data_50 data_50to80 data_80to120 data_120;
set have; /* input data set; the name "have" is a placeholder */
if &var = '[-, 50 000)' then output data_50;
else if &var = '(50 000, 80 000]' then output data_50to80;
else if &var = '(80 000, 120 000]' then output data_80to120;
else if &var = '(120 000, +]' then output data_120;
run;
Here, &var is a macro variable I created for the macro (&var=income). I want to be able to run this data step with other categorical variables that have a different number of categories, for example score (low, medium, high). Once I have those data sets, I want to transpose each of them and then combine them with a SET statement in a subsequent data step.
Could you please suggest how to do this?
Thanks a lot
07-05-2017 09:21 AM
I would suggest that it's not a good idea to duplicate your data over and over again just to add categories; instead, create category variables in the one dataset. In the example you have given, you can create a format and then apply that format to create a new variable:
proc format;
  value grp1
    0-50000="data_50"
    50000-80000="data_80"
    ... ;
run;
data want;
  set have; /* your input data set (placeholder name) */
  length cat1 $20;
  cat1=put(yourval,grp1.);
run;
There are lots of little tricks like this using formats. So create one dataset with a category variable (cat1, cat2, etc.) for each categorisation. Then you can subset with WHERE or use BY-group processing based on them.
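As a rough sketch of how that could look end to end, assuming an input dataset called have with a numeric variable income (both names, the format name incgrp, and the labels are placeholders, not taken from your data): the format uses the `<-` exclusion notation so that a boundary value such as 50000 falls into only one group, and PROC TRANSPOSE with a BY statement transposes every category in one pass, so there is nothing to split and re-stack afterwards.

```sas
/* Sketch only: HAVE, INCOME, INCGRP and the labels are assumed names. */
proc format;
  value incgrp
    low    -  50000  = 'lt50'     /* up to and including 50000      */
    50000  <- 80000  = '50to80'   /* <- excludes the lower endpoint */
    80000  <- 120000 = '80to120'
    120000 <- high   = 'gt120';
run;

/* Add the category as a new variable instead of splitting the data. */
data want;
  set have;
  length cat1 $20;
  cat1 = put(income, incgrp.);
run;

/* BY-group processing needs the data sorted by the category. */
proc sort data=want;
  by cat1;
run;

/* One transposed row per category, already stacked in one dataset. */
proc transpose data=want out=want_t prefix=v;
  by cat1;
  var income;
run;
```

For another variable such as score, you would only need a second category variable (cat2) and `by cat2;`, whatever the number of categories is.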