Hi -
I am trying to create 45 datasets in one go. So below is the sub_channel that actually creates two macro variables. One(macro variable sc) to use it as a condition which is a 4 char variable(populated with 2, 3 or 4 char. So it has trailing spaces). and other (macro variable scds) to use it for the dataset creation. I tried using compress and strip but neither is working to remove the space when using in the IF clause below. what am I missing here when I am doing the output to the dataset name? A sample of which dataset it needs to write is commented in the below code.Else IF would be apt if it works.
proc sql ;
select strip(sub_channel), strip(sub_channel)||"_loaded " into :sc, :scds separated by ' '
from sub_channel;
quit;
%macro sbg;
data &scds load;
set dsin;
%do i=1 %to %sysfunc(countw(%superq(sc),' '));
if compress(sub_channel)=compress(&sc.) then output &sc._loaded;
/*if compress(sub_channel)='OM' then output Om_loaded;
if compress(sub_channel)='SAHP' then output sahp_loaded;
if compress(sub_channel)='MHP' then output mhp_loaded;
*/
else output load;
%end;
run;
%mend;
%sbg;
Data sets to be created using
ALO | ALO_loaded |
ATM | ATM_loaded |
CAO | CAO_loaded |
CCMS | CCMS_loaded |
CMC | CMC_loaded |
DCC | DCC_loaded |
DM | DM_loaded |
DMS | DMS_loaded |
EM | EM_loaded |
HLMS | HLMS_loaded |
IVR | IVR_loaded |
LDR | LDR_loaded |
MCD | MCD_loaded |
MHP | MHP_loaded |
MSO | MSO_loaded |
MTA | MTA_loaded |
OM | OM_loaded |
Othe | Othe_loaded |
SAHP | SAHP_loaded |
SAS | SAS_loaded |
SFSB | SFSB_loaded |
SLR | SLR_loaded |
|
Error message:
SYMBOLGEN: Macro variable SC resolves to ALO
SYMBOLGEN: Macro variable SC resolves to ALO
NOTE: Line generated by the macro variable "SC".
120 ALO _loaded
___ _______
455 455
MPRINT(SBG): else if compress(sub_channel)=compress(ALO ) then output ALO _loaded;
SYMBOLGEN: Macro variable LIVEDATE resolves to 032419
MPRINT(SBG): else output load;
MLOGIC(SBG): %DO loop index variable I is now 2; loop will not iterate again.
MPRINT(SBG): run;
ERROR 455-185: Data set was not specified on the DATA statement.
@Tanvi99 wrote:
Hi -
I am trying to create 45 datasets in one go. So below is the sub_channel that actually creates two macro variables. One(macro variable sc) to use it as a condition which is a 4 char variable(populated with 2, 3 or 4 char. So it has trailing spaces). and other (macro variable scds) to use it for the dataset creation. I tried using compress and strip but neither is working to remove the space when using in the IF clause below. what am I missing here when I am doing the output to the dataset name? A sample of which dataset it needs to write is commented in the below code.Else IF would be apt if it works.
proc sql ;
select strip(sub_channel), strip(sub_channel)||"_loaded " into :sc, :scds separated by ' '
from sub_channel;
quit;
%macro sbg;
data &scds load;
set dsin;
%do i=1 %to %sysfunc(countw(%superq(sc),' '));
if compress(sub_channel)=compress(&sc.) then output &sc._loaded;
/*if compress(sub_channel)='OM' then output Om_loaded;
if compress(sub_channel)='SAHP' then output sahp_loaded;
if compress(sub_channel)='MHP' then output mhp_loaded;
*/
else output load;
%end;
run;
%mend;
%sbg;
Data sets to be created using
ALO ALO_loaded ATM ATM_loaded CAO CAO_loaded CCMS CCMS_loaded CMC CMC_loaded DCC DCC_loaded DM DM_loaded DMS DMS_loaded EM EM_loaded HLMS HLMS_loaded IVR IVR_loaded LDR LDR_loaded MCD MCD_loaded MHP MHP_loaded MSO MSO_loaded MTA MTA_loaded OM OM_loaded Othe Othe_loaded SAHP SAHP_loaded SAS SAS_loaded SFSB SFSB_loaded SLR SLR_loaded
Error message:
SYMBOLGEN: Macro variable SC resolves to ALO
SYMBOLGEN: Macro variable SC resolves to ALO
NOTE: Line generated by the macro variable "SC".
120 ALO _loaded
___ _______
455 455
MPRINT(SBG): else if compress(sub_channel)=compress(ALO ) then output ALO _loaded;
SYMBOLGEN: Macro variable LIVEDATE resolves to 032419
MPRINT(SBG): else output load;
MLOGIC(SBG): %DO loop index variable I is now 2; loop will not iterate again.
MPRINT(SBG): run;
ERROR 455-185: Data set was not specified on the DATA statement.
There may be other errors, however, as highlighted above in red, you are only looping through your loop one time.
To fix this, you want
proc sql ;
select strip(sub_channel), strip(sub_channel)||"_loaded " into :sc separated by ' ',
:scds separated by ' '
from sub_channel;
quit;
Now, another error, in my opinion, is that there is usually little value in splitting up data set DSIN into 45 different data sets. Not only do you have to do the work to split it up, but then you have to write macros to loop through any analysis of all the 45 data sets. This is often easily handled by using the BY feature of most SAS PROCs and the BY feature of the SAS data step.
So, the global solution, and probably the easiest solution, is to not do this (unless you have a very strong justification to do this).
the input dataset is 8MM records which I need to do a cross join on. It needs to iterate 45 times. so thought rather split into 45 datasets and do a cross join on it. any other input is greatly appreciated when handling such huge data.
When you use code like:
if compress(sub_channel)=compress(&sc.) then output &sc._loaded;
EVERY single value of &sc._loaded you are going to use needs to appear on the DATA statement.
If you look very closely at the MPRINT output you will see that
ALO _loaded;
has a space between ALO and _loaded. So that isn't going to be a name match.
exactly, it is not compressing the space. not sure why
@Tanvi99 wrote:
exactly, it is not compressing the space. not sure why
Your source variable contains spaces, and these end up in the macro variable. That's why we make use of trim() and strip() functions to get rid of these blanks.
By having your source dataset sorted along your sub_channel, you can do all your processing using "by" in one step.
If your analysis is more complex, let us know, but my data-step created split should work.
It would also be of great help if you supplied example data in usable, unambigious form (data step with datalines, see my code example).
First of all, such splitting of datasets is usually a sign of not really understanding how SAS works with regards to where conditions or by-group-processing.
You put a series of variable names into &sc, and use the whole macro varaible in a place where you only need one item out of it.
It's much easier (IMO) to create the code directly from the dataset with call execute():
data sub_channel;
input sub_channel :$3.;
datalines;
ALO
ATM
CAO
;
run;
data _null_;
call execute('data ');
do until (eof1);
set sub_channel end=eof1;
call execute(strip(sub_channel) !! '_loaded ');
end;
call execute('; set dsin;');
do until (eof2);
set sub_channel end=eof2;
call execute(
"if strip(sub_channel)='" !!
strip(sub_channel) !!
"' then output " !!
strip(sub_channel) !!
'_loaded;'
);
end;
call execute("run;");
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.