The following code imports a csv file and selects the unique values for each element of VList into macrovariables Var1, Var2 etc. Then it creates a new table for each unique combination of the values in Var1 Var2 from the original CSV file.
I've written the code so it works with 2 elements in VList. But i need it to work for a varying number of elements.
Right now i have 2 distinct macros to loop through each of the elements of Var1 and Var2 (Loop_A, Loop_B), and the macro Combo to create the new table from the distinct combinations. If the number of elements in VList varies, then i would need a varying number of Loop_X's, and Combo would need a varying number of arguments.
Any suggestions on how to accomplish this?
%LET InFile=C:\FilePath\Filename.CSV;
%LET VList=Variable1|Variable2;
%LET VNum=%SYSFUNC(CountW(&VList.));
%PUT VNUM=&VNum.;
proc import datafile="&Infile."
DBMS=csv out=CSV_IN replace;
guessingrows=1000;
run;
%MACRO MVars(VarList);
%DO K=1 %TO &VNum.;
%GLOBAL V&K.;
*creates V1 as first variable in the VList, V2 as the 2nd...etc;
%LET V&K.= %SYSFUNC(SCAN(&VarList.,&K.,|));
%PUT V&K.= &&V&K.;
*Sorts CSV File by V#;
proc sort data=CSV_IN;
by &&V&K.;
run;
*creates a table the same name as the variable, and populates it with the first obs of each distinct input for that variable;
data &&V&K.;
set CSV_IN;
by &&V&K;
if first.&&V&K. then
output;
run;
%GLOBAL Var&K.;
*creates macrovariables Var1, Var2 etc. as the distinct values for V1, V2...;
proc sql noprint;
select &&V&K. into :Var&K.
separated by '|'
from &&V&K.;
quit;
%PUT Var&K. = &&Var&K.;
%END;
%MEND MVars;
%MVars(&VList.);
*Creates MacroVariables Length_1, Length_2 etc, as the number of items in each var#.;
%MACRO LengthOfVars();
%DO L = 1 %TO &VNum.;
%GLOBAL Length_&L.;
%LET Length_&L. = %SYSFUNC(CountW(&&Var&L.,|));
%PUT Length_&L. = &&Length_&L.;
%END;
%MEND LengthOfVars;
%LengthofVars();
*Loop_A and Loop_B cycle through every combination of the distinct values for each var#;
%MACRO LOOP_A();
%DO A = 1 %TO &Length_1.;
%LET AA = %SCAN(&Var1.,&A.,|);
%PUT AA=&AA.;
%LOOP_B;
%END;
%MEND LOOP_A;
%MACRO LOOP_B();
%DO B = 1 %TO &Length_2.;
%LET BB = %SCAN(&Var2.,&B.,|);
%PUT BB=&BB.;
%COMBO(&AA.,&BB.);
%END;
%MEND LOOP_B;
*create a new table for each unique combination of variable values;
%MACRO Combo(FirstVar, SecondVar);
Proc sql noprint;
create table NewTable_&FirstVar._&SecondVar.
as select * from CSV_IN
where &V1.="&FirstVar." and &V2.=&SecondVar.;
quit;
%PUT &V1. &FirstVar. &V2. &SecondVar.;
%MEND COMBO;
%LOOP_A;
It is hard to see the forest here. Can you explain what code you are trying to use the macro language to generate?
You seem to be saying you want this to be based on the values of two VARIABLES (not macro variables).
You can use any number of methods to get the distinct set of values of two variables. For example you could use PROC FREQ or PROC SORT or PROC SQL.
proc freq data=sashelp.cars ;
tables origin*type / noprint missing out=combo;
run;
proc sort data=sashelp.cars(keep=origin type) out=combo nodupkey;
by origin type;
run;
proc sql;
create table combo as select distinct origin,type from sashelp.cars;
quit;
Once you have this list of combinations you can then use it to drive the generation of whatever code you want.
So what code would you want to generate for the first combination? The second?
Get this list of combinations. PROC SORT is handy. Use the list to write code. An efficient way to write code for splitting a dataset is to use a single data step with multiple output statements.
data out1 out2 out3 ;
set in;
if condition1 then output out1;
else if condition2 then output out2;
else output out3 ;
run;
So here is one way. This macro assumes that the variables are all character. I will leave it as an exercise to see how to adapt the code to allow numeric variables and/or combinations of many types.
%macro split(indsn,varlist,basename=split);
%local i var sep;
proc sort data=&indsn(keep=&varlist) out=combos nodupkey;
by &varlist;
run;
filename code temp;
data _null_;
file code lrecl=75 ;
put 'data ' @;
do _n_=1 by 1 until(eof1);
set combos end=eof1 ;
put "&basename" _n_ @ ;
end;
put ';' / " set &indsn;" ;
do _n_=1 by 1 until(eof2);
set combos end=eof2 ;
if _n_ > 1 then put 'else ' @;
put 'if ('
%do i=1 %to %sysfunc(countw(&varlist));
%let var=%scan(&varlist,&i);
&sep &var= :$quote.
%let sep='and ';
%end;
") then output &basename" _n_ ';'
;
end;
put 'run;';
stop;
run;
%include code ;
%mend split;
Example run:
335 options mprint; 336 %split(sashelp.cars,origin type) MPRINT(SPLIT): proc sort data=sashelp.cars(keep=origin type) out=combos nodupkey; MPRINT(SPLIT): by origin type; MPRINT(SPLIT): run; NOTE: There were 428 observations read from the data set SASHELP.CARS. NOTE: 413 observations with duplicate key values were deleted. NOTE: The data set WORK.COMBOS has 15 observations and 2 variables. NOTE: PROCEDURE SORT used (Total process time): real time 0.01 seconds cpu time 0.00 seconds MPRINT(SPLIT): filename code temp; MPRINT(SPLIT): data _null_; MPRINT(SPLIT): file code lrecl=75 ; MPRINT(SPLIT): put 'data ' @; MPRINT(SPLIT): do _n_=1 by 1 until(eof1); MPRINT(SPLIT): set combos end=eof1 ; MPRINT(SPLIT): put "split" _n_ @ ; MPRINT(SPLIT): end; MPRINT(SPLIT): put ';' / " set sashelp.cars;" ; MPRINT(SPLIT): do _n_=1 by 1 until(eof2); MPRINT(SPLIT): set combos end=eof2 ; MPRINT(SPLIT): if _n_ > 1 then put 'else ' @; MPRINT(SPLIT): put 'if (' origin= :$quote. 'and ' type= :$quote. ") then output split" _n_ ';' ; MPRINT(SPLIT): end; MPRINT(SPLIT): put 'run;'; MPRINT(SPLIT): stop; MPRINT(SPLIT): run; NOTE: The file CODE is: Filename=C:\Users\...\#LN00063, RECFM=V,LRECL=75,File Size (bytes)=0, Last Modified=24Nov2020:12:14:19, Create Time=24Nov2020:12:14:19 NOTE: 19 records were written to the file CODE. The minimum record length was 4. The maximum record length was 75. NOTE: There were 15 observations read from the data set WORK.COMBOS. NOTE: There were 15 observations read from the data set WORK.COMBOS. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds MPRINT(SPLIT): data split1 split2 split3 split4 split5 split6 split7 split8 split9 split10 split11 split12 split13 split14 split15 ; MPRINT(SPLIT): set sashelp.cars; MPRINT(SPLIT): if (Origin="Asia" and Type="Hybrid" ) then output split1 ; MPRINT(SPLIT): else if (Origin="Asia" and Type="SUV" ) then output split2 ; MPRINT(SPLIT): else if (Origin="Asia" and Type="Sedan" ) then output split3 ; MPRINT(SPLIT): else if (Origin="Asia" and Type="Sports" ) then output split4 ; MPRINT(SPLIT): else if (Origin="Asia" and Type="Truck" ) then output split5 ; MPRINT(SPLIT): else if (Origin="Asia" and Type="Wagon" ) then output split6 ; MPRINT(SPLIT): else if (Origin="Europe" and Type="SUV" ) then output split7 ; MPRINT(SPLIT): else if (Origin="Europe" and Type="Sedan" ) then output split8 ; MPRINT(SPLIT): else if (Origin="Europe" and Type="Sports" ) then output split9 ; MPRINT(SPLIT): else if (Origin="Europe" and Type="Wagon" ) then output split10 ; MPRINT(SPLIT): else if (Origin="USA" and Type="SUV" ) then output split11 ; MPRINT(SPLIT): else if (Origin="USA" and Type="Sedan" ) then output split12 ; MPRINT(SPLIT): else if (Origin="USA" and Type="Sports" ) then output split13 ; MPRINT(SPLIT): else if (Origin="USA" and Type="Truck" ) then output split14 ; MPRINT(SPLIT): else if (Origin="USA" and Type="Wagon" ) then output split15 ; MPRINT(SPLIT): run; NOTE: There were 428 observations read from the data set SASHELP.CARS. NOTE: The data set WORK.SPLIT1 has 3 observations and 15 variables. NOTE: The data set WORK.SPLIT2 has 25 observations and 15 variables. NOTE: The data set WORK.SPLIT3 has 94 observations and 15 variables. NOTE: The data set WORK.SPLIT4 has 17 observations and 15 variables. NOTE: The data set WORK.SPLIT5 has 8 observations and 15 variables. NOTE: The data set WORK.SPLIT6 has 11 observations and 15 variables. NOTE: The data set WORK.SPLIT7 has 10 observations and 15 variables. NOTE: The data set WORK.SPLIT8 has 78 observations and 15 variables. NOTE: The data set WORK.SPLIT9 has 23 observations and 15 variables. NOTE: The data set WORK.SPLIT10 has 12 observations and 15 variables. NOTE: The data set WORK.SPLIT11 has 25 observations and 15 variables. NOTE: The data set WORK.SPLIT12 has 90 observations and 15 variables. NOTE: The data set WORK.SPLIT13 has 9 observations and 15 variables. NOTE: The data set WORK.SPLIT14 has 16 observations and 15 variables. NOTE: The data set WORK.SPLIT15 has 7 observations and 15 variables. NOTE: DATA statement used (Total process time): real time 0.09 seconds cpu time 0.09 seconds
In
%LET VList=Variable1|Variable2;
where are you getting these Variable1 & Variable2 from? If it's from a SAS table, perhaps consider using dictionary.columns with PROC SQL to create a macro variable with a dynamic list of columns separated by an appropriate delimiter. Then, use the &sqlobs instead of &VNUM as the ending point of a %DO loop (This will replace both 2nd and 3rd %let statements in your code as well).
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Follow along as SAS’ Robert Blanchard explains three aspects of autotuning in a deep learning context: globalized search, localized search and an in parallel method using SAS.
Find more tutorials on the SAS Users YouTube channel.