Hello,
I want to create a macro and find distributions by different mix of variables.
In some cases I want to calculated distribution by one variable.
In other cases I want to calculated distribution by two variables.
Also, in some cases I want to calculate distribution by a grouped variables (using proc format to define group of values).
The calculation should includes 2 measurements:
1-number of rows (freq)
2-Percent from Total
My question:
What is the way to create this macro that can work well in all the cases I mentioned.(one/two explanatory vars and also option to group values of var).
As you can see in my example there is an error in the 2nd and 3rd run because there is only one variable and in the macro there are 2 variables.
I need more flexible macro that can work well in all these cases.
%macro BBB(mon,sourceTbl,TargetTbl,CATVar,Fmt_Cat_Var,CATVar2,Fmt_Cat_Var2);
proc sql;
create table &TargetTbl. as
select put( &CATVar.,&Fmt_Cat_Var..) as &CATVar.,
	   put( &CATVar2.,&Fmt_Cat_Var2..) as &CATVar2.,
       count(*) as nr format=comma21.,
       calculated nr/(select count(*) as total from  &sourceTbl.) as PCT format=percent8.1
from &sourceTbl.
group by calculated &CATVar.,calculated &CATVar2.
;
quit;
%mend ;
/*Distribution by grouped Cylinders+MPG_City***/
/*Distribution by grouped Cylinders+MPG_City***/
/*Distribution by grouped Cylinders+MPG_City***/
proc format;
value F1mt
.='Uknown'
1-4='1-4'
5-high='5+'
;
Run;
%BBB(sourceTbl=sashelp.cars,
TargetTbl=wanted,
CATVar=Cylinders,
Fmt_Cat_Var=F1mt,
CATVar2=MPG_City,
Fmt_Cat_Var2=best)
/*Distribution by field Origin***/
/*Distribution by field Origin***/
/*Distribution by field Origin***/
%BBB(sourceTbl=sashelp.cars,
TargetTbl=wanted,
CATVar=Origin,
Fmt_Cat_Var=$21.)
 
/*Distribution by  grouped Cylinders***/
/*Distribution by  grouped Cylinders***/
/*Distribution by  grouped Cylinders***/
%BBB(sourceTbl=sashelp.cars,
TargetTbl=wanted,
CATVar=Cylinders,
Fmt_Cat_Var=F1mt)Why do you use a plus sign as delimiter in the %SCAN functions, when you dont't use it in COUNTW and don't supply it in the macro call?
Simplify your code:
%let nr_vars=%sysfunc(countw(&list1.));
%put &nr_vars.;
proc sql;
create table &TargetTbl. as
select
%do j = 1 %to &nr_vars.;
  put(%scan(&list1.,&j.),%scan(&list2.,&j.).) as %scan(&list1.,&j.),
%end;
  count(*) as nr format=comma21.,
  calculated nr/(select count(*) as total from  &sourceTbl.) as PCT format=percent8.1
from &sourceTbl.
%if &nr_vars. > 0
%then %do;
group by 
  %do j = 1 %to &nr_vars.;
  %if &j. > 1 %then ,;
  calculated %scan(&list.,&j.)
  %end;
%end;
;
quit;Now it should work with an arbitrary number of variable/format pairs, even 0 won't crash it.
Untested, posted from my tablet.
Instead of separate parameters, allow blank-separated lists of variables and formats, and use %DO loops to create the SQL code.
Thank you, May you please show the desired code?
This line fails if there is only one variable specified:
put( &CATVar2.,&Fmt_Cat_Var2..) as &CATVar2.,So you could use this:
%if &catvar2 ne %then put( &CATVar2.,&Fmt_Cat_Var2..) as &CATVar2.,;and you would have to similarly modify other parts of the macro
Still get an error,
I need to overcome the situation when I define 2 macro vars in the macro but use only one.May you please show the code to solve it?
 
%macro BBB(mon,sourceTbl,TargetTbl,CATVar,Fmt_Cat_Var,CATVar2,Fmt_Cat_Var2);
proc sql;
create table &TargetTbl. as
select put( &CATVar.,&Fmt_Cat_Var..) as &CATVar.,
%if &catvar2 ne %then put( &CATVar2.,&Fmt_Cat_Var2..) as &CATVar2.,;
       count(*) as nr format=comma21.,
       calculated nr/(select count(*) as total from  &sourceTbl.) as PCT format=percent8.1
from &sourceTbl.
group by calculated &CATVar.,calculated &CATVar2.
;
quit;
%mend ;
/*Distribution by grouped Cylinders+MPG_City***/
/*Distribution by grouped Cylinders+MPG_City***/
/*Distribution by grouped Cylinders+MPG_City***/
proc format;
value F1mt
.='Uknown'
1-4='1-4'
5-high='5+'
;
Run;
/*Distribution by 2 vars: Cylinders + MPG_City ***/
%BBB(sourceTbl=sashelp.cars,
TargetTbl=wanted,
CATVar=Cylinders,
Fmt_Cat_Var=F1mt,
CATVar2=MPG_City,
Fmt_Cat_Var2=best)
/*Distribution by field Origin***/
/*Distribution by field Origin***/
/*Distribution by field Origin***/
%BBB(sourceTbl=sashelp.cars,
TargetTbl=wanted,
CATVar=Origin,
Fmt_Cat_Var=$21.)
 
/*Distribution by  grouped Cylinders***/
/*Distribution by  grouped Cylinders***/
/*Distribution by  grouped Cylinders***/
%BBB(sourceTbl=sashelp.cars,
TargetTbl=wanted,
CATVar=Cylinders,
Fmt_Cat_Var=F1mt)You have to modify other parts of the macro code with %IF statements, similar to what I showed. Please give it a try yourself. If you're not sure, you can run the code you have, the SAS log will tell where the errors are, those are the places where you need %IF.
By the way, you need to turn on
options mprint;whenever you are debugging macros.
I'm sure you know this, and so it is very annoying and unacceptable that I have to continue to point this out, but if you have code that isn't working ... SHOW US THE LOG
Imagine you have only catvar and fmt_cat_var as parameters.
%do i = 1 %to %sysfunc(countw(&catvar.));
  put(%scan(&catvar.,&i.),%scan(&fmt_cat_var.,&i.)) as %scan(&catvar.,&i.),
%end;You only have to make sure that you have the same number of items in the two parameters.
you may test this code.
proc format;
   value F1mt
    .='Unknown'
    1-4='1-4'
    5-high='5+'
   ;
run;
%macro BBB(keeplist=,fmtlist=,outds=);
%local totalvars next_var next_fmt;
%let totalvars = %sysfunc(countw(&keeplist));
%put &=totalvars;
data cars;
    set sashelp.cars;
    keep &keeplist.;
run;
%do i=1 %to &totalvars;
    %let next_var=%scan(&keeplist,&i);
    %let next_fmt=%scan(&fmtlist,&i);
    data temp&i;
        set cars;
        catvar&i=put(&next_var.,&next_fmt..);
        label catvar&i="&next_var";
    proc sort;
        by &keeplist;
    run;
    
%end;
%if &totalvars >1 %then %do;
data combined;
    merge temp: ;
    by &keeplist;
run;
proc sql noprint;
    select name into: name_list separated by '*'
        from dictionary.columns where upcase(libname)='WORK' and upcase(memname)='COMBINED'
        and name contains 'catvar';
quit;
%put &=name_list;
proc freq data=combined;
    tables &name_list / out=freq1;
run;
data &outds;
    set freq1;
run;
%end;
%if &totalvars=1 %then %do;
data combined;
    set temp1;
run;
proc freq data=combined;
    tables catvar1 / out=freq1;
run;
data &outds;
    set freq1;
run;
%end;
%mend;
%BBB(keeplist=cylinders mpg_city,fmtlist=%str(f1mt best),outds=wanted2);
%BBB(keeplist=origin,fmtlist=%str($21),outds=wanted3);I still want to learn do it with similar structure as I asked (proc sql).
May anyone show the code that solve the errors
options mprint;
%macro BBB(mon,sourceTbl,TargetTbl,list1= ,list2=);
%let nr_vars=%sysfunc(countw(&list1));
%put &nr_vars;
%DO J=1 %TO &nr_vars;
%let VAR&j.=%scan(&list1.,&j.,+);
%let Fmt&j.=%scan(&list2.,&j.,+);
/**Define Var1,VAR2,Fmt1,Fmt2 macro vars**/
%end;
proc sql;
create table &TargetTbl. as
select put( &Var1.,&Fmt1..) as &Var1.,
%if &catvar2 ne %then put( &VAR2.,&Fmt2..) as &VAR2.,;
       count(*) as nr format=comma21.,
       calculated nr/(select count(*) as total from  &sourceTbl.) as PCT format=percent8.1
from &sourceTbl.
group by calculated &Var1.,
%if &Var2 ne %then calculated &Var2.;
;
quit;
%mend ;
/*Distribution by grouped Cylinders+MPG_City***/
 proc format;
value Cylinder_fmt
.='Uknown'
1-4='1-4'
5-high='5+'
;
Run;
%BBB(sourceTbl=sashelp.cars,
TargetTbl=wanted,
list1= Cylinders MPG_City,
list2=Cylinder_fmt  best.)
/*Distribution by 1 var: Cylinders***/
%BBB(sourceTbl=sashelp.cars,
TargetTbl=wanted,
list1= Cylinders ,
list2=Cylinder_fmt)
is there a reason you don't want to use the do loop that @Kurt_Bremser suggested?
Why do you use a plus sign as delimiter in the %SCAN functions, when you dont't use it in COUNTW and don't supply it in the macro call?
Simplify your code:
%let nr_vars=%sysfunc(countw(&list1.));
%put &nr_vars.;
proc sql;
create table &TargetTbl. as
select
%do j = 1 %to &nr_vars.;
  put(%scan(&list1.,&j.),%scan(&list2.,&j.).) as %scan(&list1.,&j.),
%end;
  count(*) as nr format=comma21.,
  calculated nr/(select count(*) as total from  &sourceTbl.) as PCT format=percent8.1
from &sourceTbl.
%if &nr_vars. > 0
%then %do;
group by 
  %do j = 1 %to &nr_vars.;
  %if &j. > 1 %then ,;
  calculated %scan(&list.,&j.)
  %end;
%end;
;
quit;Now it should work with an arbitrary number of variable/format pairs, even 0 won't crash it.
Untested, posted from my tablet.
I don't think that will work. You don't want to use cylinders and mpg_city. You want to use the new formatted versions of these variables. if you use unformatted, that will get 47 rows. should be 35 rows if they are formatted.
You are right. I missed the CALCULATED keyword in the GROUP BY. Have corrected it
Great and thanks,
Only one small issue left.
When the explanatory variable is char and we don't group it then I get an error.
ERROR: Numeric format F in PUT function requires a numeric argument.
What is the way to solve it please?
%macro BBB(sourceTbl,TargetTbl,list1=,list2=);
%let nr_vars=%sysfunc(countw(&list1.));
%put &nr_vars.;
proc sql;
create table &TargetTbl. as
select
%do j = 1 %to &nr_vars.;
  put(%scan(&list1.,&j.),%scan(&list2.,&j.).) as %scan(&list1.,&j.),
%end;
  count(*) as nr format=comma21.,
  calculated nr/(select count(*) as total from  &sourceTbl.) as PCT format=percent8.1
from &sourceTbl.
%if &nr_vars. > 0
%then %do;
group by 
  %do j = 1 %to &nr_vars.;
  %if &j. > 1 %then ,;
  calculated %scan(&list1.,&j.)
  %end;
%end;
;
quit;
%mend ;
/*Distribution by field Origin-Get error Here***/
/*Distribution by field Origin-Get error Here***/
/*Distribution by field Origin-Get error Here***/
%BBB(sourceTbl=sashelp.cars,
TargetTbl=wanted,
list1=Origin,
list2=$32)
/*ERROR: Numeric format F in PUT function requires a numeric argument.*/
/*Distribution by grouped Cylinders+MPG_City-Working well***/
/*Distribution by grouped Cylinders+MPG_City-Working well***/
/*Distribution by grouped Cylinders+MPG_City-Working well***/
proc format;
value F1mt
.='Uknown'
1-4='1-4'
5-high='5+'
;
Run;
%BBB(sourceTbl=sashelp.cars,
TargetTbl=wanted,
list1=Cylinders MPG_City,
list2=F1mt  best)
/*Distribution by grouped Cylinders+grouped MPG_City-Working well***/
/*Distribution by grouped Cylinders+grouped MPG_City-Working well***/
/*Distribution by grouped Cylinders+grouped MPG_City-Working well***/
proc format;
value F1mt
.='Uknown'
1-4='1-4'
5-high='5+'
;
Run;
proc format;
value F2mt
.='Uknown'
low-25='Ad 25'
25-high='25+'
;
Run;
%BBB(sourceTbl=sashelp.cars,
TargetTbl=wanted,
list1=Cylinders MPG_City,
list2=F1mt  F2mt)
/*Distribution by  grouped Cylinders-Working well***/
/*Distribution by  grouped Cylinders-Working well***/
/*Distribution by  grouped Cylinders-Working well***/
%BBB(sourceTbl=sashelp.cars,
TargetTbl=wanted,
list1=Cylinders,
list2=F1mt)
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
