DATA Step, Macro, Functions and more

Summing all numeric variables (flexible names and amount of variables)

Accepted Solution Solved
Reply
New Contributor
Posts: 4
Accepted Solution

Summing all numeric variables (flexible names and amount of variables)

Hello, everybody!

I have a program which generates a data set with one character variable and several numeric variables from data values with proc transpose. After that I need to add a variable to my data set which is equal to the sum of all these generated variables. The problem is that the names and amount of these variables are always different and I cant't get how to make SAS sum them Smiley Happy I was googling all the evening long but haven't found the solution Smiley Sad

Your help would be highly appreciated!

P.S. This is my first post here, so, please, be patient if I I posted my question in the wrong place or if the solution seems obvious for you, I am just learning SAS Smiley Happy


Accepted Solutions
Solution
‎10-11-2011 04:51 PM
SAS Employee
Posts: 104

Re: Summing all numeric variables (flexible names and amount of variables)

A more sophisticated macro application with extensive parameter checking would provide protection against this and more...  I thought I'd keep it simple as the original poster stated he was just learning SAS, and I didn't want to over complicate the issue.  However, something like this would be a good start towards a more robust app:

 
%macro SumNums(DSN_In,DSN_Out, Sumvar);
%local Varnames LIB DSN;
  /* Did the user provide an input dataset name? */
%if &DSN_In= %then %do;
   %put ERROR: (SumNums) You must provide the name of the data set you wish to process.;
   %SyntaxHelp:
   %put NOTE- SYNTAX:  %NRSTR(%%SumNums%(DSN_In,DSN_Out,Sumvar%));
   %put NOTE-          DSN_In =Name of data set to process;
   %put NOTE-          DSN_Out=Name of output data set;
   %put NOTE-          Sumvar =Variable name to hold the sum (default is Total);
   %put NOTE- Example: %NRSTR(%%SumNums%(sashelp.class,work.test%));
   %GoTo EndMacro;
%end;
%let DSN_In=%QUPCASE(&DSN_In);

  /* Did the user ask for syntax help? */
%if &DSN_In=!HELP %then %do;
   %GoTo SyntaxHelp;
%end;
  /* Did the user provide an output dataset name? */
%if &DSN_Out= %then %do;
   %put ERROR: (SumNums) You must provide the name of the output data set.;
   %GoTo SyntaxHelp;
%end;
%let DSN_Out=%QUPCASE(&DSN_Out);
  /* Is the sum variable given a valid SAS variable name? */
%if &sumvar= %then %let Sumvar=Total;
%if not %sysfunc(nvalid(%superq(Sumvar))) %then %do;
   %put ERROR: (SumNums) %superq(Sumvar) is not a valid SAS variable name.;
   %GoTo SyntaxHelp;
%end;

%let DSN=%QSCAN(&DSN_In,2,.);
%if &DSN= %then %do;
   %let DSN=%SUPERQ(DSN_In);
   %let LIB=WORK;
%end;
%else %let LIB=%QSCAN(&DSN_In,1,.);

proc sql noprint;
select name into :Varnames separated by ' '
  from dictionary.columns
  where LIBNAME="&LIB" 
    and MEMNAME="&DSN"
    and TYPE='num'
;
quit;

  /* Were there any numeric variables in the input data set? */
%if &SQLOBS=0 %then %do;
   %put WARNING: (SumNums) dataset &DSN_In contained no numeric variables.;
   %GoTo EndMacro;
%end;
  /* Do the deed! */
data &DSN_out;
   set &DSN_In;
   &Sumvar=sum(of &varnames);
run;

%EndMacro:
%Mend SumNums;

After compiling, you just call the macro to do the deed.

Example of a call for SYNTAX help:

%SumNums(!HELP)

Example of a call to calculate the total for SASHELP.CLASS:

%SumNums(sashelp.class,work.test,MySumVar)

View solution in original post


All Replies
Super User
Super User
Posts: 7,076

Summing all numeric variables (flexible names and amount of variables)

data want;

set have;

  total = sum(of _numeric_);

run;

Respected Advisor
Posts: 3,799

Re: Summing all numeric variables (flexible names and amount of variables)

Tom wrote:

data want;

set have;

  total = sum(of _numeric_);

run;


It should be noted that with this syntax TOTAL will also appear on the RIGHT and will be summed.  Of course in this example the value is always missing when the sums take place so all is well.

However, there could be problems with other functions particularly NMISS.  It might be prudent to make a more carefully defined list.

I like to use arrays to create lists where I have more control of the elements, consider this example.  Using the unexecuted SET is unnecessary but illustrates a more general solution where the list is derived in a more complex way.

data class;

   if 0 then set sashelp.class(keep=_numeric_);

   array v

  • _numeric_;

       set sashelp.class;

       nmiss1 = nmiss(of _numeric_);

       nmiss2 = nmiss(of v

  • );
  •    sum    = sum(of v

  • );
  •    n      =   n(of v

  • );
  •    run;

    proc print;

       run;

    Obs    Age    Height    Weight    Name       Sex    nmiss1    nmiss2     sum     n

      1     14     69.0      112.5    Alfred      M        1         0      195.5    3

      2     13     56.5       84.0    Alice       F        1         0      153.5    3

      3     13     65.3       98.0    Barbara     F        1         0      176.3    3

      4     14     62.8      102.5    Carol       F        1         0      179.3    3

      5     14     63.5      102.5    Henry       M        1         0      180.0    3

      6     12     57.3       83.0    James       M        1         0      152.3    3

      7     12     59.8       84.5    Jane        F        1         0      156.3    3

      8     15     62.5      112.5    Janet       F        1         0      190.0    3

      9     13     62.5       84.0    Jeffrey     M        1         0      159.5    3

    10     12     59.0       99.5    John        M        1         0      170.5    3

    11     11     51.3       50.5    Joyce       F        1         0      112.8    3

    12     14     64.3       90.0    Judy        F        1         0      168.3    3

    13     12     56.3       77.0    Louise      F        1         0      145.3    3

    14     15     66.5      112.0    Mary        F        1         0      193.5    3

    15     16     72.0      150.0    Philip      M        1         0      238.0    3

    16     12     64.8      128.0    Robert      M        1         0      204.8    3

    17     15     67.0      133.0    Ronald      M        1         0      215.0    3

    18     11     57.5       85.0    Thomas      M        1         0      153.5    3

    19     15     66.5      112.0    William     M        1         0      193.5    3

    
    					
    				
    			
    			
    				
    			
    			
    			
    			
    			
    			
    			
    		
    Valued Guide
    Posts: 2,177

    Summing all numeric variables (flexible names and amount of variables)

    Posted in reply to data_null__

    beware use of _numeric_ in a general context as a data set might bring in only _character_ variables.

    One (fairly) safe solution

    Array v(*) _n_ _numeric_ ;

    Should you wish to ignore that _n_ in do loops, just

    Do index = 2 to dim(v);

    As _n_ will always be numeric

    num_total = sum( of v(*))-_ n_;

    purists will point to the rare possibility that _n_ can become too large for integer precision and risk the num_total becoming incorrect.

    Peter

    Respected Advisor
    Posts: 3,799

    Summing all numeric variables (flexible names and amount of variables)

    Good point that _NUMERIC_ might be NULL and SAS will not like that.  However adding a variable to unsure the array has at least one element would not be compatible with the statistic functions in general.  Guess we can't have out cake and eat it too.:smileygrin: Unless we resort to code gen.

    Valued Guide
    Posts: 2,177

    Summing all numeric variables (flexible names and amount of variables)

    Posted in reply to data_null__

    not hard to test an array having only one element to avoid trying to produce those stats.

    In that kind of way maybe we should avoid producing a deviation of one, etc

    Respected Advisor
    Posts: 3,799

    Summing all numeric variables (flexible names and amount of variables)

    Peter.C wrote:

    not hard to test an array having only one element to avoid trying to produce those stats.

    In that kind of way maybe we should avoid producing a deviation of one, etc

    That's not the problem I was thinking about.  Your example uses _N_ which is OK I suppose but it should be set to missing or better still use a DUMMY numeric that is always missing.  Then you don't have to "correct" any of the results except NMISS or CMISS.  At least I don't think any of the other statistic functions would need correcting.

    SAS Employee
    Posts: 104

    Re: Summing all numeric variables (flexible names and amount of variables)

    This is a great place to use a little SQL and macro to more precisely controlwhich variables you sum up later.  For my example, I'll use the CLASS data set in the SASHELP library.

    1.  First execute this SQL code (which can be used as a template) to get a list of all the numeric variables in the target dataset, collected a macro variable named VARLIST:

    proc sql noprint;
    select name into :Varnames separated by ' '
      from dictionary.columns
      where LIBNAME='SASHELP' 
        and MEMNAME='CLASS'
        and TYPE='num';
    quit;

    2. Next, use the macro variable to write the list for you in the subsequent DATA step code:

    data want;
       set sashelp.class;
       total=sum(of &varnames);
    run;
    

    Partial results below:


    ObsNameSexAgeHeightWeighttotal
    1AlfredM1469.0112.5195.5
    2AliceF1356.584.0153.5
    3BarbaraF1365.398.0176.3
    4CarolF1462.8102.5179.3
    5HenryM1463.5102.5180.0
    Valued Guide
    Posts: 2,177

    Summing all numeric variables (flexible names and amount of variables)

    have you any protection against the source data set having no numeric variables?

    Super User
    Super User
    Posts: 7,076

    Re: Summing all numeric variables (flexible names and amount of variables)

    total = sum(of total &varnames);

    Remember add a %LET varnames=; statement BEFORE the SQL query so that any existing list of values is not retained in the macro variable.  PROC SQL will not modify the value of the target macro variable when no rows are selected.

    Of course this gets us back to the original reply.

    total=sum(of _numeric_);

    Trusted Advisor
    Posts: 1,301

    Summing all numeric variables (flexible names and amount of variables)

    %macro vnumlist(lib, sds, svar);

    %global g_vnumsum;

    proc sql noprint;

      select count(*) into :vnobs

        from sashelp.vcolumn

       where libname="%upcase(&lib)"

                   and memname="%upcase(&sds)"

                         and type='num';

    %if &vnobs>0 %then

      %do;

       select name into :vnames separated by ' '

         from sashelp.vcolumn

        where libname="%upcase(&lib)"

                    and memname="%upcase(&sds)"

                          and type='num';

       %let g_vnumsum=&svar=sum(of &vnames);

      %end;

    %else

      %do;

       %let g_vnumsum=%str();

       %put WARNING: No Numeric Variables to Sum;

      %end;

    quit;

    %mend;

    %vnumlist(sashelp, class, total)

    data want;

    set sashelp.class;

    &g_vnumsum;

    run;

    Super User
    Super User
    Posts: 7,076

    Re: Summing all numeric variables (flexible names and amount of variables)

    If you want to create a macro you can make it more efficient.

    First you do not need to query twice to find the number of variables. SAS will set the automatic variable SQLOBS with the count.

    Second can depend it not changing the value of the existing variable when there are no rows selected.

    %macro vnumlist(lib, sds, svar);

    %global g_vnumsum;

    %let g_vnumsum=;

    proc sql noprint;

       select name into :g_vnumsum separated by ' '

         from dictionary.columns

         where libname="%upcase(&lib)"

           and memname="%upcase(&sds)"

           and type='num'

       ;

    quit;

    %if &sqlobs %then %let g_vnumsum=&svar = sum(of &g_vnumsum);

    %else %put WARNING: No Numeric Variables to Sum;

    %mend vnumlist;

    Trusted Advisor
    Posts: 1,301

    Summing all numeric variables (flexible names and amount of variables)

    Nicely done Tom, thanks.

    Solution
    ‎10-11-2011 04:51 PM
    SAS Employee
    Posts: 104

    Re: Summing all numeric variables (flexible names and amount of variables)

    A more sophisticated macro application with extensive parameter checking would provide protection against this and more...  I thought I'd keep it simple as the original poster stated he was just learning SAS, and I didn't want to over complicate the issue.  However, something like this would be a good start towards a more robust app:

     
    %macro SumNums(DSN_In,DSN_Out, Sumvar);
    %local Varnames LIB DSN;
      /* Did the user provide an input dataset name? */
    %if &DSN_In= %then %do;
       %put ERROR: (SumNums) You must provide the name of the data set you wish to process.;
       %SyntaxHelp:
       %put NOTE- SYNTAX:  %NRSTR(%%SumNums%(DSN_In,DSN_Out,Sumvar%));
       %put NOTE-          DSN_In =Name of data set to process;
       %put NOTE-          DSN_Out=Name of output data set;
       %put NOTE-          Sumvar =Variable name to hold the sum (default is Total);
       %put NOTE- Example: %NRSTR(%%SumNums%(sashelp.class,work.test%));
       %GoTo EndMacro;
    %end;
    %let DSN_In=%QUPCASE(&DSN_In);
    
      /* Did the user ask for syntax help? */
    %if &DSN_In=!HELP %then %do;
       %GoTo SyntaxHelp;
    %end;
      /* Did the user provide an output dataset name? */
    %if &DSN_Out= %then %do;
       %put ERROR: (SumNums) You must provide the name of the output data set.;
       %GoTo SyntaxHelp;
    %end;
    %let DSN_Out=%QUPCASE(&DSN_Out);
      /* Is the sum variable given a valid SAS variable name? */
    %if &sumvar= %then %let Sumvar=Total;
    %if not %sysfunc(nvalid(%superq(Sumvar))) %then %do;
       %put ERROR: (SumNums) %superq(Sumvar) is not a valid SAS variable name.;
       %GoTo SyntaxHelp;
    %end;
    
    %let DSN=%QSCAN(&DSN_In,2,.);
    %if &DSN= %then %do;
       %let DSN=%SUPERQ(DSN_In);
       %let LIB=WORK;
    %end;
    %else %let LIB=%QSCAN(&DSN_In,1,.);
    
    proc sql noprint;
    select name into :Varnames separated by ' '
      from dictionary.columns
      where LIBNAME="&LIB" 
        and MEMNAME="&DSN"
        and TYPE='num'
    ;
    quit;
    
      /* Were there any numeric variables in the input data set? */
    %if &SQLOBS=0 %then %do;
       %put WARNING: (SumNums) dataset &DSN_In contained no numeric variables.;
       %GoTo EndMacro;
    %end;
      /* Do the deed! */
    data &DSN_out;
       set &DSN_In;
       &Sumvar=sum(of &varnames);
    run;
    
    %EndMacro:
    %Mend SumNums;
    

    After compiling, you just call the macro to do the deed.

    Example of a call for SYNTAX help:

    %SumNums(!HELP)

    Example of a call to calculate the total for SASHELP.CLASS:

    %SumNums(sashelp.class,work.test,MySumVar)
    Super User
    Super User
    Posts: 7,076

    Re: Summing all numeric variables (flexible names and amount of variables)

    I like the improved usability of using a single parameter to specify the dataset name versus separate libname and membername fields.

    • You are testing the value of the parameters before you have protected them with quoting functions.
    • You should check all of the parameters you can before bailing out.
    • You might want to test for existence of the input dataset.
    • You might want to exclude &SUMVAR from the list of variables to include in the sum.
    • For even more usability you can establish defaults for some parameters.  To me the only required parameter here is the target variable name.  The default to the input dataset could be &syslast.  The default for the output dataset could be null string so that SAS will create DATAnn where nn is the next number after the last DATAnn that it created.
    🔒 This topic is solved and locked.

    Need further help from the community? Please ask a new question.

    Discussion stats
    • 17 replies
    • 14397 views
    • 6 likes
    • 7 in conversation