Hello, everybody!
I have a program which generates a data set with one character variable and several numeric variables from data values with proc transpose. After that I need to add a variable to my data set which is equal to the sum of all these generated variables. The problem is that the names and amount of these variables are always different and I cant't get how to make SAS sum them I was googling all the evening long but haven't found the solution
Your help would be highly appreciated!
P.S. This is my first post here, so, please, be patient if I I posted my question in the wrong place or if the solution seems obvious for you, I am just learning SAS
A more sophisticated macro application with extensive parameter checking would provide protection against this and more... I thought I'd keep it simple as the original poster stated he was just learning SAS, and I didn't want to over complicate the issue. However, something like this would be a good start towards a more robust app:
%macro SumNums(DSN_In,DSN_Out, Sumvar); %local Varnames LIB DSN; /* Did the user provide an input dataset name? */ %if &DSN_In= %then %do; %put ERROR: (SumNums) You must provide the name of the data set you wish to process.; %SyntaxHelp: %put NOTE- SYNTAX: %NRSTR(%%SumNums%(DSN_In,DSN_Out,Sumvar%)); %put NOTE- DSN_In =Name of data set to process; %put NOTE- DSN_Out=Name of output data set; %put NOTE- Sumvar =Variable name to hold the sum (default is Total); %put NOTE- Example: %NRSTR(%%SumNums%(sashelp.class,work.test%)); %GoTo EndMacro; %end; %let DSN_In=%QUPCASE(&DSN_In); /* Did the user ask for syntax help? */ %if &DSN_In=!HELP %then %do; %GoTo SyntaxHelp; %end; /* Did the user provide an output dataset name? */ %if &DSN_Out= %then %do; %put ERROR: (SumNums) You must provide the name of the output data set.; %GoTo SyntaxHelp; %end; %let DSN_Out=%QUPCASE(&DSN_Out); /* Is the sum variable given a valid SAS variable name? */ %if &sumvar= %then %let Sumvar=Total; %if not %sysfunc(nvalid(%superq(Sumvar))) %then %do; %put ERROR: (SumNums) %superq(Sumvar) is not a valid SAS variable name.; %GoTo SyntaxHelp; %end; %let DSN=%QSCAN(&DSN_In,2,.); %if &DSN= %then %do; %let DSN=%SUPERQ(DSN_In); %let LIB=WORK; %end; %else %let LIB=%QSCAN(&DSN_In,1,.); proc sql noprint; select name into :Varnames separated by ' ' from dictionary.columns where LIBNAME="&LIB" and MEMNAME="&DSN" and TYPE='num' ; quit; /* Were there any numeric variables in the input data set? */ %if &SQLOBS=0 %then %do; %put WARNING: (SumNums) dataset &DSN_In contained no numeric variables.; %GoTo EndMacro; %end; /* Do the deed! */ data &DSN_out; set &DSN_In; &Sumvar=sum(of &varnames); run; %EndMacro: %Mend SumNums;
After compiling, you just call the macro to do the deed.
Example of a call for SYNTAX help:
%SumNums(!HELP)
Example of a call to calculate the total for SASHELP.CLASS:
%SumNums(sashelp.class,work.test,MySumVar)
data want;
set have;
total = sum(of _numeric_);
run;
Tom wrote:
data want;
set have;
total = sum(of _numeric_);
run;
It should be noted that with this syntax TOTAL will also appear on the RIGHT and will be summed. Of course in this example the value is always missing when the sums take place so all is well.
However, there could be problems with other functions particularly NMISS. It might be prudent to make a more carefully defined list.
I like to use arrays to create lists where I have more control of the elements, consider this example. Using the unexecuted SET is unnecessary but illustrates a more general solution where the list is derived in a more complex way.
data class;
if 0 then set sashelp.class(keep=_numeric_);
array v
_numeric_; set sashelp.class;
nmiss1 = nmiss(of _numeric_);
nmiss2 = nmiss(of v
); sum = sum(of v
); n = n(of v
); run;
proc print;
run;
Obs Age Height Weight Name Sex nmiss1 nmiss2 sum n
1 14 69.0 112.5 Alfred M 1 0 195.5 3
2 13 56.5 84.0 Alice F 1 0 153.5 3
3 13 65.3 98.0 Barbara F 1 0 176.3 3
4 14 62.8 102.5 Carol F 1 0 179.3 3
5 14 63.5 102.5 Henry M 1 0 180.0 3
6 12 57.3 83.0 James M 1 0 152.3 3
7 12 59.8 84.5 Jane F 1 0 156.3 3
8 15 62.5 112.5 Janet F 1 0 190.0 3
9 13 62.5 84.0 Jeffrey M 1 0 159.5 3
10 12 59.0 99.5 John M 1 0 170.5 3
11 11 51.3 50.5 Joyce F 1 0 112.8 3
12 14 64.3 90.0 Judy F 1 0 168.3 3
13 12 56.3 77.0 Louise F 1 0 145.3 3
14 15 66.5 112.0 Mary F 1 0 193.5 3
15 16 72.0 150.0 Philip M 1 0 238.0 3
16 12 64.8 128.0 Robert M 1 0 204.8 3
17 15 67.0 133.0 Ronald M 1 0 215.0 3
18 11 57.5 85.0 Thomas M 1 0 153.5 3
19 15 66.5 112.0 William M 1 0 193.5 3
beware use of _numeric_ in a general context as a data set might bring in only _character_ variables.
One (fairly) safe solution
Array v(*) _n_ _numeric_ ;
Should you wish to ignore that _n_ in do loops, just
Do index = 2 to dim(v);
As _n_ will always be numeric
num_total = sum( of v(*))-_ n_;
purists will point to the rare possibility that _n_ can become too large for integer precision and risk the num_total becoming incorrect.
Peter
Good point that _NUMERIC_ might be NULL and SAS will not like that. However adding a variable to unsure the array has at least one element would not be compatible with the statistic functions in general. Guess we can't have out cake and eat it too.:smileygrin: Unless we resort to code gen.
not hard to test an array having only one element to avoid trying to produce those stats.
In that kind of way maybe we should avoid producing a deviation of one, etc
Peter.C wrote:
not hard to test an array having only one element to avoid trying to produce those stats.
In that kind of way maybe we should avoid producing a deviation of one, etc
That's not the problem I was thinking about. Your example uses _N_ which is OK I suppose but it should be set to missing or better still use a DUMMY numeric that is always missing. Then you don't have to "correct" any of the results except NMISS or CMISS. At least I don't think any of the other statistic functions would need correcting.
This is a great place to use a little SQL and macro to more precisely controlwhich variables you sum up later. For my example, I'll use the CLASS data set in the SASHELP library.
1. First execute this SQL code (which can be used as a template) to get a list of all the numeric variables in the target dataset, collected a macro variable named VARLIST:
proc sql noprint; select name into :Varnames separated by ' ' from dictionary.columns where LIBNAME='SASHELP' and MEMNAME='CLASS' and TYPE='num'; quit;
2. Next, use the macro variable to write the list for you in the subsequent DATA step code:
data want; set sashelp.class; total=sum(of &varnames); run;
Partial results below:
Obs | Name | Sex | Age | Height | Weight | total |
---|---|---|---|---|---|---|
1 | Alfred | M | 14 | 69.0 | 112.5 | 195.5 |
2 | Alice | F | 13 | 56.5 | 84.0 | 153.5 |
3 | Barbara | F | 13 | 65.3 | 98.0 | 176.3 |
4 | Carol | F | 14 | 62.8 | 102.5 | 179.3 |
5 | Henry | M | 14 | 63.5 | 102.5 | 180.0 |
have you any protection against the source data set having no numeric variables?
total = sum(of total &varnames);
Remember add a %LET varnames=; statement BEFORE the SQL query so that any existing list of values is not retained in the macro variable. PROC SQL will not modify the value of the target macro variable when no rows are selected.
Of course this gets us back to the original reply.
total=sum(of _numeric_);
%macro vnumlist(lib, sds, svar);
%global g_vnumsum;
proc sql noprint;
select count(*) into :vnobs
from sashelp.vcolumn
where libname="%upcase(&lib)"
and memname="%upcase(&sds)"
and type='num';
%if &vnobs>0 %then
%do;
select name into :vnames separated by ' '
from sashelp.vcolumn
where libname="%upcase(&lib)"
and memname="%upcase(&sds)"
and type='num';
%let g_vnumsum=&svar=sum(of &vnames);
%end;
%else
%do;
%let g_vnumsum=%str();
%put WARNING: No Numeric Variables to Sum;
%end;
quit;
%mend;
%vnumlist(sashelp, class, total)
data want;
set sashelp.class;
&g_vnumsum;
run;
If you want to create a macro you can make it more efficient.
First you do not need to query twice to find the number of variables. SAS will set the automatic variable SQLOBS with the count.
Second can depend it not changing the value of the existing variable when there are no rows selected.
%macro vnumlist(lib, sds, svar);
%global g_vnumsum;
%let g_vnumsum=;
proc sql noprint;
select name into :g_vnumsum separated by ' '
from dictionary.columns
where libname="%upcase(&lib)"
and memname="%upcase(&sds)"
and type='num'
;
quit;
%if &sqlobs %then %let g_vnumsum=&svar = sum(of &g_vnumsum);
%else %put WARNING: No Numeric Variables to Sum;
%mend vnumlist;
Nicely done Tom, thanks.
A more sophisticated macro application with extensive parameter checking would provide protection against this and more... I thought I'd keep it simple as the original poster stated he was just learning SAS, and I didn't want to over complicate the issue. However, something like this would be a good start towards a more robust app:
%macro SumNums(DSN_In,DSN_Out, Sumvar); %local Varnames LIB DSN; /* Did the user provide an input dataset name? */ %if &DSN_In= %then %do; %put ERROR: (SumNums) You must provide the name of the data set you wish to process.; %SyntaxHelp: %put NOTE- SYNTAX: %NRSTR(%%SumNums%(DSN_In,DSN_Out,Sumvar%)); %put NOTE- DSN_In =Name of data set to process; %put NOTE- DSN_Out=Name of output data set; %put NOTE- Sumvar =Variable name to hold the sum (default is Total); %put NOTE- Example: %NRSTR(%%SumNums%(sashelp.class,work.test%)); %GoTo EndMacro; %end; %let DSN_In=%QUPCASE(&DSN_In); /* Did the user ask for syntax help? */ %if &DSN_In=!HELP %then %do; %GoTo SyntaxHelp; %end; /* Did the user provide an output dataset name? */ %if &DSN_Out= %then %do; %put ERROR: (SumNums) You must provide the name of the output data set.; %GoTo SyntaxHelp; %end; %let DSN_Out=%QUPCASE(&DSN_Out); /* Is the sum variable given a valid SAS variable name? */ %if &sumvar= %then %let Sumvar=Total; %if not %sysfunc(nvalid(%superq(Sumvar))) %then %do; %put ERROR: (SumNums) %superq(Sumvar) is not a valid SAS variable name.; %GoTo SyntaxHelp; %end; %let DSN=%QSCAN(&DSN_In,2,.); %if &DSN= %then %do; %let DSN=%SUPERQ(DSN_In); %let LIB=WORK; %end; %else %let LIB=%QSCAN(&DSN_In,1,.); proc sql noprint; select name into :Varnames separated by ' ' from dictionary.columns where LIBNAME="&LIB" and MEMNAME="&DSN" and TYPE='num' ; quit; /* Were there any numeric variables in the input data set? */ %if &SQLOBS=0 %then %do; %put WARNING: (SumNums) dataset &DSN_In contained no numeric variables.; %GoTo EndMacro; %end; /* Do the deed! */ data &DSN_out; set &DSN_In; &Sumvar=sum(of &varnames); run; %EndMacro: %Mend SumNums;
After compiling, you just call the macro to do the deed.
Example of a call for SYNTAX help:
%SumNums(!HELP)
Example of a call to calculate the total for SASHELP.CLASS:
%SumNums(sashelp.class,work.test,MySumVar)
I like the improved usability of using a single parameter to specify the dataset name versus separate libname and membername fields.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.