Hi Everyone,
Running into an issue and I'm not sure what I'm missing here. I have 8 numeric variables, that I sum together, and then subtract a 9th numeric variable from the summation. When I simply sum the first 8 I get a valid output, however, when I try subtracting the 9th I get a period (.) There are some instances of missing values in all 9 columns, that's why I thought sum() was the best approach.
Example code below:
DATA WORK.WANT; SET WORK.HAVE; TotalCount1 = sum(of Total1 -- Total8); TotalCount = sum(TotalCount1 -Total9); DROP TotalCount1; RUN;
I'd prefer, this step below, but neither have worked
DATA WORK.WANT; SET WORK.HAVE; TotalCount = sum(sum(Total1 -- Total8) - Total9); RUN;
Hi @BlayLay,
It is good practice to use IF conditions to deal with missing values appropriately and to avoid notes like "Missing values were generated ..." in the log.
Example:
data want;
set have;
if n(of Total1-Total8)=0 then TotalCount=.;
else TotalCount=sum(of Total1-Total8);
if n(Total9) then TotalCount=sum(TotalCount, -Total9);
run;
This will result in a missing value of TotalCount only if all nine totals are missing. In all other cases missing values are treated as zeros in the calculation. If your rules are different, it will be no problem to adapt the code correspondingly.
Try
TotalCount = sum(of Total1 -- Total8) - Total9;
@BlayLay wrote:
No luck...still missing values as output
I did not say you would get a result, but you don't have logic or syntax errors.
Provide example values of all 9 variables where the result is missing and what you expect for a result.
Best would be as a data step.
Have you been trying to use -Total9 as part of the Sum function because Total9 is sometimes missing and want the sum of the rest in that case?
Or are you just trying to remove the note from the log?
Any subtraction operation will result in a missing result if the value of any of the operands is missing.
If you expect 10 - . to have a value of 10 then your "missing" is incorrect and requires a value of 0 for numeric operations.
If you don't want the note, as @Tom said, you need to check the value before the subtraction:
if not missing(total9) then TotalCount = sum(of Total1 -- Total8) - Total9;
but, unless you tell what you expect as a non-missing result there is no way to provide one.
A special missing could be provide if Total9 is missing to indicate that was why.
if not missing(total9) then TotalCount = sum(of Total1 -- Total8) - Total9; else TotalCount = .N;
The .N is still missing and is used in numeric calculation just like missing but you can display the value, best with a custom format, that lets you know specifically why it was missing.
What do you expect for a value when Total1 through Total 8 are missing but Total9 is not?
@ballardw to your point, "Any subtraction operation will result in a missing result if the value of any of the operands is missing" was my concern and why I didn't go with a basic subtraction function. Due to the fact some columns may have missing values. Below is a small-scale example, and i've added "TotalCount" as the desired output. Same concept applies, summing (Total1 -- Total2) and then subtracting Total3 to get TotalCount.
Total1 | Total2 | Total3 | TotalCount |
1 | 0 | 1 | 0 |
1 | 1 | . | 2 |
0 | 0 | 1 | -1 |
@mklangley code seems to work, however, I just wanted to make sure that I'm going about this correctly.
By using - instead of comma in the SUM() function call you are summing only one value, the result of the normal subtraction. So you are defeating the purpose of using the SUM() function.
TotalCount = sum(of total1-total8, -total9);
If you want to avoid the NOTE in the log about missing values you will still have to check if TOTAL9 is missing.
Try this:
data have;
input Total1-Total9;
datalines;
. 1 1 1 1 1 1 1 1
1 . 1 1 1 1 1 1 1
1 1 . 1 1 1 1 1 1
1 1 1 . 1 1 1 1 1
1 1 1 1 . 1 1 1 1
1 1 1 1 1 . 1 1 1
1 1 1 1 1 1 . 1 1
1 1 1 1 1 1 1 . 1
1 1 1 1 1 1 1 1 .
;
run;
data want;
set have;
TotalCount = sum(sum(of Total1 -- Total8), -Total9);
run;
Hi @BlayLay,
It is good practice to use IF conditions to deal with missing values appropriately and to avoid notes like "Missing values were generated ..." in the log.
Example:
data want;
set have;
if n(of Total1-Total8)=0 then TotalCount=.;
else TotalCount=sum(of Total1-Total8);
if n(Total9) then TotalCount=sum(TotalCount, -Total9);
run;
This will result in a missing value of TotalCount only if all nine totals are missing. In all other cases missing values are treated as zeros in the calculation. If your rules are different, it will be no problem to adapt the code correspondingly.
You can use an extra SUM() function call (or a COALESCE() function call) to handle the missing values without notes in the log.
total=sum(of x1-x3,-sum(0,x4));
But that will return zero when all values are missing. If you want missing in that case then test whether there is at least one non-missing value. You could use the N() function.
Here is a simple example with 4 input variables.
data test;
do x1=.,1; do x2=.,2; do x3=.,4; do x4=.,1;
output;
end;end;end;end;
run;
data want;
set test;
if n(of x1-x4) then total=sum(of x1-x3,-sum(0,x4));
run;
Results:
Obs x1 x2 x3 x4 total 1 . . . . . 2 . . . 1 -1 3 . . 4 . 4 4 . . 4 1 3 5 . 2 . . 2 6 . 2 . 1 1 7 . 2 4 . 6 8 . 2 4 1 5 9 1 . . . 1 10 1 . . 1 0 11 1 . 4 . 5 12 1 . 4 1 4 13 1 2 . . 3 14 1 2 . 1 2 15 1 2 4 . 7 16 1 2 4 1 6
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.