02-07-2017 12:05 PM
I am trying to create a new variable called "Diff" by subtracting two columns. When those columns have missing variables, I want diff to be missing too. In a data step, I tried "diff = sales-cost;" and got an error. Then I tried "diff=sum(sales,-cost);" which worked but gave me values for all of diff, when there should be some missing. I want it to look like:
sales cost diff
. 8.99 .
and right now it looks like:
sales cost diff
. 8.99 -8.99
02-07-2017 12:10 PM
What's the error you are getting. I ran the sample code and I get the output as you wanted.
Also refer the difference between sum statement and sum variable.
data have ; input sale cost ; Diff=sale-cost ; datalines ; . 10.23 15 3.72 17.23 . ; run ; proc print data=have ;run ;
02-07-2017 12:23 PM
please show a case where you use
and you fail to get diff=. when either sales or price (or both) is missing.
... unless you have a subsequent "diff=' assignment later in the program.
02-07-2017 12:27 PM
I do reference diff again, as I am making another categorical variable called "change"
input unitcostprice unitsalesprice;
diff = unitsalesprice - unitcostprice;
if diff <= 10 then Change="LOW";
else if 10 < diff <= 20 then Change="MED";
There are no missing values for change and I want there to be.
02-07-2017 12:30 PM
Diff is missing, but missing values compare low to all valid numerics (like 10). So change='LOW'.
if not(missing(diff)) and diff < 10 then change='LOW';
02-07-2017 12:31 PM
Use the below code. I added one more condition. Since your first condition checks for any values less or equal to 10 the nulls/missing falls into that bucket. You can correct it by checking for them.
data have ; input sale cost ; Diff=sale-cost ; if diff <= 10 and diff > 0 then Change="LOW"; else if 10 < diff <= 20 then Change="MED"; else Change="HI"; datalines ; . 10.23 15 3.72 17.23 . ; run ; proc print data=have ;run ;