Hi,
I am trying to join two datasets, A and B. While joining them I am applying the following condition:
if (A1=B1) AND (A2=B2) AND (ABS(A3-B3) LE 0.05 OR ABS(A4-B4) LE 0.05)
In the above condition, if any of A3, B3, A4, or B4 is missing, the difference inside ABS() evaluates to missing (.). Because a missing value compares as less than any number, the test (ABS(A3-B3) LE 0.05 OR ABS(A4-B4) LE 0.05) then always comes out true.
Is there any way to treat missing values as the largest possible numbers?
Thanks in advance,
Sheeba Swaminathan
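The behaviour described in the question can be reproduced with a small DATA step. This is only an illustrative sketch; the values and the variable names diff and flag are made up for the demonstration.

```sas
data _null_;
    a3 = .;
    b3 = 0.2;
    diff = abs(a3 - b3);      /* missing, because a3 is missing */
    flag = (diff le 0.05);    /* 1 (true): missing compares below any number */
    put diff= flag=;          /* writes diff=. flag=1 to the log */
run;
```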
Do A3 and B3 or A4 and B4 ever both have missing values? If so you may have to add some additional levels of comparison such as
( ( ABS(A3-B3) LE 0.05) and not ( missing(A3) and Missing(B3) ) )
but more details, such as actual values and the desired results, may be needed. Sometimes it may be better to subset the data into easier chunks and then recombine.
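Note that the check above only excludes the case where both values are missing. If a match should also be rejected when just one of the pair is missing, one possible variant (a sketch, not necessarily what was intended above) uses the N function, which counts nonmissing arguments. The variable is_close and the sample values are illustrative only.

```sas
data _null_;
    input A3 B3;
    /* true only when both values are present and within the tolerance */
    is_close = (n(A3, B3) = 2 and abs(A3 - B3) le 0.05);
    put A3= B3= is_close=;
    datalines;
0.10 0.12
. 0.03
. .
;
run;
```

Only the first row should be flagged as close; the rows with one or two missing values should not.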
If you use the SUM function instead of plain arithmetic, missing values are handled differently.
For example:
data temp;
a = .;
b= 5;
c=sum(a,-b); /* c evaluates to -5 */
d = a-b; /* d evaluates to missing */
run;
Try:
if (A1=B1) AND (A2=B2) AND (ABS(SUM(A3,-B3)) LE 0.05 OR ABS(SUM(A4,-B4)) LE 0.05)
Hi Tmiles,
Thanks a lot for the reply.
I am worried about the situation where both values are missing. In that case the SUM function returns missing as well, so the result again compares as less than 0.05.
I modified the condition as follows, adding zero to each value to handle missing values, but if both A4 and B4 are missing the difference is zero, which is again less than 0.05.
if (A1=B1) AND (A2=B2) AND ((abs(sum(A3,0) - sum(B3,0)) le 0.05) or (abs(sum(A4,0) - sum(B4,0)) le 0.05))
Regards,
sheeba
You could always check for missing values before the subsetting IF and set them to a default value. This will only help if just one side of the equation is missing.
Is it safe to assume that if both sides of the equation are missing you want to handle the condition differently? If so, perhaps IF-THEN-ELSE logic would get you through it.
Something like:
if sum(a3,b3,a4,b4) > 0 then do;
    if (A1=B1) AND (A2=B2) AND (ABS(SUM(A3,-B3)) LE 0.05 OR ABS(SUM(A4,-B4)) LE 0.05) then ??;
end;
else do;
    ???
end;
Hi Tmiles,
Thanks for the quick reply.
Yes, I wouldn't want a match if both values are missing. Also, I am generating these conditions dynamically.
I will try this out.
Regards,
Sheeba
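Since the conditions are generated dynamically, one way to keep the missing-value guard in a single place might be a small function-style macro. The sketch below is only an illustration: the macro name within, its parameters, and the data set pairs are hypothetical and not from the thread.

```sas
/* Hypothetical helper: expands to a guarded tolerance test */
%macro within(x, y, tol);
    (n(&x, &y) = 2 and abs(&x - &y) le &tol)
%mend within;

/* Made-up sample rows: only the first should survive the subsetting IF */
data pairs;
    input A1 B1 A2 B2 A3 B3 A4 B4;
    datalines;
1 1 2 2 0.10 0.12 5 9
1 1 2 2 . 0.03 5 9
1 1 2 2 . . . .
;
run;

data matched;
    set pairs;
    if (A1 = B1) and (A2 = B2)
       and (%within(A3, B3, 0.05) or %within(A4, B4, 0.05));
run;
```

With this approach a pair counts as close only when both values are present, so rows where both values are missing do not match.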
Hi Ballardw,
Thanks a lot for the reply.
Right now the situation of missing values in both columns doesn't occur, but I would like to modify the code to handle it as well. Thanks a lot for the code.
I will also consider subsetting the data to filter out these conditions.
Thanks again,
Regards,
Sheeba
If you are concerned that, when you code a condition like
(A <= 0.5)
missing values of A make the condition true, then just change the condition to account for missing values:
(.Z < A <= 0.5)
Or
(A <= 0.5 and not missing(A))
In your specific example you could also just remove the ABS() function and code the negative and positive bounds directly:
-0.05 <= (A3-B3) <= 0.05
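The three forms can be compared side by side in a short DATA step. This is a sketch with made-up values; the flag names plain, guarded and ranged are illustrative, and the tolerance of 0.05 is the one from the original question.

```sas
data _null_;
    input A3 B3;
    diff = A3 - B3;
    plain   = (abs(diff) le 0.05);        /* true when diff is missing */
    guarded = (.Z < abs(diff) <= 0.05);   /* false when diff is missing */
    ranged  = (-0.05 <= diff <= 0.05);    /* false when diff is missing */
    put A3= B3= plain= guarded= ranged=;
    datalines;
0.10 0.12
. 0.03
. .
;
run;
```

For the rows containing a missing value, plain should come out as 1 while guarded and ranged come out as 0, because a missing value sorts below both .Z and -0.05.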
Hi Tom,
Thanks a lot for the suggestions. They are really helpful.
Regards,
sheeba .