Hi,
I am trying to join two datasets, A and B. While joining them I am applying the following condition:
if (A1=B1) AND (A2=B2) AND (ABS(A3-B3) LE 0.05 OR ABS(A4-B4) LE 0.05)
In the above condition, if any of A3, B3, A4 or B4 is missing, the difference (for example A3-B3) evaluates to missing (.). SAS treats a missing value as smaller than any number, so ((ABS(A3-B3) LE 0.05 OR ABS(A4-B4) LE 0.05)) is always true.
Is there any way to treat missing values as the largest possible numbers?
Thanks in advance,
Sheeba Swaminathan
Do A3 and B3 or A4 and B4 ever both have missing values? If so you may have to add some additional levels of comparison such as
( ( ABS(A3-B3) LE 0.05) and not ( missing(A3) and Missing(B3) ) )
but more details, such as actual values and the desired results, may be needed. Sometimes it may be better to subset the data into easier chunks and then recombine.
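For anyone who wants to see the effect, here is a minimal sketch on made-up rows. The dataset name check and the variables raw_match and safe_match are hypothetical, not from the original post. It assumes you want to reject a pair when either value is missing, so it uses NMISS, which is a slightly stricter guard than the both-missing check above.

```sas
/* Sketch only: hypothetical test rows, not the poster's data */
data check;
  input A3 B3;
  /* Unguarded: a missing difference compares as less than 0.05 */
  raw_match  = (abs(A3 - B3) le 0.05);
  /* NMISS counts missing arguments; 0 means both values are present */
  safe_match = (nmiss(A3, B3) = 0 and abs(A3 - B3) le 0.05);
  datalines;
1.00 1.03
1.00 .
. .
1.00 2.00
;
run;

proc print data=check;
run;
```

On these four rows raw_match should come out as 1, 1, 1, 0 and safe_match as 1, 0, 0, 0. The log will also carry the usual note that missing values were generated by the subtraction.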
If you use the SUM function instead of plain arithmetic, missing values are handled differently: SUM ignores missing arguments, while the minus operator propagates them.
For example:
data temp;
a = .;
b= 5;
c=sum(a,-b); /* c evaluates to -5 */
d = a-b; /* d evaluates to missing */
run;
Try:
if (A1=B1) AND (A2=B2) AND (ABS(SUM(A3,-B3)) LE 0.05 OR ABS(SUM(A4,-B4)) LE 0.05)
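Before relying on this, it may help to run the SUM version on a few edge cases. The sketch below uses made-up rows; the names sum_check, diff_sum and match are hypothetical. It is meant to show where the approach works and where it can still let a row through.

```sas
/* Sketch only: hypothetical test rows */
data sum_check;
  input A3 B3;
  /* SUM ignores missing arguments, but returns missing if ALL are missing */
  diff_sum = sum(A3, -B3);
  match    = (abs(diff_sum) le 0.05);
  datalines;
1.00 1.03
. 5
. 0.02
. .
;
run;

proc print data=sum_check;
run;
```

Here match should be 1, 0, 1, 1. The second row is rejected as intended, but the third row still matches because the one non-missing value is itself within 0.05 of zero, and the fourth row matches because SUM returns missing when every argument is missing.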
Hi Tmiles,
Thanks a lot for the reply.
I am worried about the situation where both values are missing. In this case the SUM function returns a missing value, which again compares as less than 0.05.
I modified the condition to the following, adding zero to each value to handle the missing values, but if both A4 and B4 turn out to be missing, the difference will be zero and will again be less than 0.05.
if (A1=B1) AND (A2=B2) AND ((abs(sum(A3,0) - sum(B3,0)) le 0.05) or (abs(sum(A4,0) - sum(B4,0)) le 0.05))
Regards,
sheeba
You could always check for missing values prior to the subsetting IF and set them to a default value. This will only help if just one side of the equation is missing.
Is it safe to assume that if both sides of the equation are missing you want to handle the condition differently? If so, perhaps IF-THEN/ELSE logic would get you through it.
Something like:
if sum(a3,b3,a4,b4) > 0 then do;
  if (A1=B1) AND (A2=B2) AND (ABS(SUM(A3,-B3)) LE 0.05 OR ABS(SUM(A4,-B4)) LE 0.05) then ??;
end;
else do;
  ???
end;
Hi Tmiles,
Thanks for the quick reply.
Yes, I wouldn't want the match if both are missing values. Also, I am populating these conditions dynamically.
I will try this out.
Regards,
Sheeba
Hi Ballardw,
Thanks a lot for the reply.
Right now the situation of getting missing values in both columns doesn't exist, but I would like to modify the code to handle such situations as well. Thanks a lot for the code.
I will also consider subsetting the data to filter out these conditions.
Thanks again,
Regards,
Sheeba
If you are concerned that when you code a condition like
(A <= 0.5)
missing values of A cause the condition to be true, then just change your condition to account for missing values.
(.Z < A <= 0.5)
Or
(A <= 0.5 and not missing(A))
In your specific example you could just remove the ABS() function and code the positive and negative ranges. A missing difference sorts below every number, so the lower bound excludes it.
-0.05 <= (A3-B3) <= 0.05
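As a quick check of both suggestions, here is a small sketch on made-up rows. The names range_check, diff, match_z and match_range are hypothetical, and the 0.05 tolerance is taken from the original question.

```sas
/* Sketch only: hypothetical test rows */
data range_check;
  input A3 B3;
  diff = A3 - B3;
  /* .Z is the largest special missing value, so any missing fails ".Z <" */
  match_z     = (.Z < abs(diff) <= 0.05);
  /* Two-sided range: a missing diff fails the lower bound */
  match_range = (-0.05 <= diff <= 0.05);
  datalines;
1.00 1.03
1.00 .
. .
1.00 2.00
;
run;

proc print data=range_check;
run;
```

Both match_z and match_range should be 1, 0, 0, 0 here, so a pair is only accepted when both values are present and within the tolerance.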
Hi Tom,
Thanks a lot for the suggestions. This is really helpful.
Regards,
sheeba .