Treat Missing values as largest possible values

12-23-2016 11:08 AM

Hi,

I am trying to join two datasets, A and B. While joining them I am applying the following condition:

if (A1=B1) AND (A2=B2) AND (ABS(A3-B3) LE 0.05 OR ABS(A4-B4) LE 0.05)

In the above condition, if any of A3, B3, A4, or B4 is missing, the corresponding difference evaluates to missing (.), which always compares as less than 0.05, so that part of the condition is always true.

Is there any way to treat missing values as the largest possible numbers?

Thanks in advance,

Sheeba Swaminathan

Accepted Solutions

Solution

12-23-2016
01:20 PM

Posted in reply to Sheeba

12-23-2016 12:10 PM

Do A3 and B3 or A4 and B4 ever both have missing values? If so you may have to add some additional levels of comparison such as

( ( ABS(A3-B3) LE 0.05) and not ( missing(A3) and Missing(B3) ) )

but more details, such as actual values and the desired results, may be needed. Sometimes it may be better to subset the data into easier chunks and then recombine.
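Note that the guard above only rejects the case where both values are missing; when just one is missing, `ABS(A3-B3)` is still missing and still compares as less than 0.05. A minimal sketch of a stricter variant follows, assuming a hypothetical test dataset `check` and flag variable `close3` (neither is from the thread). `NMISS` counts missing arguments, so `nmiss(A3, B3) = 0` holds only when both values are present.

```sas
data check;
  input A3 B3;
  /* 1 only when both values are non-missing and within the tolerance */
  close3 = (nmiss(A3, B3) = 0 and abs(A3 - B3) le 0.05);
  datalines;
1.00 1.03
1.00 .
. 1.00
. .
1.00 2.00
;
run;

proc print data=check;
run;
```

Only the first row should be flagged with `close3 = 1`; the rows with one or both values missing, and the row that is out of tolerance, should get 0.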

All Replies

Posted in reply to Sheeba

12-23-2016 11:56 AM - edited 12-23-2016 11:57 AM

If you use the SUM function instead of plain arithmetic, missing values are handled differently: SUM ignores missing arguments.

For example:

data temp;

a = .;

b= 5;

c=sum(a,-b); /* c evaluates to -5 */

d = a-b; /* d evaluates to missing */

run;

Try:

if (A1=B1) AND (A2=B2) AND (ABS(SUM(A3,-B3)) LE 0.05 OR ABS(SUM(A4,-B4)) LE 0.05)
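Before relying on this, it may help to check how SUM behaves when one or both values are missing. The quick test below uses illustrative variable names only: SUM ignores a missing argument, but it returns missing when every argument is missing, and a missing result still compares as less than 0.05.

```sas
data _null_;
  a = .;
  b = 5;
  one_missing  = sum(a, -b);             /* -5: the missing argument is ignored */
  both_missing = sum(a, -a);             /* missing: every argument is missing */
  flag = (abs(both_missing) le 0.05);    /* 1: missing sorts below any number */
  put one_missing= both_missing= flag=;
run;
```

With only one value missing, `ABS(SUM(A3,-B3))` reduces to the absolute value of the value that is present, so the missing value is effectively treated as zero rather than as a large number.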

Posted in reply to TMiles

12-23-2016 12:32 PM

Hi Tmiles,

Thanks a lot for the reply.

I am worried about the situation where both values are missing. In that case the SUM function returns missing, which again compares as less than 0.05.

I modified the condition as follows to handle missing values by adding zero to each term, but if both A4 and B4 are missing the difference is zero, which is again less than 0.05.

if (A1=B1) AND (A2=B2) AND ((abs(sum(A3,0) - sum(B3,0)) le 0.05) or (abs(sum(A4,0) - sum(B4,0)) le 0.05))

Regards,

sheeba

Posted in reply to Sheeba

12-23-2016 12:44 PM - edited 12-23-2016 12:45 PM

You could always check for missing values prior to the subsetting IF and set them to a default value. This will only help if only one side of the equation is missing.

Is it safe to assume that if both sides of the equation are missing you want to handle the condition differently? If so, perhaps IF-THEN/ELSE logic would get you through it.

Something like:

if sum(a3,b3,a4,b4) > 0 then do;

if (A1=B1) AND (A2=B2) AND (ABS(SUM(A3,-B3)) LE 0.05 OR ABS(SUM(A4,-B4)) LE 0.05) then ??;

end;

else do;

???

end;

Posted in reply to TMiles

12-23-2016 01:14 PM

Hi Tmiles,

Thanks for the quick reply.

Yes, I wouldn't want a match if both values are missing. Also, I am populating these conditions dynamically.

I will try this out.

Regards,

Sheeba
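Since the conditions are generated dynamically, one option is to wrap a missing-safe comparison in a small macro and emit one call per variable pair. The sketch below is only an assumption about how that might look: the macro name `within_tol`, its `tol=` parameter, and the dataset names `joined` and `matched` are hypothetical and not from the thread.

```sas
/* Hypothetical helper: expands to a Boolean expression that is true only
   when both values are non-missing and differ by at most the tolerance. */
%macro within_tol(x, y, tol=0.05);
  (nmiss(&x, &y) = 0 and abs(&x - &y) le &tol)
%mend within_tol;

data matched;
  set joined;  /* hypothetical dataset holding A1-A4 and B1-B4 on each row */
  /* Subsetting IF: keep rows where the keys agree and at least one pair
     of values is present and within tolerance */
  if (A1 = B1) and (A2 = B2)
     and (%within_tol(A3, B3) or %within_tol(A4, B4));
run;
```

Because the macro body carries its own parentheses and no trailing semicolon, each call can be dropped into a larger AND/OR expression without changing operator precedence.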

Posted in reply to ballardw

12-23-2016 12:37 PM

Hi Ballardw,

Thanks a lot for the reply.

Right now the situation of getting missing values in both columns doesn't exist, but I would like to modify the code to handle such situations as well. Thanks a lot for the code.

Also, I will consider subsetting the data to filter out these conditions.

Thanks again,

Regards,

Sheeba

Posted in reply to Sheeba

12-23-2016 02:43 PM - edited 12-23-2016 02:49 PM

If you are concerned that, when you code a condition like

`(A <= 0.5)`

missing values of A cause the condition to be true, then just change your condition to account for missing values.

`(.Z < A <= 0.5)`

Or

`(A <= 0.5 and not missing(A))`

In your specific example you could just remove the ABS() function and code the positive and negative ranges.

`-0.05 <= (A3-B3) <= 0.05`
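The `.Z` comparison works because SAS orders all missing values below every number, with `._` lowest and `.Z` highest among them. A brief illustration with made-up values (the flag names `simple` and `guarded` are only for this sketch):

```sas
data _null_;
  do A = ._, ., .A, .Z, 0.3, 0.7;
    simple  = (A <= 0.5);       /* true for every missing value and for 0.3 */
    guarded = (.Z < A <= 0.5);  /* true only for 0.3 */
    put A= simple= guarded=;
  end;
run;
```

The same reasoning applies to the range test on the difference: when `A3-B3` is missing, the lower bound `-0.05 <= (A3-B3)` is false, so the pair is not treated as a match.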

12-27-2016 02:20 PM

Hi Tom,

Thanks a lot for the suggestions. This is really helpful.

Regards,

Sheeba