Solved: Re: selecting observations that fall on a certain decimal place

Caetreviop543 · Posted 08-14-2018 07:19 PM

Hi,

I am trying to select only observations that fall on the tenth, as opposed to hundredth decimal place (e.g. 1.10, 1.20, 1.30 vs. 1.11, 1.12, 1.13). The variable goes from one to five.

I tried:

data want;

set have;

do i=1 to 5 by .1;

where var=i;

end;

run;

I receive an error that i is not on file in my data set. Does anyone know why/a different approach?

Thanks in advance,

Emily

Haikuo · Posted 08-14-2018 10:30 PM

You could do a similar approach by using 'if' instead of 'where' plus some to achieve your goal, but it is neither efficient (in your example you need to check 40 numbers for one) nor robust (how do you know 1 to 5 will cover it?). Here is an alternative:

data want;
set have;
if mod(var,0.1)=0;
run;

You can't use 'where' in this case, as 'where' process data before loading into the PDV, aka, compile stage. in your case, you need to process data during run-time stage.

View solution in original post

shahparth260 · Posted 08-14-2018 09:28 PM

data test;
do i=1 to 5 by 0.1;
output;
end;
run;

Can you try this? I hope it helps.

PS

Caetreviop543 · Posted 08-15-2018 04:13 PM

Hi,

I did try this, but it only gives me one column, i, from 1 to 5 by .1. I need the new data set to retain variables from my previous data set, which is why I used a set statement in the original code.

Does that make sense? Thanks for your help.

Haikuo · Posted 08-14-2018 10:30 PM

You could do a similar approach by using 'if' instead of 'where' plus some to achieve your goal, but it is neither efficient (in your example you need to check 40 numbers for one) nor robust (how do you know 1 to 5 will cover it?). Here is an alternative:

data want;
set have;
if mod(var,0.1)=0;
run;

You can't use 'where' in this case, as 'where' process data before loading into the PDV, aka, compile stage. in your case, you need to process data during run-time stage.

Astounding · Posted 08-15-2018 08:46 AM

@Haikuo has given you the right solution. I just wanted to clarify some of the rules.

WHERE requires that the variables you refer to are part of the incoming data set. In your original program, VAR exists, but I does not (not as part of the incoming data set, anyway). So a WHERE statement can refer to VAR but not to I. That's why your program generates an error. In @Haikuo's solution, since VAR already exists, you should be able to change IF to WHERE.

Haikuo · Posted 08-15-2018 09:18 AM

Thanks for clarification, Robert. Looking back to my post, one can easily be confused, especially when I used 'if'. I am not editing it since we already have your post.

Caetreviop543 · Posted 08-15-2018 04:20 PM

Thanks! I'm not familiar with the mod function, but that worked.

FreelanceReinh · Posted 08-15-2018 12:46 PM

Hi Emily,

You were actually lucky that SAS didn't accept your code due to the inappropriate WHERE statement. If you had used syntactically correct code such as

if var=i then output;

your selection would most likely have been incorrect: The risk of not selecting numbers like 1.2 or 2.5 (which meet your selection criterion) would have been very high because of numeric representation issues. A 0.1 increment in a DO loop is one of the standard examples of how numeric representation errors can accumulate to rounding errors which invalidate equations like var=i (see documentation).

@Haikuo's solution is much better in this regard too. To quote the documentation: "The MOD function performs extra computations, called fuzzing, to return an exact zero when the result would otherwise differ from zero because of numerical error."

Depending on how you obtained the values of var, it could still make sense to use (slightly) rounded values in selection criteria, e.g.

round(var, 1e-9)

Registration is open

SAS Training: Just a Click Away