Hi,
I am trying to select only observations that fall on the tenth, as opposed to hundredth decimal place (e.g. 1.10, 1.20, 1.30 vs. 1.11, 1.12, 1.13). The variable goes from one to five.
I tried:
data want;
set have;
do i=1 to 5 by .1;
where var=i;
end;
run;
I receive an error that i is not on file in my data set. Does anyone know why/a different approach?
Thanks in advance,
Emily
You could do a similar approach by using 'if' instead of 'where' plus some to achieve your goal, but it is neither efficient (in your example you need to check 40 numbers for one) nor robust (how do you know 1 to 5 will cover it?). Here is an alternative:
data want;
set have;
if mod(var,0.1)=0;
run;
You can't use 'where' in this case, as 'where' process data before loading into the PDV, aka, compile stage. in your case, you need to process data during run-time stage.
data test;
do i=1 to 5 by 0.1;
output;
end;
run;
Can you try this? I hope it helps.
Hi,
I did try this, but it only gives me one column, i, from 1 to 5 by .1. I need the new data set to retain variables from my previous data set, which is why I used a set statement in the original code.
Does that make sense? Thanks for your help.
You could do a similar approach by using 'if' instead of 'where' plus some to achieve your goal, but it is neither efficient (in your example you need to check 40 numbers for one) nor robust (how do you know 1 to 5 will cover it?). Here is an alternative:
data want;
set have;
if mod(var,0.1)=0;
run;
You can't use 'where' in this case, as 'where' process data before loading into the PDV, aka, compile stage. in your case, you need to process data during run-time stage.
@Haikuo has given you the right solution. I just wanted to clarify some of the rules.
WHERE requires that the variables you refer to are part of the incoming data set. In your original program, VAR exists, but I does not (not as part of the incoming data set, anyway). So a WHERE statement can refer to VAR but not to I. That's why your program generates an error. In @Haikuo's solution, since VAR already exists, you should be able to change IF to WHERE.
Thanks for clarification, Robert. Looking back to my post, one can easily be confused, especially when I used 'if'. I am not editing it since we already have your post.
Thanks! I'm not familiar with the mod function, but that worked.
Hi Emily,
You were actually lucky that SAS didn't accept your code due to the inappropriate WHERE statement. If you had used syntactically correct code such as
if var=i then output;
your selection would most likely have been incorrect: The risk of not selecting numbers like 1.2 or 2.5 (which meet your selection criterion) would have been very high because of numeric representation issues. A 0.1 increment in a DO loop is one of the standard examples of how numeric representation errors can accumulate to rounding errors which invalidate equations like var=i (see documentation).
@Haikuo's solution is much better in this regard too. To quote the documentation: "The MOD function performs extra computations, called fuzzing, to return an exact zero when the result would otherwise differ from zero because of numerical error."
Depending on how you obtained the values of var, it could still make sense to use (slightly) rounded values in selection criteria, e.g.
round(var, 1e-9)
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.