Hi,
i want a way to check to see if any variables in my named range are negative. is there a more efficient way of accomplishing this ?
data have;
input (x1-x3) (2.);
cards;
0 1 2
1 -2 3
3 4 5
-1 0 2
2 2 2
3 3 -3
;
run;
data want;
set have;
if x1<0 or
x2<0 or
x3<0;
run;
actually if using variable list, you can just do this:
data want;
set have;
if min(of x1-x3)<0 then output;
run;
Haikuo
Use array() to save some typing if you have many vars:
data have;
input x1-x3;
cards;
0 1 2
1 -2 3
3 4 5
-1 0 2
2 2 2
3 3 -3
;
run;
data want;
set have;
array x x1-x3;
do over x;
if x<0 then output;
end;
run;
proc print;run;
Regards,
Haikuo
actually if using variable list, you can just do this:
data want;
set have;
if min(of x1-x3)<0 then output;
run;
Haikuo
Hi,
Try this..
data have2(drop=i);
input (x1-x3) (2.);
array xx(3) x1-x3;
do i = 1 to dim(xx);
if xx(i)<0 then output;
end;
cards;
0 1 2
1 -2 3
3 4 5
-1 0 2
2 2 2
3 3 -3
;
run;
Thanks,
Shiva
If you're outputting the whole row then check if the min is less than 0.
However this does require you to list your variables whereas in the array method you don't.
The where will process faster though.
data want;
set have;
where min(x1, x2, x3) <0;
run;
It is likely to run faster if you switch from an IF to a WHERE statement. (Whether you can notice the difference depends on characteristics of your data such as how many observations you are selecting, and how wide the observations are.)
If you know which variable(s) are most likely to be negative, you could switch the order around. For example if X3 is usually the culprit, mention it first:
where x3 < 0 or x1 < 0 or x2 < 0;
Finally, notice that this code also selects missing values, not just negative numbers.
All of this is dependent on what you mean by "more efficient". I've interpreted it to mean you want something that would run faster, but that's not the only possible definition.
Good luck.
Astounding , as he/she always, just raised valid point on dealing with missing value. So just add: if .<min(of x1-x3)<0 then;
However, from my own experience, it is NOT always that 'where' is faster then 'if'. If you are dealing with a big data set with more than half of the obs meeting condition, 'if' could be significant faster than 'where'. I have noticed this in numerous counts in my work. That being said, direct access is not always faster than 'sequential access'.
Just my 2 cents,
Haikuo
You can't use the OF in a WHERE statement, but you can in the IF statement.
Didn't know that! Thanks! Right now, I mostly use 'where' in procs and index, where you can't use 'if' or 'if' does not utilize index.
Regards,
Haikuo
hi ... since MIN ignores missing values (unless all are missing), I don't think that you have to do this ... if .<min(of x1-x3)<0
to cover situations when all are missing, you could try just adding a non-negative value to the function arguments ...
data have;
input x1-x3 @@;
cards;
. 1 2 1 -2 3 3 4 . -1 0 2 2 2 2 3 3 -3 . . .
;
data negative;
set have;
if min(of x:, 0) lt 0;
run;
x1 x2 x3
1 -2 3
-1 0 2
3 3 -3
Thanks everybody for your input!
@Astounding, I should have said by efficient i mean not having to explicitly type out all vars between var1 & varX. in reality my data set has a named range with 100 fields.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.