BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
gorba004
Calcite | Level 5

I am working with a dataset that has 110,000 observations and 170 variables. I am trying to delete all the observations that are 0 in all of around 30 variables. They must be 0 in all of the 30 variables. This is what I have done so far: 

DATA new_set;

SET old_set (WHERE= (var1<>0 AND var2<>0 AND var3<>0 AND var4<>0 AND....var30<>0));

RUN; 

The variables are not in actually numbered 1-30, are not all in a row, and are sparsely scattered about the other 170 in the dataset. 

Is there an easier or more efficient way to do this? 

 

I am using SAS 9.4 (Unicode) 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @gorba004 and welcome to the SAS Support Communities!

 

First of all, please note that your suggested WHERE condition is inconsistent with your requirement: You would need to replace "AND" with "OR" to correct this. (The "<>" works like "NE" in a WHERE condition, but I still wouldn't use it for the reason @ballardw has explained.)

 

As has been suggested, you can use functions with an OF variable-list argument to simplify the condition, where variable-list may be a list of arbitrary variable names (it's not limited to numbered range lists etc.). If you need more than one function, you can use an array to avoid duplicating the long list.

 

Example:

data new_set;
array x[*] list your variables here;
set old_set;
if not(nmiss(of x[*])=sumabs(of x[*])=0);
run;

View solution in original post

14 REPLIES 14
novinosrin
Tourmaline | Level 20
data have;
input var1-var3;
cards;
0 0 0
0 1 0
0 1 2
;


data want;
set have;
k=compress(cats(of var1-var3),'0');
if missing(k) then delete;
drop k;
run;
SASKiwi
PROC Star

The 30 variables appear to be numeric so this should work:

if sum(of var1-var30) = 0 then delete;
ballardw
Super User

@SASKiwi wrote:

The 30 variables appear to be numeric so this should work:

if sum(of var1-var30) = 0 then delete;

I would hesitate to use that alone. Since the OP is attempting to use <> as not equal there may actually be negative values. So if there is a -1 and a +1 the sum would be 0 . Sum=0 and Range=0 might be sufficient though.

 

@gorba004 

You do not want to use <> if looking to do a "not equal", use one of: NE  ^=

Examine this code and the result:

data example;
   x= 5;
   y1 = x <> 2;
   y2 = x <> 7;
   z =  x ne 2;
run;

The <> returns the larger of the two values. The x ne 2 returns a 1 for true.

And while SAS will treat any number not 0 or missing as "True" you may get very unexpected results from comparisons if you use <> instead of NE if you intend "not equal" for the comparison.

 

SASKiwi
PROC Star

@ballardw - good point. If data is all greater than or equal to zero it should be OK. 

ballardw
Super User

@SASKiwi wrote:

@ballardw - good point. If data is all greater than or equal to zero it should be OK. 


 

Care to guess how much time I spent diagnosing a very similar issue years ago where a very small number of records had that offsetting sum behavior? Though in the case I had it was a combination of 4 variables where one negative value just happened to match the total of 3 positive...

novinosrin
Tourmaline | Level 20

That's exactly what i was wondering   even a champion like Sir @SASKiwi  who taught me effective boolean expressions can get "jetlagged" on way to dallas? 🙂

 

 

 

 

SASKiwi
PROC Star

@novinosrin - quite likely. I've got a 3 hour followed by a 15.5 hour flight to contend with....Smiley Happy

novinosrin
Tourmaline | Level 20

Have a safe trip and great fun sir.  Pass on my regards to our coomunity folks in SGF. Cheers!

FreelanceReinh
Jade | Level 19

Hello @gorba004 and welcome to the SAS Support Communities!

 

First of all, please note that your suggested WHERE condition is inconsistent with your requirement: You would need to replace "AND" with "OR" to correct this. (The "<>" works like "NE" in a WHERE condition, but I still wouldn't use it for the reason @ballardw has explained.)

 

As has been suggested, you can use functions with an OF variable-list argument to simplify the condition, where variable-list may be a list of arbitrary variable names (it's not limited to numbered range lists etc.). If you need more than one function, you can use an array to avoid duplicating the long list.

 

Example:

data new_set;
array x[*] list your variables here;
set old_set;
if not(nmiss(of x[*])=sumabs(of x[*])=0);
run;
novinosrin
Tourmaline | Level 20
if not(nmiss(of x[*])=sumabs(of x[*])=0);

Super slick 

gorba004
Calcite | Level 5

Thanks so much, this worked like a charm

Shmuel
Garnet | Level 18

Using array makes life easier:

data want;
 set have;
      array varx var1-var30;
      do i=1 to dim(varx);
           if varx(i) <> 0 then do; output; leave; end;
      end;
      drop i ;
run;
KachiM
Rhodochrosite | Level 12

You want to delete a row of variables each having a value of zero. Don't you want to delete Columns having zeros?

For instance,

 

data have;
input var1-var3;
cards;
0 0 0
0 1 0
0 1 2
;
run;

If you delete the first row, you will have the entire column (VAR1) with zeros. Do you want VAR1 to be kept?

 

Ksharp
Super User
data have;
input var1-var3;
cards;
0 0 0
0 1 0
0 1 2
;


data want;
set have;
array x{*} var1-var3;
if min(of x{*})=0 and max(of x{*})=0 then delete;
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 2927 views
  • 4 likes
  • 8 in conversation