Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- SAS Procedures
- /
- How can I better Delete Many Rows that meet many conditions

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 04-19-2019 02:52 PM
(1920 views)

I am working with a dataset that has 110,000 observations and 170 variables. I am trying to delete all the observations that are 0 in all of around 30 variables. They must be 0 in all of the 30 variables. This is what I have done so far:

DATA new_set;

SET old_set (WHERE= (var1<>0 AND var2<>0 AND var3<>0 AND var4<>0 AND....var30<>0));

RUN;

The variables are not in actually numbered 1-30, are not all in a row, and are sparsely scattered about the other 170 in the dataset.

Is there an easier or more efficient way to do this?

I am using SAS 9.4 (Unicode)

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hello @gorba004 and welcome to the SAS Support Communities!

First of all, please note that your suggested WHERE condition is inconsistent with your requirement: You would need to replace "AND" with "OR" to correct this. (The "<>" works like "NE" *in a WHERE condition*, but I still wouldn't use it for the reason @ballardw has explained.)

As has been suggested, you can use functions with an OF *variable-list* argument to simplify the condition, where *variable-list* may be a list of arbitrary variable names (it's not limited to numbered range lists etc.). If you need more than one function, you can use an array to avoid duplicating the long list.

Example:

```
data new_set;
array x[*] list your variables here;
set old_set;
if not(nmiss(of x[*])=sumabs(of x[*])=0);
run;
```

14 REPLIES 14

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

```
data have;
input var1-var3;
cards;
0 0 0
0 1 0
0 1 2
;
data want;
set have;
k=compress(cats(of var1-var3),'0');
if missing(k) then delete;
drop k;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

The 30 variables appear to be numeric so this should work:

`if sum(of var1-var30) = 0 then delete;`

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@SASKiwi wrote:

The 30 variables appear to be numeric so this should work:

`if sum(of var1-var30) = 0 then delete;`

I would hesitate to use that alone. Since the OP is attempting to use <> as not equal there may actually be negative values. So if there is a -1 and a +1 the sum would be 0 . Sum=0 and Range=0 might be sufficient though.

You do not want to use <> if looking to do a "not equal", use one of: NE ^=

Examine this code and the result:

data example; x= 5; y1 = x <> 2; y2 = x <> 7; z = x ne 2; run;

The <> returns the larger of the two values. The x ne 2 returns a 1 for true.

And while SAS will treat any number not 0 or missing as "True" you may get very unexpected results from comparisons if you use <> instead of NE if you intend "not equal" for the comparison.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@ballardw - good point. If data is all greater than or equal to zero it should be OK.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@SASKiwi wrote:

@ballardw - good point. If data is all greater than or equal to zero it should be OK.

Care to guess how much time I spent diagnosing a very similar issue years ago where a very small number of records had that offsetting sum behavior? Though in the case I had it was a combination of 4 variables where one negative value just happened to match the total of 3 positive...

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

**jetlagged"** on way to dallas? 🙂

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@novinosrin - quite likely. I've got a 3 hour followed by a 15.5 hour flight to contend with....

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Have a safe trip and great fun sir. Pass on my regards to our coomunity folks in SGF. Cheers!

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hello @gorba004 and welcome to the SAS Support Communities!

First of all, please note that your suggested WHERE condition is inconsistent with your requirement: You would need to replace "AND" with "OR" to correct this. (The "<>" works like "NE" *in a WHERE condition*, but I still wouldn't use it for the reason @ballardw has explained.)

As has been suggested, you can use functions with an OF *variable-list* argument to simplify the condition, where *variable-list* may be a list of arbitrary variable names (it's not limited to numbered range lists etc.). If you need more than one function, you can use an array to avoid duplicating the long list.

Example:

```
data new_set;
array x[*] list your variables here;
set old_set;
if not(nmiss(of x[*])=sumabs(of x[*])=0);
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

`if not(nmiss(of x[*])=sumabs(of x[*])=0);`

Super slick

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thanks so much, this worked like a charm

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Using array makes life easier:

```
data want;
set have;
array varx var1-var30;
do i=1 to dim(varx);
if varx(i) <> 0 then do; output; leave; end;
end;
drop i ;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

You want to delete a row of variables each having a value of zero. Don't you want to delete Columns having zeros?

For instance,

data have; input var1-var3; cards; 0 0 0 0 1 0 0 1 2 ; run;

If you delete the first row, you will have the entire column (VAR1) with zeros. Do you want VAR1 to be kept?

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

```
data have;
input var1-var3;
cards;
0 0 0
0 1 0
0 1 2
;
data want;
set have;
array x{*} var1-var3;
if min(of x{*})=0 and max(of x{*})=0 then delete;
run;
```

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

**If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. **

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.