## Exclude observations w/ missing values

Solved
Regular Contributor
Posts: 202

# Exclude observations w/ missing values

What's the most efficient way to exclude observations w/ missing values for multiple variables in a dataset?

I want to do something like this in my DATA step, but I get errors:

```		WHERE NOT MISSING (SEQN);
WHERE SAME AND NOT MISSING (DUQ:);```

There are 11 variables w/ a prefix of DUQ. I only want to include the observations that do NOT have missing values.

Thanks.

Accepted Solutions
Solution
‎07-19-2017 06:51 PM
Super User
Posts: 6,764

## Re: Exclude observations w/ missing values

Yes, NM is a new variable that counts the number of missing values for all those variables combined:  all the DUQ: variables, as well as SEQN.

Your later use of the WHERE statement is correct.  To get only observations with no missing values in that key set of variables:

where NM=0;

All Replies
Regular Contributor
Posts: 249

## Re: Exclude observations w/ missing values

try nmiss for multiple variables with missing values
MISSING! - Understanding and Making the Most of Missing Data

Regular Contributor
Posts: 202

## Re: Exclude observations w/ missing values

Thanks but I think the nmiss function "returns the number of missing values within a list of variables defined." Am I mistaken?

I don't want to count them, I want to exclude them. I want the final dataset to include only those observations with complete data...

Super User
Posts: 6,764

## Re: Exclude observations w/ missing values

It must be documented somewhere, but I'm finding in limited testing that variable lists cannot be used in a WHERE statement.

You can easily run this in a DATA step:

nm = nmiss(of duq: , seqn);

But you can't use:

where nmiss(of duq: , seqn) = 0;

For what it's worth, you could add the variable NM to your data set (using the formula above), and later use:

where nm = 0;

Or you could spell out all the names:

where nmiss(duq_01, duquess, duq_abc, seqn) = 0;

Regular Contributor
Posts: 202

## Re: Exclude observations w/ missing values

Posted in reply to Astounding

I like this approach, b/c it will allow me to run PROCs on datasets w/ complete information as well as on datasets w/o complete information. But I don't totally understand it.

<nm = nmiss(of duq: , seqn);>

Is this supposed to create a new variable - nm - that has a value of 0 when the value of duq: or seqn is NOT missing? All the values of nm are 0 when I run this...

Data want;

Set have;

where nm = 0 (i.e. Not missing);

Run;

Solution
‎07-19-2017 06:51 PM
Super User
Posts: 6,764

## Re: Exclude observations w/ missing values

Yes, NM is a new variable that counts the number of missing values for all those variables combined:  all the DUQ: variables, as well as SEQN.

Your later use of the WHERE statement is correct.  To get only observations with no missing values in that key set of variables:

where NM=0;

Super User
Posts: 8,092

## Re: Exclude observations w/ missing values

You cannot use variable lists in WHERE statements. But you can use them in IF statement or a subsetting IF statement.

``````data want ;
set have ;
if nmiss(of SEQN DUQ:) then delete;
...``````

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
• 6 replies
• 1037 views
• 4 likes
• 4 in conversation