BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ANKH1
Pyrite | Level 9

Hi, 

I want to delete rows that have missing values for a list of consecutive variables. For example, from a the a dataset like:

id   cat var1 var2 var3 var4 stop

42  a 1 0 0 1 Y

43  b . . . .  Y

44  a 1 . 0 1 N

45 b  0 . . 1 Y

 

I want to delete row 43 which is missing data from var1 to var 4.

 

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Quentin
Super User

My apologies. It looks like you can't use a -- in a WHERE statement.  Which I guess makes sense, because it relies on the order of variables in the PDV.

 

It should work if you change to a subsetting IF, e.g.:

 

12   data want;
13     set have ;
14     if cmiss(OF var1 -- zvariable) < 5 ;  *IF not where;
15   run;

NOTE: There were 1 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 1 observations and 3 variables.
BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

View solution in original post

14 REPLIES 14
PeterClemmensen
Tourmaline | Level 20

Does it have to be missing values for all variables from var1 to var4?

ANKH1
Pyrite | Level 9

Yes, missing all values from var 1 to 4.

PeterClemmensen
Tourmaline | Level 20

Try this

 

data have;
input id cat $ var1 var2 var3 var4 stop $;
datalines;
42 a 1 0 0 1 Y
43 b . . . . Y
44 a 1 . 0 1 N
45 b 0 . . 1 Y
;

data want;
   set have;
   where n(var1-var4);
run;
ANKH1
Pyrite | Level 9

Thank you. It didn't work. I forgot the add that the variable names can have different names. For example the names are var1 xvar2 yname3 zvariable. These are all numeric variables.

Quentin
Super User

What defines the range of variables?  

 

You can list them:

data want;
   set have;
   where n(var1, xvar2, yname3, zvariable) ;
run;

Or if you know the four variables are in order in the PDV you could use:

data want;
   set have;
   where n(of var1 -- zvariable) ;
run;

If you want to do it based on variable position (which is odd) I would run PROC CONTENTS to figure out the names of the variables in positions 1-4, then put that into a macro variable.

 

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
ANKH1
Pyrite | Level 9

Thanks! I had to list all variables like the first option you gave. And when I ran it, I realized that in the middle of the range there is a character variable. Say yname3 is a character and not numeric. How will you handle that?

Quentin
Super User

Do you want to include the character variable or exclude it?

 

If you want to exclude it, you can do:

where n(of var1 - numeric - zvariable) ;

That will only include the numeric variables in the range.

 

If you want to include the character variable, you can switch from the N function to the CMISS function, which can handle both numeric and character variables:

where cmiss(of var1 -- zvariable) < 5  ; *assuming there are 5 variables in this range;

 

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
ANKH1
Pyrite | Level 9

This is very helpful. I would want to include character variables as well. But "where" with cmiss() is not working for character variables. 

Quentin
Super User

What do you mean "not working"? Do you get an error, or the wrong result?

 

Please try making a little example like what @PeterClemmensen posted where there is a data step for making a HAVE dataset with a few records, and then a WANT step using CMISS.  If you get errors in the log, please post the log as well.  CMISS should work with both numeric and character variables. The documentation for CMISS is: https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lefunctionsref/p1tth4ltf640din1ey86ubo2lky2.h...

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
ANKH1
Pyrite | Level 9

This is the code: 

data want;
   set have;
   where cmiss(var1 -- zvariable) <5;
run;

Where zvaribale is a character variable. I get this error in the log:

 

ERROR: WHERE  clause operator requires numeric values.

Quentin
Super User

When you use the --  or - list as an argument to a function, you need to use the word OF at the beginning. Try:

 

data want;
   set have;
   where cmiss(OF var1 -- zvariable) < 5 ;
run;

If you don't write OF, SAS will do subtraction:

var1 - (-zvariable) 

The error you're getting is because SAS is trying to do subtraction with a character variable.

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
ANKH1
Pyrite | Level 9

Thanks for your reply. I added the word OF as advised and I am getting this error in the log:

 

Capture (1).PNG

Quentin
Super User

My apologies. It looks like you can't use a -- in a WHERE statement.  Which I guess makes sense, because it relies on the order of variables in the PDV.

 

It should work if you change to a subsetting IF, e.g.:

 

12   data want;
13     set have ;
14     if cmiss(OF var1 -- zvariable) < 5 ;  *IF not where;
15   run;

NOTE: There were 1 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 1 observations and 3 variables.
BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
ANKH1
Pyrite | Level 9

It worked! Thank you so much! This is extremely helpful.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 1573 views
  • 1 like
  • 3 in conversation