BookmarkSubscribeRSS Feed
Rou
Obsidian | Level 7 Rou
Obsidian | Level 7

How do I set multiple variables in my dataset to missing instead of using the IF statement?

 

 

7 REPLIES 7
ballardw
Super User

Call missing ( var1, var2, othervar, ...);

as many variables as you want. One of the few functions that works with all variable types.

 

To execute conditionally:

 

If <some condition> then call missing(thisvar, thatvar, othervar);

 

I am not sure why you mention "IF". That is only required if you want to set a value conditionally. Admittedly you likely don't want to set a variable missing for every observation but if so "If" is not needed.

 

Sometimes (as in almost always) it does help to provide an example of the data you have and what you expect the result to look like.

Rou
Obsidian | Level 7 Rou
Obsidian | Level 7

The way I asked my question was wrong (sorry). I meant to ask how to remove multiple missing value at once: 

This what I have been doing but wondering if there is a better one to do it for the whole dataset. 

 

data newdata;

         set old data;

if promis_pa_scale1 = 99 then promis_pa_scale1= .;
if promis_pa_scale3 = 99 then promis_pa_scale3= .;
if promis_pa_scale4 = 99 then promis_pa_scale4= .;
if promis_pa_scale5= 99 then promis_pa_scale5= .;

run;

andreas_lds
Jade | Level 19

@Rou wrote:

The way I asked my question was wrong (sorry). I meant to ask how to remove multiple missing value at once: 

This what I have been doing but wondering if there is a better one to do it for the whole dataset. 

 

data newdata;

         set old data;

if promis_pa_scale1 = 99 then promis_pa_scale1= .;
if promis_pa_scale3 = 99 then promis_pa_scale3= .;
if promis_pa_scale4 = 99 then promis_pa_scale4= .;
if promis_pa_scale5= 99 then promis_pa_scale5= .;

run;


Description and code don't match.You are setting the variables to missing, if the value is 99. This is not removing multiple missing values.

ballardw
Super User

@Rou wrote:

The way I asked my question was wrong (sorry). I meant to ask how to remove multiple missing value at once: 

This what I have been doing but wondering if there is a better one to do it for the whole dataset. 

 

data newdata;

         set old data;

if promis_pa_scale1 = 99 then promis_pa_scale1= .;
if promis_pa_scale3 = 99 then promis_pa_scale3= .;
if promis_pa_scale4 = 99 then promis_pa_scale4= .;
if promis_pa_scale5= 99 then promis_pa_scale5= .;

run;


Still describing the problem incorrectly. You want to replace specific values with missing for multiple variables.

 

If the value is always 99 then an array solution is likely the simplest way to do the same thing to many variables. Place the names of the variables where you want to replace 99 with missing on the ARRAY statement where I have your variables after the Fix(*). You can lists of various types. This will only work for variables of the same type. So you cannot place variables that are character with numeric variables in an array.

 

For the truly lazy IF appropriate for your data one of the special lists is the keyword _numeric_ and another is _character_. These would create an array of all the numeric (or character) variables for processing. Which means all values of 99 for every numeric variable would be set to missing. So if you have variables where you do not want 99 replaced don't use that. If you have multiple different groups of variables, such as some where 99 is valid but 999 is not you could create separate arrays for each to process. Just name the array differently for each. Do not use the name of an existing variable for an Array name.

 

data newdata;
   set old data;
   array fix(*) promis_pa_scale1 promis_pa_scale3 promis_pa_scale4 promis_pa_scale5 ;
   do i=1 to dim(fix);
      if fix[i]=99 then fix[i]=.;
   end;
   drop i;
run;

Optionally you could replace the simple . with a special missing which are dot and a letter or the underscore character. Example .R (dot followed by R which I do for code values that mean "Refused to answer". The special missing are excluded from any numeric calculation and analysis that does not specifically include missing values. This may be desired if the variables could be missing for other reasons such as skip patterns in a survey, incomplete response, data entry error. The special missing does allow you to answer questions such as "How many refusals to question 27 were there?" in the presence of the other possible missing values.

 

Another somewhat more sophisticated approach is if data is read by a data step you can create custom informats that will read the data that way directly. Proc format to create a format that will read text of 99 (exactly) as missing (or special missing). The name of the informat is Drop99_  to be somewhat descriptive but you cannot name a format or informat with something that ends in a numeral so I end with the _ character.

Rough example:

Proc format;
invalue drop99_ 
99=.
other=[16.]
;
run;

data example;
   input var1 :drop99_. var2 :drop99_.;
datalines;
1 2
34.5678 99
99 99
99.4567  4
;
Kurt_Bremser
Super User

See Maxim 19.

I have a strong suspicion that you simply suffer from a bad dataset structure.

data olddata;
N = _N_;
input promis_pa_scale1 promis_pa_scale2 promis_pa_scale3 promis_pa_scale4 promis_pa_scale5;
datalines;
1 2 3 4 5
99 6 7 99 0
;

proc transpose data=olddata out=long (rename=(col1=promis_pa_scale));
by N;
var promis_pa_scale:;
run;

data want;
set long;
number = input(substr(_name_,16),10.);
if promis_pa_scale = 99 then promis_pa_scale = .;
drop _name_;
run;

N is a stand-in for whatever variable(s) you have that uniquely identify each row of the original dataset.

Reeza
Super User

CALL MISSING with a variable list if required.

 

call missing(age, sex, weight);

call missing( of var1-var8);

call missing( of var_list(*)); *array list;

 

Here is a reference that illustrates how to refer to variables and datasets in a short cut list:
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html

 

 

Manvel
Obsidian | Level 7

You can use Call Missing(var1, var2 ...)

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 4294 views
  • 3 likes
  • 6 in conversation