Solved: array to count and set missing values in one step vs. multiple steps

awesome_opossum · Posted 05-23-2023 12:36 PM

I am trying to count the number of missing values on a set of vars. Then, if there are two or more vars with missing values, I want to set all the vars to have missing values. The following works:

data want (drop = i j k); set have; 
ct_pre = 0; 
ct_post = 0; 
array vars (7) var1-var7; 
do i = 1 to 7; 
	if vars{i} = . then ct_pre +1; 
end; 
do j = 1 to 7; 
	if ct_pre >= 2 then vars{j} = .; 
end; 
do k = 1 to 7; 
	if vars{k} = . then ct_post +1; 
end; 
run;

Thus I end up with ct_post having only values 0, 1, or 7, as desired.

However, I can't wrap my head around why this other, seemingly simpler approach doesn't work. Some variables get set to missing in some cases, but not in others. Thus ct_post, like ct_pre, ends up with values 0-7, although the value changes for a handful of cases. There does not appear to be a method to the madness, as in I don't see any patterns with specific vars. I assume it has something to do with operating within only a single do/end block? I feel like I'm not understanding something fundamental, so if anyone can explain, I would appreciate it! Thanks!

data want (drop = i); set have; 
ct_pre = 0; 
ct_post = 0; 
array vars (7) var1-var7; 
do i = 1 to 7; 
	if vars{i} = . then ct_pre +1; 
	if ct_pre >= 2 then vars{i} = .; 
	if vars{i} = . then ct_post +1; 
end; 
run;

Tom · Posted 05-23-2023 12:51 PM

Because the second IF statement in your one-loop version

if ct_pre >= 2 then vars{i} = .;

is testing the value of CT_PRE at a different point in its calculation.

In the first step you wait until you have counted ALL of the elements in the array before testing CT_PRE. In the second one you are testing CT_PRE inside the same loop, before it has finished being calculated, so only the elements at or after the second missing value get set to missing.

So get the count before the DO loop.

data want (drop = i); 
  set have; 
  array vars {7} var1-var7; 
  ct_pre = nmiss( of vars{*});
  if ct_pre >= 2 then do i = 1 to 7; 
    vars{i} = .; 
  end;
  ct_post = nmiss( of vars{*});
run;
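
To see the timing issue concretely, here is a minimal sketch that traces the one-loop version on a single made-up row. The data set name trace and the test values are assumptions for illustration, not from the original data. The PUT statements write the running counts to the log at each iteration.

```sas
/* Hypothetical one-row test data: var2 and var4 are missing */
data have;
  input var1-var7;
  datalines;
1 . 3 . 5 6 7
;

data trace (drop = i);
  set have;
  array vars {7} var1-var7;
  ct_pre = 0;
  ct_post = 0;
  do i = 1 to 7;
    if vars{i} = . then ct_pre + 1;
    /* ct_pre only reflects elements 1 through i at this point */
    if ct_pre >= 2 then vars{i} = .;
    if vars{i} = . then ct_post + 1;
    put i= ct_pre= ct_post=;   /* running counts after each element */
  end;
  put (var1-var7) (=);         /* final values of the seven variables */
run;
```

With this row, CT_PRE does not reach 2 until i = 4, so var1 and var3 keep their values while var4 through var7 end up missing, and the step finishes with ct_pre = 2 and ct_post = 5. Which variables survive depends on where the second missing value falls in each row, which is why no pattern shows up for specific vars.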

PaigeMiller · Posted 05-23-2023 12:55 PM

@awesome_opossum wrote:

I am trying to count the number of missing values on a set of vars. Then, if there are two or more vars with missing values, I want to set all the vars to have missing values.

How about this:

data want;
    set have;
    n_missing=nmiss(of var1-var7);
    if n_missing>=2 then call missing(of var1-var7);
run;

--
Paige Miller

Reeza · Posted 05-23-2023 12:57 PM

This can be simplified as follows:

data want; 

set have; 
array vars (7) var1-var7; 
*no loop counters to drop and no zero initialization needed, NMISS assigns the counts directly;

*number of missing values in the array;
ct_pre = nmiss(of vars(*));

*set all to missing if 2 or more are missing;
if ct_pre >= 2 then call missing(of vars(*));

*number of missing values after setting values to missing;
ct_post = nmiss(of vars(*));

run;
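
If you want to confirm that ct_post only takes the values 0, 1, or 7, a quick check along these lines may help. This is a sketch on made-up test rows, not your real data, and the cross-tabulation is just one way to inspect the result.

```sas
/* Hypothetical test rows with 0, 1, 2, and 7 missing values */
data have;
  input var1-var7;
  datalines;
1 2 3 4 5 6 7
1 . 3 4 5 6 7
1 . 3 . 5 6 7
. . . . . . .
;

data want;
  set have;
  ct_pre = nmiss(of var1-var7);
  /* blank out the whole set when two or more values are missing */
  if ct_pre >= 2 then call missing(of var1-var7);
  ct_post = nmiss(of var1-var7);
run;

/* one line per combination of before and after counts */
proc freq data = want;
  tables ct_pre * ct_post / list missing;
run;
```

For these four rows the combinations should be 0 and 0, 1 and 1, 2 and 7, and 7 and 7. Any row with ct_pre of 2 or more should show ct_post of 7.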
