## How to assign values to the non-observed, or "legitimate skip"?

The title may sound vague, so here is what the data looks like:

data temp;
input ID $ NumberOfMembers Mem1Var1 Mem1Var2 Mem2Var1 Mem2Var2 Mem3Var1 Mem3Var2;
datalines;
ID01 1 1 0 . . . .
ID02 2 1 . 0 1 . .
ID03 3 0 1 1 0 0 0
;
run;

ID is a group variable, say, families. The variables Mem1Var1 through Mem3Var2 are characteristics measured for the individual members of each group. Missing values are assigned when a group has too few members to fill in these characteristics. E.g., group ID01 consists of only 1 member, so the values for Mem2Var1, Mem2Var2, Mem3Var1 and Mem3Var2 are all missing. Let's call these "mechanical missings".

There are other missing values due to data collection etc., such as Mem1Var2 of ID02. Let's call these "other missings".

Now, how can I assign a specific value, let's say 9, to the "mechanical missings"? The expected data would look like:

ID01     1     1     0     9     9     9     9

ID02     2     1     .     0     1     9     9

ID03     3     0     1     1     0     0     0

If it matters, the reason for doing so is to distinguish this type of missing value. These values are not really "missing" as a consequence of, for example, an error in data collection or entry that leaves the data incomplete. Instead, they have no value because there is no member to be measured. I think this distinction is useful in some cases. For example, if you transpose the data to long form for variable 2, ID02 looks like this:

| ID   | Member | Var2 | NumberOfMembers |
|------|--------|------|-----------------|
| ID02 | 1      | .    | 2               |
| ID02 | 2      | 1    | 2               |
| ID02 | 3      | .    | 2               |

Assigning a specific value would distinguish the two kinds of missing values in Var2 (member 1 is an "other missing", member 3 is a "mechanical missing"), so the description of the data would be more accurate.

Actually, as I'm typing this, I think we can assign mechanical missings based on long form data: IF NumberOfMembers < Member THEN Var2 = 9;
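The long-form rule just mentioned can be sketched as follows. This is only an illustration under stated assumptions: the dataset name `long`, the helper arrays `v1` and `v2`, and the choice to recode both Var1 and Var2 are not from the original post.

```sas
/* Sketch: reshape the wide data to long form, then recode "mechanical" missings. */
data temp;
  input ID $ NumberOfMembers Mem1Var1 Mem1Var2 Mem2Var1 Mem2Var2 Mem3Var1 Mem3Var2;
  datalines;
ID01 1 1 0 . . . .
ID02 2 1 . 0 1 . .
ID03 3 0 1 1 0 0 0
;
run;

data long;                       /* hypothetical output name */
  set temp;
  array v1{3} Mem1Var1 Mem2Var1 Mem3Var1;
  array v2{3} Mem1Var2 Mem2Var2 Mem3Var2;
  do Member = 1 to 3;            /* one output row per possible member */
    Var1 = v1{Member};
    Var2 = v2{Member};
    /* No such member in this group: recode to 9 */
    if NumberOfMembers < Member then do;
      Var1 = 9;
      Var2 = 9;
    end;
    output;
  end;
  keep ID Member NumberOfMembers Var1 Var2;
run;

proc print data=long noobs;
run;
```

Values that are missing for an existing member (such as Var2 for member 1 of ID02) are left as `.` by this sketch, because the condition only fires when Member exceeds NumberOfMembers.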

Not sure what to do with wide form though.


## Re: How to assign values to the non-observed, or "legitimate skip"?

Posted in reply to NonSleeper

In addition to the normal missing value (represented by a period), SAS also supports 27 special missing values (represented by a period followed by a letter or an underscore).

So use one of those to indicate the difference.

Since you have a count variable, you could assign them after the fact.

data temp ;
input ID $ NumberOfMembers Mem1Var1 Mem1Var2 Mem2Var1 Mem2Var2 Mem3Var1 Mem3Var2;
datalines;
ID01 1 1 0 . . . .
ID02 2 1 . 0 1 . .
ID03 3 0 1 1 0 0 0
;
run;

data want;
set temp;
/* matrix(i,j) refers to Mem<i>Var<j> */
array matrix (3,2) mem: ;
/* members beyond NumberOfMembers do not exist: mark them with .N */
do i=NumberOfMembers+1 to 3;
do j=1 to 2;
matrix(i,j)=.N;
end;
end;
drop i j;
run;
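As a small illustration of how a special missing value such as `.N` behaves once assigned, the following sketch compares it against the ordinary missing value. The dataset name `check` and the variable names are made up for this example.

```sas
/* Sketch: how .N compares with the ordinary missing value */
data check;
  x = .N;
  is_missing  = missing(x);   /* 1: MISSING() is true for special missings too */
  equals_dot  = (x = .);      /* 0: .N is not the ordinary missing value */
  equals_N    = (x = .N);     /* 1: can be tested for explicitly */
  any_missing = (x <= .Z);    /* 1: .Z sorts highest among the missing values */
  put x= is_missing= equals_dot= equals_N= any_missing=;
run;
```

So a test such as `if x = .` only picks up the "other missings", while `missing(x)` picks up both kinds.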


## Re: How to assign values to the non-observed, or "legitimate skip"?

Posted in reply to NonSleeper

But the output doesn't look like what you mean.

### Code: Program

data temp ;
input ID $ NumberOfMembers Mem1Var1 Mem1Var2 Mem2Var1 Mem2Var2 Mem3Var1 Mem3Var2;
datalines;
ID01 1 1 0 . . . .
ID02 2 1 . 0 1 . .
ID03 3 0 1 1 0 0 0
;
run;

/* REPONLY: only replace missing values (no standardization); every missing numeric value becomes 9 */
proc stdize data=temp out=want reponly missing=9;
run;


## Re: How to assign values to the non-observed, or "legitimate skip"?

Posted in reply to NonSleeper

To go along with Tom's answer:

Along with the custom missing values such as .A, you can (fairly highly recommended from my point of view) also assign custom formats to describe the specific form of missing.

The following stub program reads in a question with 5 response codes (1, 2, 3, 7 and 9) plus a missing value (which could come from a skip pattern somewhat similar to the original question). It assigns special missing values to the codes that will generally be ignored for MOST analyses, applies a format that displays the meaning of each special missing, and shows an example of PROC FREQ analysis with and without the missing values included.

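proc format library=work;
/* informat: recode raw codes 7 and 9 to special missing values on input */
invalue Q1Missing
7=.D
9=.R
1,2,3=_same_
other= .
;
/* format: display a label for each valid answer and each kind of missing */
value q1missing
.D ="Don't Know"
.R ="Refused"
1  ="Yes"
2  ="No"
3  ="Not sure"
.  = "Not Answered"
;
run;

data example;
input x q1missing.;
format x q1missing.;
datalines;
1
2
3
7
9
.
;
run;

proc freq data=example;
/* first table counts the missing categories, second table excludes them */
tables x / missing;
tables x;
run;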
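The same idea could be applied to the family data from the original question. The sketch below is only an assumption about how the pieces might be combined: the format name `memmiss` and its labels are hypothetical, and `.N` is used for the non-existent members as in Tom's answer.

```sas
/* Sketch: label the two kinds of missing values in the family data */
proc format;
  value memmiss
    .N = "No such member"   /* hypothetical label for "mechanical missings" */
    .  = "Not recorded";    /* hypothetical label for "other missings" */
run;

data want;
  input ID $ NumberOfMembers Mem1Var1 Mem1Var2 Mem2Var1 Mem2Var2 Mem3Var1 Mem3Var2;
  /* matrix{i,j} refers to Mem<i>Var<j> */
  array matrix{3,2} Mem1Var1 Mem1Var2 Mem2Var1 Mem2Var2 Mem3Var1 Mem3Var2;
  do i = NumberOfMembers + 1 to 3;
    do j = 1 to 2;
      matrix{i,j} = .N;
    end;
  end;
  drop i j;
  format Mem: memmiss.;
  datalines;
ID01 1 1 0 . . . .
ID02 2 1 . 0 1 . .
ID03 3 0 1 1 0 0 0
;
run;

proc freq data=want;
  /* MISSING option shows each kind of missing as its own category */
  tables Mem1Var2 Mem3Var2 / missing;
run;
```

Values 0 and 1 have no entry in the format, so they are displayed as-is; only the two kinds of missing get labels.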
