DATA Step, Macro, Functions and more

Recording for missing data / missing values

Contributor
Posts: 27

Recording for missing data / missing values

Hi Dear SAS Community:

 

I am trying to write some SAS code that can distinguish between two types of patterns of missing data that can occur in a single record, and treat them differently for the respective record. 

 

The hypothetical dataset is what commonly occurs in the context of exams that have multiple-choice questions that are each scored 1=correct or 0=incorrect. For example, say an exam of just 5 questions taken by 6 people, to make things easy. For several records of the dataset, there are two types of missing data patterns that occur:

 

Pattern 1 - missing data due to skipping: an earlier exam question has no response, but at least one later question does. Skipped questions should be coded 0 so that they are treated as incorrect.

 

Pattern 2 - missing data due to not reaching questions: the person responded to one or more earlier questions, but every question after the last response is blank because the person never reached it. Not-reached questions should be coded to the SAS system missing value.

 

Below is the hypothetical dataset I have:

 

10.1.
11111
11...
0...1
.1..0
.....

 

Following is the dataset I want after addressing the two patterns of missing data for particular records.

 

1001.
11111
11...
00001
01000
.....

 

Can anyone please provide some SAS code that will work to produce the dataset I want from the dataset I have, both shown above?

 

Thanks in advance!

 

Aaron




All Replies
Super User
Posts: 6,759

Re: Recording for missing data / missing values

It looks like you have assembled a single character variable holding all the answers/results.  To fix that:

 

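data want;
   set have;
   /* Scan from the last answer back to the first. */
   do k=length(answers) to 1 by -1;
      /* Once a response has been seen, any earlier '.' is a skipped question. */
      if substr(answers, k, 1) in ('0', '1') then reset_flag='Y';
      else if substr(answers, k, 1) = '.' and reset_flag='Y' then substr(answers, k, 1) = '0';
   end;
   drop reset_flag;
run;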

Contributor
Posts: 27

Re: Recording for missing data / missing values

Posted in reply to Astounding

Thanks, this is an interesting solution. It does work. But, it deviates from how my data is imported into SAS. The exam questions are imported as numerical variables. Any ideas about what the solution would be for numerical variables?

 

Kind Regards,

Aaron

Solution
4 weeks ago
Super User
Posts: 6,759

Re: Recording for missing data / missing values


As a set of numeric values, the syntax changes a little.  Let's call the variables V1 through V6:

 

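data want;
   set have;
   array v {6};
   /* Scan from the last variable back to the first. */
   do k=6 to 1 by -1;
      /* Once a response has been seen, any earlier missing value is a skipped question. */
      if v{k} in (0, 1) then reset_flag='Y';
      else if v{k} = . and reset_flag='Y' then v{k} = 0;
   end;
   drop reset_flag;
run;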

 

 

Contributor
Posts: 27

Re: Recording for missing data / missing values

Posted in reply to Astounding

Bravo Astounding!!

 

Your approach is the correct solution. I did have to clean it up slightly, but well done, and I appreciate the assistance. Your cleaned-up solution is below.

 

data have;
input item1 1 item2 2 item3 3 item4 4 item5 5;
CARDS;
10.1.
11111
11...
0...1
.1..0
.....
;
run;

 

data want;
   set have;
   array v {5} item1-item5;
   do k=5 to 1 by -1;
      if v{k} in (0, 1) then reset_flag='Y';
      else if v{k} = . and reset_flag='Y' then v{k} = 0;
   end;
   drop reset_flag k;
run;
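
To see how the two recodes affect scoring, the following is a minimal sketch rather than part of the posted solution. It assumes every non-missing value is a valid 0/1 score, so it tests `not missing()` instead of `in (0, 1)`, and it uses `dim(v)` so the item count is not hard-coded. The names `scored`, `answered_later`, `n_reached`, `n_correct`, and `pct_correct` are hypothetical.

```sas
/* Sample data from this thread: five items, '.' = no response. */
data have;
   input item1 1 item2 2 item3 3 item4 4 item5 5;
   datalines;
10.1.
11111
11...
0...1
.1..0
.....
;
run;

data scored;
   set have;
   array v {*} item1-item5;
   /* Assumption: any non-missing value is a valid 0/1 score. */
   answered_later = 0;
   do k = dim(v) to 1 by -1;
      if not missing(v{k}) then answered_later = 1;
      else if answered_later then v{k} = 0;   /* skipped item: score as incorrect */
   end;
   n_reached = n(of v{*});     /* items with a score after recoding */
   n_correct = sum(of v{*});   /* stays missing when no item was reached */
   if n_reached > 0 then pct_correct = 100 * n_correct / n_reached;
   drop k answered_later;
run;

proc print data=scored noobs;
run;
```

For the first examinee (10.1.), the skipped third item counts as incorrect and the unreached fifth item is left out of the denominator, so 2 of 4 reached items are correct and pct_correct is 50. The all-missing record keeps a missing pct_correct.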

Super User
Posts: 13,517

Re: Recording for missing data / missing values

Depending on what you normally do with these variables, you might instead consider using special missing values. Like ordinary missing values, they are excluded from calculations such as totals, means, and the denominators of percentages, but a format can label them when the data are printed or examined, so the reason for the missing value stays visible.

 

data example;
    input record x y;
    /* set specific missing values just as an example*/
    if record=2 then x= .S;
    if record=3 then y= .S;
    if record=4 then do;
       x=.I;
       y=.I;
    end;
datalines ;
1 3 18
2 .  2
3 7  .
4 .  .
;
run;

proc means data=example mean sum std;
   var x y;
run;

proc freq data=example;
   tables x y;
run;

Proc format library=work;
   value Special
   .S='Skip Pattern'
   .I='Incomplete'
   ;
run;

proc print data=example;
   format x y special.;
run;

Or you could assign a special missing value only to the incomplete records, to differentiate them from other forms of missing.
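
Applied to the exam data in this thread, that last idea might look like the sketch below. It is an illustration under assumptions, not code from the thread: skipped items are still coded 0, while not-reached items get the special missing value `.N`. The choice of `.N`, the format name `itemfmt`, and the names `want_special` and `answered_later` are hypothetical. The loop tests `missing()` rather than `= .`, because a special missing value such as `.N` does not compare equal to the ordinary missing value.

```sas
/* Sample data from this thread: five items, '.' = no response. */
data have;
   input item1 1 item2 2 item3 3 item4 4 item5 5;
   datalines;
10.1.
11111
11...
0...1
.1..0
.....
;
run;

data want_special;
   set have;
   array v {*} item1-item5;
   answered_later = 0;
   do k = dim(v) to 1 by -1;
      if not missing(v{k}) then answered_later = 1;
      else if answered_later then v{k} = 0;   /* skipped: treated as incorrect */
      else v{k} = .N;                          /* not reached: special missing */
   end;
   drop k answered_later;
run;

proc format;
   value itemfmt
      .N = 'Not reached'
      0  = 'Incorrect'
      1  = 'Correct';
run;

/* The MISSING option keeps the .N level in the frequency tables. */
proc freq data=want_special;
   tables item1-item5 / missing;
   format item1-item5 itemfmt.;
run;
```

Because `.N` is still a missing value, functions such as `sum()` and `mean()` and procedures such as PROC MEANS continue to exclude the not-reached items.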

 

Contributor
Posts: 27

Re: Recording for missing data / missing values

Thanks for your reply, BallardW.

 

Aaron

☑ This topic is solved.
