## Select minimum Sample of cases to QC all conditions : use arrays?

Solved
Occasional Contributor
Posts: 16

Task: select a minimum sample of cases that, taken together, provide at least one example of every specified condition, for QC.

For example:

I might want to find at least one case where var1 is nonmissing.

I might need to see at least one case where the variable HHincome > 50,000.

Purpose/context:

I'm working on an application that displays information about a single person on screen, with values populated from background datasets. To verify that everything is working properly, I have to eyeball at least one example showing that a date, for example, is displayed in the correct format, and at least one example proving that a second email shows up. If these are rare, I might have to look at two different cases. But I have at least 100 variables or conditions of interest that occur rarely, and for reasons I won't get into, it is impractical to view and document over a hundred examples.

I was thinking that I could set up a series of arrays that keep track of which cases can be used to verify each condition. I could then analyze the info in the arrays to come up with a sample of cases that, taken together, make it possible to do a complete QC.
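A minimal sketch of the array idea, to make it concrete. The dataset name `have`, the ID variable `case_id`, and the variable `email2` are hypothetical placeholders, and only three conditions are shown; a real run would extend the array to all 100+ conditions.

```sas
/* Sketch only: HAVE, CASE_ID and EMAIL2 are assumed names, not real ones */
data flags;
    set have;
    array qc{3} qc1-qc3;           /* one 0/1 flag per condition         */
    qc{1} = not missing(var1);     /* condition 1: var1 is nonmissing    */
    qc{2} = (hhincome > 50000);    /* condition 2: HHincome over 50,000  */
    qc{3} = not missing(email2);   /* condition 3: a second email exists */
    n_met = sum(of qc{*});         /* how many conditions this case hits */
    if n_met > 0;                  /* drop cases that show nothing       */
    keep case_id qc1-qc3 n_met;
run;
```

Each surviving row then says which conditions that case can be used to verify, and `n_met` gives a rough ranking of how useful the case is.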

Does that make sense? Do you have other ideas?

Thanks


Accepted Solutions
Solution
‎02-05-2018 09:25 AM
Super User
Posts: 6,629

## Re: Select minimum Sample of cases to QC all conditions : use arrays?

A different approach entirely ... take some real data and create the conditions you want to check.  For example:

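data checkthis;
    set have;
    if _n_=1 then do;
        var1=.;              /* first test row: var1 missing */
        output;
        var1=5;              /* second test row: var1 nonmissing, high income */
        hhincome = 75000;
        output;
    end;
    else if _n_=2 then do;
        * create some additional conditions to check;
        output;
    end;
    /* rows after the second are not written, since OUTPUT is explicit */
run;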

All Replies

Occasional Contributor
Posts: 16

## Re: Select minimum Sample of cases to QC all conditions : use arrays?

Thanks Astounding,

Your suggestion led to the following strategy for identifying cases that meet 3 infrequent conditions:

data qc;
    set rcls;
    if purpose_appt ne '' then do; qc1=1; output; end;
    if purpose_gen ne '' then do; qc2=1; output; end;
    if contact_mode = 6 then do; qc3=1; output; end;
run;

proc print data=qc;
var sid2018 qc1 qc2 qc3 purpose_appt purpose_gen contact_mode ;
run;

SAS Output

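| Obs | sid2018 | qc1 | qc2 | qc3 | purpose_appt | purpose_gen | Contact_Mode |
|---|---|---|---|---|---|---|---|
| 1 | 0107730020 | 1 | | | Appointment | | 1 |
| 2 | 0107730030 | 1 | | | Appointment | | 1 |
| 3 | 0109620010 | | 1 | | | General info | 1 |
| 4 | 0109620010 | 1 | | | Appointment | | 6 |
| 5 | 0109620010 | 1 | | 1 | Appointment | | 6 |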

This shows me that:

1. 0109620010 (obs 5) would be a good choice for review because it satisfies two of the conditions.

2. 0109620010 (obs 3) is also needed because condition 2 is not met in case 0109620010 as shown in obs 5 (or in any other case, for that matter).

3. There is no need to check obs 1, obs 2, or obs 4.

This is a great strategy for finding examples of relatively rare conditions of interest. It is exactly the type of thing I need.
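With 100+ conditions, picking the cases by eye from a printout may get tedious. A possible follow-up, sketched below under assumptions: write one row per case with 0/1 flags, sort so that cases meeting the most conditions come first, then keep a case only if it covers a condition that no earlier kept case has covered. The dataset names `percase` and `pick` and the variables `n_met` and `n_new` are made up for illustration. This single pass is a simple heuristic and is not guaranteed to give the true minimum sample.

```sas
/* Sketch only: PERCASE, PICK, N_MET and N_NEW are hypothetical names */
data percase;
    set rcls;
    qc1 = (purpose_appt ne '');     /* 1 if met, 0 otherwise */
    qc2 = (purpose_gen ne '');
    qc3 = (contact_mode = 6);
    n_met = sum(qc1, qc2, qc3);     /* conditions met by this case */
    if n_met > 0;
run;

/* cases that cover the most conditions come first */
proc sort data=percase;
    by descending n_met;
run;

data pick;
    set percase;
    array qc{3} qc1-qc3;
    array covered{3} _temporary_ (3*0);  /* retained across rows */
    n_new = 0;
    do i = 1 to dim(qc);
        /* count conditions this case covers for the first time */
        if qc{i} = 1 and covered{i} = 0 then do;
            covered{i} = 1;
            n_new = n_new + 1;
        end;
    end;
    if n_new > 0;                   /* keep only cases that add coverage */
    drop i;
run;
```

Any condition that never shows up in `pick` has no example in the data at all, which is useful to know before starting the QC.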

Thanks!
