How do I track missing vs values that mean missing

urban58 · Posted 03-03-2021 02:03 PM

Hello SAS users,

I have the following data

no values and 99=missing
max=10

I need to keep track of tests with
values=99
values=10 - I think the code below does that
mean tests with and without 10

value
max 0='0-<10'
1='10'
2='99' ;
data have ;
input id test1_trial1 test1_trial2 test1_trial3 test2_trial1 test2_trial2 test2_trial3;
infile datalines truncover;
datalines;
1 6.750 6.400 6.867 8.165 8.900 8.300
2 99.000 99.000 99.000 99.000 99.000 99.000
3 2.235 1.605 1.640 1.810 1.815 1.855
4 3.870 3.075 3.060
5 10.000 10.000 10.000 10.000 10.000 10.000
6 2.760 2.535 2.970 2.760 2.700 3.275
7 3.945 3.075 3.065 10.000 10.000 10.000
8 2.050 1.945 1.560 3.025 3.005 3.235
9 2.965 2.635 2.005 4.675 4.835 4.460
10 2.525 1.840 1.580 4.335 3.465 2.465
11 1.265 0.825 1.110
12 1.650 2.165 1.870 99.000 99.000 99.000
13
;
run;

data want;
set have;

if nmiss(of test1_trial1 - test1_trial3) =3 then test1=. ;
else if max(of test1_trial1 - test1_trial3) <99 then test1 = mean(of test1_trial1 -test1_trial3) ;
else test1 =.N ;

if nmiss(of test2_trial1 -test2_trial3) =3 then test2= .;
else if max(of test2_trial1 -test2_trial3) <99 then test2 = mean(of test2_trial1 -test2_trial3) ;
else test2 =.N ;

array tt1 test1_trial1 -test1_trial3;
array tt2 test2_trial1 -test2_trial3;
do over tt1 ;
if tt1 =99.000 then tt1 =.N; *how else can I code this?;
if tt2=99.000 then tt2 =.N ;
end;

test1 = mean(of test1_trial1 -test1_trial3) ;
test2 = mean(of test2_trial1 -test2_trial3) ;

if test1 =10 then testm1 = 1 ; else if test1 = .N then testm1 =2 ;
else if test1 ne . then testm1 =0;
if test2 =10 then testm2 = 1 ; else if test2 = .N then testm2 =2 ;
else if test2 ne . then testm2 = 0;

format testm1 testm2 max.
end;

Appreciate any help with my questions and in making what what I have done more concise, if possible.

Thanks,

Margaret

Astounding · Posted 03-03-2021 03:26 PM

Why would you keep the top portion of the DATA step at all? Why not just get rid of this part and let the rest of the DATA step do the work?


if nmiss(of test1_trial1 - test1_trial3) =3 then test1=. ;
else if max(of test1_trial1 - test1_trial3) <99 then test1 = mean(of test1_trial1 -test1_trial3) ;
else test1 =.N ;

if nmiss(of test2_trial1 -test2_trial3) =3 then test2= .;
else if max(of test2_trial1 -test2_trial3) <99 then test2 = mean(of test2_trial1 -test2_trial3) ;
else test2 =.N ;

urban58 · Posted 03-03-2021 03:33 PM

Except when tracking 99's (investigators want to know frequencies), it will be treated as missing

ballardw · Posted 03-03-2021 04:11 PM

@urban58 wrote:
Except when tracking 99's (investigators want to know frequencies), it will be treated as missing

If you want to get frequencies of missing than Proc Freq (or Tabulate or Report) will show counts for special missing if asked properly. Generally just tell the procedure to include missing as a level.

data example;
   input x;
datalines;
1
2
3
.
.N
2
1
2
;


Proc freq data=example;
   tables x / missing;
run;


Proc tabulate data=example;
   class x /missing;
   table x, n;
run;

ballardw · Posted 03-03-2021 03:32 PM

I don't think you have quite clearly described what you are doing. At least I can't tell.

Maybe use fewer records and variables so you do calculations by hand and then show what you expect the result to be for that reduced data.

I suggest that you describe precisely what the issue may be with :

if tt1 =99.000 then tt1 =.N; *how else can I code this?;
if tt2=99.000 then tt2 =.N ;

You can use 28 "missing" values, ., ._ and .A through .Z

If you need to count the special missing like .N differently than simple . then you can test for that specifically.

Here are a couple of examples:

data example;
   a=.;
   b=.A;
   c=.Z;
   if b in (., .Z) then put "B is . or .Z";
   if b in (.,.A) then put "B is . or .A";

   if b ne a then put "Not equal";
run;

Tom · Posted 03-03-2021 03:43 PM

What is the difference in meaning between 99 and missing?

Do these three data lines mean something different? If so can you explain the difference in words?

4 3.870 3.075 3.060 99 99 99
4 3.870 3.075 3.060 99
4 3.870 3.075 3.060

urban58 · Posted 03-04-2021 11:15 AM

Sorry my explanation is not clear.

99 means the subject would not do the test(s)

when trial data for test1 and/or test2 is missing (no value) the subject could be ineligible, tester did not do the test or something else

I need to report frequencies of subjects who have tests with 10's and 99's (so far if 1 test trial =10/99 then all trial data for that test is the same)

Test1 - see example wanted output

testm1	Frequency	Cumulative Frequency
.	1	1
0='0-<10'	10	11
1='10	1	12
2='99'	1	13

For want2 - analysis (means, mean comparisons by tester, etc.), I need to

Make 99’s missing
Be able to do a sensitivity analysis by including/excluding subjects with tests= 10

Maybe this?

proc means;

where testm1=0;* excludes 10’s;

proc means;

where testm1=1;* includes 10’s;

does that help?

Tom · Posted 03-04-2021 12:01 PM

I don't understand what TESTM1 means and what the table is counting.

Is it counting individual trial results?

Is it counting each TEST only once? So it has some type of summary of the 3 trial replicates at the TEST level?

Is it counting each individual only once no matter how many tests or trial replicates were done? If so how is the individual summarized?

How do I track missing vs values that mean missing

Re: How do I track missing vs values that mean missing

Re: How do I track missing vs values that mean missing

Re: How do I track missing vs values that mean missing

Re: How do I track missing vs values that mean missing

Re: How do I track missing vs values that mean missing

Re: How do I track missing vs values that mean missing

Re: How do I track missing vs values that mean missing