BookmarkSubscribeRSS Feed
urban58
Quartz | Level 8

Hello SAS users,

I have the following data

no values and 99=missing
max=10

I need to keep track of tests with
values=99 
values=10 - I think the code below does that
mean tests with and without 10


value
max 0='0-<10'
1='10'
2='99' ;
data have ;
input id test1_trial1 test1_trial2 test1_trial3 test2_trial1 test2_trial2 test2_trial3;
infile datalines truncover;
datalines;
1 6.750 6.400 6.867 8.165 8.900 8.300
2 99.000 99.000 99.000 99.000 99.000 99.000
3 2.235 1.605 1.640 1.810 1.815 1.855
4 3.870 3.075 3.060
5 10.000 10.000 10.000 10.000 10.000 10.000
6 2.760 2.535 2.970 2.760 2.700 3.275
7 3.945 3.075 3.065 10.000 10.000 10.000
8 2.050 1.945 1.560 3.025 3.005 3.235
9 2.965 2.635 2.005 4.675 4.835 4.460
10 2.525 1.840 1.580 4.335 3.465 2.465
11 1.265 0.825 1.110
12 1.650 2.165 1.870 99.000 99.000 99.000
13
;
run;

data want;
set have;

if nmiss(of test1_trial1 - test1_trial3) =3 then test1=. ;
else if max(of test1_trial1 - test1_trial3) <99 then test1 = mean(of test1_trial1 -test1_trial3) ;
else test1 =.N ;

if nmiss(of test2_trial1 -test2_trial3) =3 then test2= .;
else if max(of test2_trial1 -test2_trial3) <99 then test2 = mean(of test2_trial1 -test2_trial3) ;
else test2 =.N ;

array tt1 test1_trial1 -test1_trial3;
array tt2 test2_trial1 -test2_trial3;
do over tt1 ;
if tt1 =99.000 then tt1 =.N; *how else can I code this?;
if tt2=99.000 then tt2 =.N ;
end;

test1 = mean(of test1_trial1 -test1_trial3) ;
test2 = mean(of test2_trial1 -test2_trial3) ;


if test1 =10 then testm1 = 1 ; else if test1 = .N then testm1 =2 ;
else if test1 ne . then testm1 =0;
if test2 =10 then testm2 = 1 ; else if test2 = .N then testm2 =2 ;
else if test2 ne . then testm2 = 0;

format testm1 testm2 max.
end;

 

Appreciate any help with my questions and in making what what I have done more concise, if possible.

 

Thanks,

Margaret

 

7 REPLIES 7
Astounding
PROC Star

Why would you keep the top portion of the DATA step at all?  Why not just get rid of this part and let the rest of the DATA step do the work?


if nmiss(of test1_trial1 - test1_trial3) =3 then test1=. ;
else if max(of test1_trial1 - test1_trial3) <99 then test1 = mean(of test1_trial1 -test1_trial3) ;
else test1 =.N ;

if nmiss(of test2_trial1 -test2_trial3) =3 then test2= .;
else if max(of test2_trial1 -test2_trial3) <99 then test2 = mean(of test2_trial1 -test2_trial3) ;
else test2 =.N ;
urban58
Quartz | Level 8
Except when tracking 99's (investigators want to know frequencies), it will be treated as missing
ballardw
Super User

@urban58 wrote:
Except when tracking 99's (investigators want to know frequencies), it will be treated as missing

If you want to get frequencies of missing than Proc Freq (or Tabulate or Report) will show counts for special missing if asked properly. Generally just tell the procedure to include missing as a level.

 

data example;
   input x;
datalines;
1
2
3
.
.N
2
1
2
;


Proc freq data=example;
   tables x / missing;
run;


Proc tabulate data=example;
class x /missing;
table x, n;
run;

 

ballardw
Super User

I don't think you have quite clearly described what you are doing. At least I can't tell.

Maybe use fewer records and variables so you do calculations by hand and then show what you expect the result to be for that reduced data.

 

 

I suggest that you describe precisely what the issue may be with :

 

if tt1 =99.000 then tt1 =.N; *how else can I code this?;
if tt2=99.000 then tt2 =.N ;

 

You can use 28 "missing" values, ., ._ and .A through .Z

If you need to count the special missing like .N differently than simple . then you can test for that specifically.

Here are a couple of examples:

data example;
   a=.;
   b=.A;
   c=.Z;
   if b in (., .Z) then put "B is . or .Z";
   if b in (.,.A) then put "B is . or .A";

   if b ne a then put "Not equal";
run;
Tom
Super User Tom
Super User

What is the difference in meaning between 99 and missing? 

Do these three data lines mean something different?  If so can you explain the difference in words?

4 3.870 3.075 3.060 99 99 99
4 3.870 3.075 3.060 99
4 3.870 3.075 3.060

 

 

 

urban58
Quartz | Level 8

Sorry my explanation is not clear.

99 means the subject would not do the test(s)

when trial data for test1 and/or test2 is missing (no value) the subject could be ineligible, tester did not do the test or something else

I need to report frequencies of subjects who have tests with 10's and 99's (so far if 1 test trial =10/99 then all trial data for that test is the same)  

Test1 - see example wanted output

testm1

Frequency

Cumulative
Frequency

.

1

1

0='0-<10'

10

11

1='10

1

12

2='99'   

1                       

 13

 

For want2 - analysis (means, mean comparisons by tester, etc.), I need to

  1. Make 99’s missing
  2. Be able to do a sensitivity analysis by including/excluding subjects with tests= 10

Maybe this?

proc means;

where testm1=0;* excludes 10’s;

 

proc means;

where testm1=1;* includes 10’s;

does that help?

 

Tom
Super User Tom
Super User

I don't understand what TESTM1 means and what the table is counting.

 

Is it counting individual trial results?

Is it counting each TEST only once?  So it has some type of summary of the 3 trial replicates at the TEST level? 

Is it counting each individual only once no matter how many tests or trial replicates were done?  If so how is the individual summarized?

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 636 views
  • 0 likes
  • 4 in conversation