mhtoto
Fluorite | Level 6

I have patients (DOBID) each with two eyes.

 

If one of the eye measurements is extreme (based on clinical/historical precedent), I need to exclude both eyes.

 

For instance:

 

DOBID   Measurement
1       10
1       20
2       12
2       17
3       22
3       18

 

if measurement>20 then outlier=1;

 

What code would I need to exclude both measurement values for DOBID=3 (not just the one where measurement=22)?

 

Thanks in advance.

ballardw
Super User

Do you have an extreme low value that would qualify as well? If not something like this may work:

 

/* get a maximum value for each id*/
proc summary data=have nway;
   class dobid;
   var measurement;
   output out= temp max=;
run;

proc sql;
   create table want as
   select b.* 
   from temp as a left join have as b
      on a.dobid=b.dobid
   where a.measurement le 20;
quit;
mhtoto
Fluorite | Level 6

Thank you Ballardw for your reply.

 

Yes I did simplify my example.  In fact, I have multiple measurements...measurement1...measurement2...etc, each with a min and max value.  I have written code that finds each row that has an outlier value, and assigns a 1.

 

DOBID   Measurement   Measurement2   Outlier (1=yes)
1       10            12             0
1       20            13             0
2       12            6              1
2       17            18             0
3       22            19             1
3       18            15             0

 

In this example 22 and 6 are outliers, so DOBID 2 and 3 should be excluded (all measurements).

 

So using your reasoning, something like this? Unless there is a better method?  I've never used proc sql before...

 

 

/* get a maximum value for each id*/
proc summary data=have nway;
   class dobid;
   var outlier;
   output out= temp max=;
run;

proc sql;
   create table want as
   select b.*
   from temp as a left join have as b
      on a.dobid=b.dobid
   where a.outlier lt 1;
quit;

 

LinusH
Tourmaline | Level 20

I think that you could solve this in a single SQL step, using group by and having max(measurement) <= 20.

If you have multiple measurements, just extend the having clause.
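
A minimal sketch of that single-step idea, assuming the input table is named have (as in the earlier replies) and that values above 20 are the outliers, as in the original question. PROC SQL remerges the group maximum back onto every row, so all rows for an ID are kept or dropped together; expect a NOTE about remerging summary statistics in the log. The second query is a variant assuming the 0/1 outlier flag described above already exists; the output name want_flagged is only illustrative.

```sas
proc sql;
   /* keep every row of an ID only if its largest measurement is not an outlier */
   create table want as
   select *
   from have
   group by dobid
   having max(measurement) le 20;

   /* variant: use a precomputed 0/1 outlier flag (assumed to exist in HAVE) */
   create table want_flagged as
   select *
   from have
   group by dobid
   having max(outlier) = 0;
quit;
```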

Data never sleeps
ballardw
Super User

"Better" could be based on a number of factors, so I won't make any claim to best.

Your extension is one way that I would approach the issue, as assigning the outlier status across many variables simplifies the code. Alternatively, a RETAIN of the outlier flag would allow the selection with a single pass of the data. The approach I suggested came from something else I worked on that had more records per ID, which would not necessarily have been sorted.
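
One way a data step version might look, as a sketch only: it uses a double DOW loop in place of an explicit RETAIN, and it assumes have is sorted (or at least grouped) by dobid and already carries the 0/1 outlier flag. The helper variable any_outlier is a made-up name. Each group is read twice within one data step, once to check for a flagged row and once to write the rows out.

```sas
data want;
   /* first loop: scan all rows of one dobid and note whether any is flagged */
   do until (last.dobid);
      set have;
      by dobid;
      if outlier = 1 then any_outlier = 1;
   end;
   /* second loop: re-read the same rows and output them only if none was flagged */
   do until (last.dobid);
      set have;
      by dobid;
      if any_outlier ne 1 then output;
   end;
   drop any_outlier;   /* any_outlier resets to missing at the start of each group */
run;
```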

 

Proc sql is very useful for combining two or more datasets, as there are more options/approaches than with data step merges.

