IQR with Multiple Variables

thb · Posted 09-11-2019 06:01 AM

Hello,

I am trying to identify outliers within my dataset, which has multiple BY variables. Please see an example of the data below:

ID	Measure	Date	Numerator	Denominator
1	A	1-Jan	1	0
1	A	2-Jan	50	1
1	A	3-Jan	2	80
1	B	1-Jan	1	1
1	B	2-Jan	50	50
1	B	3-Jan	2	2
2	A	1-Jan	1	1
2	A	2-Jan	2	2
2	B	1-Jan	1	1
2	B	2-Jan	2	0
3	A	1-Jan	1	1
3	A	2-Jan	2	0
3	B	1-Jan	1	2
3	B	2-Jan	50	3
3	C	1-Jan	2	1
3	C	2-Jan	2	1

I'm trying to identify outliers in the Numerator and Denominator values within each ID and Measure group. So far, I have the following code, but it's not producing the desired results.

Also, is there a way to create a separate table with the 'n median qrange p25 p75' by ID and Measure?

Any assistance would be greatly appreciated. Thank you!

proc MEANS Data=have
n median qrange p25 p75;
var Numerator;
class ID Measure;
ods output summary=ranges;
run;




data Out;
  set have;
  Outlier = IFC(Numerator > (Numerator*3), 'Y','N');
run;

PaigeMiller · Posted 09-11-2019 06:44 AM

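/* Per-group statistics: NWAY keeps only the ID*Measure combinations. */
proc summary data=have nway;
    class id measure;
    var numerator denominator;
    output out=stats(drop=_type_ _freq_) n= median= p25= p75= qrange= / autoname;
run;

/* MERGE needs both data sets sorted by the BY variables. */
proc sort data=have;
    by id measure;
run;

data want;
    merge have stats;
    by id measure;
run;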
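
To go from the per-group statistics to an actual outlier flag, one option is the common 1.5 × IQR rule (Tukey fences). The thread does not say which threshold is wanted, so the 1.5 multiplier, the output table name `flagged`, and the flag names `Num_Outlier` and `Den_Outlier` below are all assumptions for illustration. This is a minimal sketch that rebuilds the sample data so it can be run on its own:

```sas
/* Sample data from the question (Date kept as text for simplicity). */
data have;
    input ID Measure $ Date $ Numerator Denominator;
    datalines;
1 A 1-Jan 1 0
1 A 2-Jan 50 1
1 A 3-Jan 2 80
1 B 1-Jan 1 1
1 B 2-Jan 50 50
1 B 3-Jan 2 2
2 A 1-Jan 1 1
2 A 2-Jan 2 2
2 B 1-Jan 1 1
2 B 2-Jan 2 0
3 A 1-Jan 1 1
3 A 2-Jan 2 0
3 B 1-Jan 1 2
3 B 2-Jan 50 3
3 C 1-Jan 2 1
3 C 2-Jan 2 1
;
run;

/* One row of statistics per ID*Measure group; NWAY drops the overall
   and one-way summary rows. AUTONAME yields names like Numerator_P25. */
proc summary data=have nway;
    class ID Measure;
    var Numerator Denominator;
    output out=stats(drop=_type_ _freq_)
           n= median= p25= p75= qrange= / autoname;
run;

/* Join each group's statistics back to its rows and apply the
   assumed 1.5*IQR fences to both variables. */
proc sql;
    create table flagged as
    select h.*,
           case when h.Numerator < s.Numerator_P25 - 1.5*s.Numerator_QRange
                  or h.Numerator > s.Numerator_P75 + 1.5*s.Numerator_QRange
                then 'Y' else 'N' end as Num_Outlier,
           case when h.Denominator < s.Denominator_P25 - 1.5*s.Denominator_QRange
                  or h.Denominator > s.Denominator_P75 + 1.5*s.Denominator_QRange
                then 'Y' else 'N' end as Den_Outlier
    from have as h
         inner join stats as s
         on h.ID = s.ID and h.Measure = s.Measure
    order by h.ID, h.Measure;
quit;
```

The `stats` data set doubles as the separate table of n, median, qrange, p25 and p75 by ID and Measure. Note that with only two or three rows per group, as in the sample, the quartiles sit at or near the group minimum and maximum, so the fences are wide and few if any rows would be flagged; the rule becomes more useful with more observations per group.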

--
Paige Miller
