I need to calculate outlier (2 standard deviation). To get the desire output, I want to calculate mean and standard deviation by using the group and condition.
If condition match, I want to assign that value for group (or I want to use that value to calculate lower and upper limit)
Lower_limit = mean - 2 * sd
Upper_limit = mean + 2 * sd
If var2 is less than lower limit or greater than upper limit than I want to delete that observation from my dataset.
This is what I want:
var1
var1 | var2 | mean | sd | lower_limit | upper_limit |
b | 1 | 3.4 | 3.97 | -4.54 | 11.34 |
b | 10 | 3.4 | 3.97 | -4.54 | 11.34 |
b | 2 | 3.4 | 3.97 | -4.54 | 11.34 |
b | 4 | 3.4 | 3.97 | -4.54 | 11.34 |
b | 0 | 3.4 | 3.97 | -4.54 | 11.34 |
c | 3 | . | . | . | . |
c | 3 | . | . | . | . |
c | 2 | . | . | . | . |
c | 1 | . | . | . | . |
c | 1 | . | . | . | . |
In this example, I do not have any demand less than lower limit or greater than upper limit. So I need to keep all observation.
data have;
input var1 var2;
datalines;
b 1
b 10
b 2
b 4
b 0
c 3
c 3
c 2
c 1
c 1
;
run;
proc sql;
create table want as
select*, avg(var2) as mean format 10.3, std(var2) as sd format 10.3
from have
group by var1
having var2 > 4;
quit
*cannot get the desire (same) value by group.
*delete observations based on lower and upper limit creteria.
data want;
set want;
lower_limit = mean - (2*sd);
upper_limit = mean + (2*sd);
if demand = . then output;
if demand < lower_limit then delete;
if demand > upper_limit then delete;
run;
What is your question?
How to calculate mean and standard deviation based on the condition by group?
data have;
input group $ var2;
datalines;
b 1
b 10
b 2
b 4
b 0
c 3
c 3
c 2
c 1
c 1
;
proc summary nway data=have;
class group;
var var2;
output out=_stats_ mean=mean stddev=sd;
run;
data want;
if _n_=1 then set _stats_(drop=_:);
set have;
/* If you want the confidence limits, you type the formula here */
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.