Hi All,, I'm new to SAS and my task is to calculate shares of outliers for my date. I have already prepared the following code but stopped after joining table. Creating descriptive statistics title 'Descriptive statistics for each variable from "valid" dataset'; ods exclude _all_; proc means data=valid n min p1 p5 p10 q1 p50 q3 p90 p95 p99 max mean std qrange range STACKODSOUTPUT; var _numeric_ ; ods output summary=summary_valid; run; ods exclude none; Creating a table with outlier bounds data proc sql; create table outlier_valid as select variable, Q1-1.5*QRANGE as lb_mid, Q3+1.5*QRANGE as ub_mid, Q1-3*QRANGE as lb_extr, Q3+3*QRANGE as ub_extr from summary_valid order by variable desc, variable; quit; proc print data=outlier_valid noobs; title1 'The summary of outliers data valid'; title2 'lb_mid = lower bound for mid outliers'; title3 'ub_mid = upper bound for mid outliers'; title4 'lb_extr = lower bound for extreme outliers'; title5 'ub_extr = upper bound for extreme outliers'; run; Joining two above tables proc sql; create table train_join as select s.variable, s.n, s.MIN, s.P1, s.P5, s.P10, s.Q1, s.P50, S.Q3, s.P90, s.P95, s.P99, s.MAX, s.MEAN, s.STDDEV, s.QRANGE, s.RANGE, o.lb_mid, o.ub_mid, o.lb_extr, o.ub_extr from summary_train s, outlier_train o where s.variable = o.variable; quit; Now I want to create a variable representing count of outliers which: are lower than lb_mid are higher than up_mid are lower than lb_extr are higher than up_extr I would also like to create variable mild_outliers (true/false) and extrr_outliers (true/false) Could you please give me some tips? Kind regards, Cezary
... View more