Ranjeeta
Pyrite | Level 9

PROC boxplot DATA = Costing ;
plot IDC*?;
RUN;

 

Hello,

I want to create a boxplot to identify outliers in my data.

What would I specify as the unit in the plot statement?

5 REPLIES
novinosrin
Tourmaline | Level 20

Switch to SGPLOT

proc sgplot data=costing;
   vbox IDC;
run;
Ranjeeta
Pyrite | Level 9
Summary of IDC Variable

The UNIVARIATE Procedure
[Histogram for IDC]


________________________________
Fitted Normal Distribution for IDC (IDC)
Parameters for Normal Distribution

Parameter    Symbol    Estimate
Mean         Mu        2627.593
Std Dev      Sigma     1840.645

Goodness-of-Fit Tests for Normal Distribution

Test                  Statistic             p Value
Kolmogorov-Smirnov    D        0.1759673    Pr > D       <0.010
Cramer-von Mises      W-Sq     2.7372742    Pr > W-Sq    <0.005
Anderson-Darling      A-Sq    15.7992854    Pr > A-Sq    <0.005

Quantiles for Normal Distribution

Percent    Observed Quantile    Estimated Quantile
 1.0              0.40000            -1654.389
 5.0            724.81000             -400.000
10.0           1161.90000              268.710
25.0           1665.96000             1386.096
50.0           2298.76500             2627.593
75.0           3007.88000             3869.089
90.0           4255.46000             4986.475
95.0           5150.47000             5655.185
99.0          10884.29000             6909.574



________________________________
[The SGPlot Procedure]

Does this mean that the many observations plotted above the box are all outliers?
novinosrin
Tourmaline | Level 20

I would call upon @Reeza and @PaigeMiller for the best possible advice.

PaigeMiller
Diamond | Level 26

@Ranjeeta wrote:

PROC boxplot DATA = Costing ;
plot IDC*?;
RUN;

 

Hello,

I want to create a boxplot to identify outliers in my data.

What would I specify as the unit in the plot statement?

proc univariate data=costing plots;
var idc;
run;

The PLOTS option causes a boxplot to be generated.

--
Paige Miller
FreelanceReinh
Jade | Level 19

Hello @Ranjeeta,


@Ranjeeta wrote:

PROC boxplot DATA = Costing ;
plot IDC*?;
(...)

What would I specify as the unit in the plot statement?


You mean the group variable. I think you would need to create a new variable with an arbitrary constant value in all observations (if there is no such variable in dataset Costing). This may be numeric or character (but not missing) and you could create it "on the fly" in a view: 

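/* View that adds a constant group variable _c to every observation */
data _tmp / view=_tmp;
   set costing;
   retain _c 0;
run;

/* One group value gives one box; SCHEMATIC marks outliers individually */
proc boxplot data=_tmp;
   plot IDC*_c / boxstyle=schematic;
run;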

Note that by default (BOXSTYLE=SKELETAL) outliers would not be identified with a special symbol.
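
To list the flagged observations rather than only see them plotted, the usual schematic rule can be applied by hand: values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR lie outside the fences. The following is a minimal sketch, assuming the dataset Costing and variable IDC from the question. The names _q, outliers, lower_fence and upper_fence are made up for illustration, and the quartiles from PROC MEANS may differ slightly from those of a plotting procedure if a different percentile definition is in effect.

```sas
/* Quartiles of IDC, written to a one-row dataset */
proc means data=costing noprint;
   var IDC;
   output out=_q q1=q1 q3=q3;
run;

/* Keep only observations outside the 1.5*IQR fences */
data outliers;
   if _n_ = 1 then set _q(keep=q1 q3);   /* q1 and q3 are retained for every row */
   set costing;
   lower_fence = q1 - 1.5*(q3 - q1);
   upper_fence = q3 + 1.5*(q3 - q1);
   if not missing(IDC) and (IDC < lower_fence or IDC > upper_fence);
run;
```

With the observed quartiles posted earlier in this thread (Q1 = 1665.96, Q3 = 3007.88), the fences work out to about -346.92 and 5020.76. The observed 95th percentile (5150.47) already exceeds the upper fence, so roughly 5% or more of the observations would be drawn as individual points above the box.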

 

The alternative solutions that have been suggested are easier to use: You don't need a group variable and the "schematic" plot style is the default.

 

With PROC UNIVARIATE the box plot comes in a panel together with other plots, which may or may not be convenient. The VBOX statement of PROC SGPLOT and the PLOT statement of PROC BOXPLOT are (in recent SAS releases) more flexible in terms of outlier definition: see options WHISKERPCT= and WHISKERPERCENTILE=, respectively. Note that in addition to exploratory analyses using these definitions or the default (that is: values outside the lower and upper fences) there exist many statistical tests for specific situations (e.g. tests for a lower and upper outlier-pair in a normal sample with unknown parameters, see statistical literature such as Barnett/Lewis).
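
As a sketch of the percentile-based whiskers mentioned above, the following PROC SGPLOT step asks for whiskers at the 5th and 95th percentiles instead of at the fences. The value 5 is an arbitrary choice for illustration, and the option is only available in SAS releases that support it.

```sas
/* Whiskers drawn to the 5th and 95th percentiles of IDC */
proc sgplot data=costing;
   vbox IDC / whiskerpct=5;
run;
```

Changing the whisker rule only changes which points are displayed beyond the whiskers. Whether those points are genuine outliers remains a judgment about the data.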
