Junyong
Quartz | Level 8

I am drawing multiple histograms with PROC UNIVARIATE, one per BY group. To keep the horizontal range fixed so that the histograms are easier to compare, I specify ENDPOINTS=. Here is a working example.

data temp;
    do s=1 to 10;                /* 10 BY groups */
        do i=1 to 5000;          /* 5000 draws per group */
            x=rand("t",3);       /* heavy-tailed t distribution with 3 df */
            output;
        end;
    end;
run;
ods listing gpath='%SystemDrive%\Users\%USERNAME%\Desktop\';
ods graphics on;
axis1 order=(0 to 0.25 by 0.05) minor=none;
proc univariate data=temp;
    var x;
    by s;
    histogram/normal(mu=0,sigma=1)
        vaxis=axis1
        vscale=proportion
        endpoints=-5 to 5 by 0.5;
run;
ods graphics off;
quit;

Unfortunately, SAS does not honor ENDPOINTS= when there are outliers beyond the requested range. The code above produces histograms whose horizontal range differs from one BY group to the next:

[Images: Histogram.png, Histogram1.png, Histogram2.png]

Is filtering with WHERE -5<x<5 the only option? Can I instead force ENDPOINTS= to be honored and avoid WHERE? Thank you.

Accepted Solution
ballardw
Super User

If the main concern is the histogram fitting in your desired range:

proc univariate data=temp noprint;
where -5 le x le 5;
var x;
by s;
histogram/normal(mu=0,sigma=1 noprint)
    vaxis=axis1
    vscale=proportion
    endpoints=-5 to 5 by 0.5;
run;

If you want the tabular summary to reflect the full data, do not produce the tables and the histogram in the same call.

The example above does not print the tables associated with the procedure or the histogram. Remove the NOPRINT options if you want the tables.

Be aware that the tables will also be filtered by the WHERE statement.

Alternatively, use PROC SGPLOT or PROC SGPANEL, which offer more control over the plot.
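A minimal sketch of the SGPANEL route, assuming the temp data set from the question. BINSTART= gives the midpoint of the first bin, so -4.75 with BINWIDTH=0.5 puts a bin edge at -5. COLAXIS MIN=/MAX= then restrict the displayed range. The five-column layout and the vertical limits are arbitrary choices that mirror the original AXIS statement. Note that this is not equivalent to the WHERE approach: the proportions are computed from all observations, and bars outside [-5, 5] are simply not displayed. Check the result on your SAS release.

```sas
/* Sketch: one histogram cell per value of s, all sharing the same bins and axes. */
proc sgpanel data=temp;
  panelby s / columns=5;
  /* Bins of width 0.5 with edges at ..., -5, -4.5, ..., 4.5, 5, ... */
  histogram x / binstart=-4.75 binwidth=0.5 scale=proportion;
  /* Show only the range of interest on both axes. */
  colaxis min=-5 max=5;
  rowaxis min=0 max=0.25;
run;
```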

 

 


7 Replies
Reeza
Super User

What does the log say?

 

For what you want, you may need SGPLOT instead.

 




 


 

 

Junyong
Quartz | Level 8

I think this is the only way. As you mentioned, I need (1) the sample statistics from the full data and (2) the histograms from the filtered data, so NOPRINT usefully suppresses the many unwanted numbers in (2). Thanks. In (1), however, can I suppress the unnecessary histograms? Here is an example.

resetline;
dm"log;clear;output;clear;graph;end;odsresult;clear;";
option nodate nonumber ls=128 ps=max;
proc datasets lib=work kill nolist;
run;
data _01;
do i=1 to 5000;
x=rand("t",3);
output;
end;
run;
ods select none;
ods results=off;
ods output GoodnessOfFit=_02;
proc univariate data=_01;
var x;
histogram/normal(mu=0,sigma=1);
run;
ods results=on;
ods select all;
proc univariate data=_01 noprint;
var x;
where -5<x<5;
histogram/normal(mu=0,sigma=1,noprint) endpoints=-5 to 5 by 0.25;
run;
quit;

The second UNIVARIATE call produces just the histogram I need, but the first produces both the full-sample statistics and an ugly histogram. In the first call, I need to include HISTOGRAM/NORMAL(MU=0,SIGMA=1) to run Kolmogorov–Smirnov, Anderson–Darling, etc. against N(μ=0,σ²=1); it seems that NORMAL in the PROC UNIVARIATE statement just selects the parameters automatically. Is there an option similar to NOPRINT that suppresses the histograms rather than the tables? Thanks.

ballardw
Super User



Highlighting apparently doesn't work in "running man" code boxes, so here is the step in question:

proc univariate data=_01;
var x;
histogram/normal(mu=0,sigma=1); /* delete this line if you don't want a histogram */
run;
Junyong
Quartz | Level 8
I do need the line for the normality tests such as Kolmogorov–Smirnov. I just need to suppress those ugly histograms (byproducts) from the first UNIVARIATE as the second UNIVARIATE creates what I need.
Reeza
Super User
Use ODS SELECT <outputName> to control the output.
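A minimal sketch of that suggestion, assuming the _01 data set from the post above and assuming the histogram is produced through ODS Graphics. The table name GoodnessOfFit is the one already used in the ODS OUTPUT statement earlier in the thread. ODS TRACE is optional and only writes the output object names to the log so you can confirm them.

```sas
/* Optional: write the names of all output objects to the log. */
ods trace on;

/* Keep only the goodness-of-fit table; the histogram is not in the
   selection list, so it is not displayed. */
ods select GoodnessOfFit;
proc univariate data=_01;
  var x;
  histogram / normal(mu=0,sigma=1);
run;

ods trace off;
```

The HISTOGRAM statement is still present, so the tests against N(0,1) are still computed. I am less sure how ODS SELECT interacts with traditional SAS/GRAPH output when ODS Graphics is off. The HISTOGRAM statement documentation also describes a NOPLOT option for suppressing the plot, which may be worth checking for your release.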
Rick_SAS
SAS Super FREQ

It's not clear to me how you want to handle the observations outside of [-5, 5].  If you want to omit them, then, yes, use the WHERE clause.

 

The documentation for the ENDPOINTS= option says:

"The range of endpoints must cover the range of the data. For example, if you specify

endpoints=2 to 10 by 2

then all of the observations must fall in the intervals [2,4) [4,6) [6,8) [8,10]."

 

If the data do not fall within the range of the endpoints, the endpoints list is extended until there is a bin for all observations.
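To see which BY groups trigger that extension, it can help to look at the range of x in each group. This is a small sketch, assuming the temp data set from the question. Any group whose minimum is below -5 or whose maximum is above 5 will get extra bins.

```sas
/* Sketch: report the number of observations and the data range per BY group. */
proc means data=temp n min max;
  by s;
  var x;
run;
```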
