bzubrick
Obsidian | Level 7

I have a SAS data set that represents data on mortgages in the US. I want to make a histogram that shows the distribution of the loan amounts. I use the code below to produce it, but the bars in the resulting histogram are all bunched up on the left side, which is not helpful. Why is there so much blank space, and how can I fix this? I just want to take the bars on the left and spread them out across the chart so people can see which values they correspond to.

 

[Image: histogram fail.png]

title 'Analysis of Loan Amount';
ods graphics off;
proc univariate data=work.SampleData noprint;
   histogram Amount;
run;
1 ACCEPTED SOLUTION

ballardw
Super User

One quick and dirty approach would be to just trim out the high end at some "reasonable" value.

 

proc univariate data=work.SampleData (where=(amount le 2000000)) noprint;

 

for instance would discard values greater than 2,000,000.

 

Another option would be to use PROC SGPLOT with a HISTOGRAM statement and specify the bin width you want, such as 10000 or 20000. You may still need to remove the outliers, as the maximum number of bins allowed is 10,000. Since you apparently have some values near 20,000,000, your minimum bin width would have to be larger than about 2000 (20,000,000 / 10,000 = 2,000).
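
A minimal sketch of the SGPLOT approach, reusing the data set and variable names from the question. The 2,000,000 cutoff and the 20,000 bin width are illustrative assumptions only; pick values that suit your data.

```sas
/* Sketch only: the cutoff and bin width are illustrative, not recommendations. */
ods graphics on;   /* harmless if ODS Graphics is already enabled */
title 'Analysis of Loan Amount';

/* Exclude amounts above the cutoff so the axis is not stretched by outliers. */
proc sgplot data=work.SampleData(where=(Amount le 2000000));
   /* BINWIDTH= sets the width of each bar in data units. */
   histogram Amount / binwidth=20000;
   xaxis label='Loan Amount';
run;

title;   /* clear the title so it does not carry over to later output */
```

With a 2,000,000 cutoff and 20,000-wide bins, this yields at most about 100 bars, far below the bin limit mentioned above.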


2 REPLIES 2
Reeza
Super User

Have you tried changing the bins?

 

It's skewed because you have outliers. Can you group them into an "other" category, i.e., put everything > 3M in one bucket?
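
One way to sketch that grouping idea is to cap the values in a DATA step before plotting, so every amount above the cutoff lands in the same top bin. The output data set name (work.SampleCapped), the new variable (AmountCapped), the 3,000,000 cutoff, and the 100,000 bin width are all assumptions made for illustration.

```sas
/* Sketch only: names, cutoff, and bin width below are hypothetical. */
data work.SampleCapped;
   set work.SampleData;
   /* Amounts above the cutoff are set to the cutoff; everything else,  */
   /* including missing values, is copied through unchanged.            */
   if Amount > 3000000 then AmountCapped = 3000000;
   else AmountCapped = Amount;
run;

title 'Analysis of Loan Amount (amounts above 3M grouped together)';
proc sgplot data=work.SampleCapped;
   histogram AmountCapped / binwidth=100000 scale=count;
   xaxis label='Loan Amount (capped at 3,000,000)';
run;
title;
```

The rightmost bar then counts all loans at or above the cutoff, so it is worth saying so in the title or axis label to avoid misreading it as a spike at exactly 3M.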
