How to include null bins in outhistogram data within proc univariate

Occasional Learner
Posts: 1

I am trying to create histograms with the same set of bins for multiple units. I then need to find the mean and standard error across all units for each bin. The problem is that the data set produced by OUTHISTOGRAM doesn't include the bins past the last bin that contains data. This makes the means and standard errors inaccurate, because I need a value for every bin from 0 to 5, even when the count in that bin is 0.

 

Here is the code I'm using:

/* One histogram per unit, with bin midpoints 0, 0.25, ..., 5 */
proc univariate data=work.serotonin noprint;
  histogram Unit1 - Unit25 / midpoints = 0 to 5 by .25
                             rtinclude
                             outhistogram = SerotoninHist;
run;

[Screenshot attachment: Screen Shot 2018-01-19 at 11.59.42 AM.png]

title "5-HT";
/* Mean and standard error of the bin percentages at each midpoint */
proc means data=work.serotoninhist mean stderr;
  class _MIDPT_;
  var _OBSPCT_;
run;

[Screenshot attachment: Screen Shot 2018-01-19 at 12.02.47 PM.png]

I need the OUTHISTOGRAM data set from PROC UNIVARIATE to include all bins from 0 to 5 for every unit. The N Obs column in the PROC MEANS output should then read 25 for each bin, which would give accurate means and standard errors.

 

I'm pretty new to SAS, so any help will be much appreciated!

 

I'm using SAS University edition.

Super User
Posts: 13,008

Re: How to include null bins in outhistogram data within proc univariate


BrettL wrote:

... I then need to find the mean and standard errors of all the units at each particular bin.

 

I need the Proc Univariate outhistogram to include all bins from 0 - 5, which then needs to lead to the number of obs column in the Proc means output to read 25 for  each bin, allowing accurate means and standard errors.

 

 


I have to say I do not understand exactly what you mean by this. I'm not sure what "each particular bin" means without an example more concrete than the code.

 

Are you saying that you want N Obs to be 25 when there were not actually 25 observations in a group? If you want to create summaries for groups that do not exist in your data, you'll likely need another approach.

And what would be an accurate mean or standard error?

 

Perhaps you could provide a small example data set, maybe with 20 or so records and fewer midpoints, so that you can hand-calculate a mean (at least) and show what you want the actual output to look like.

 

Also, when I see variables such as your Unit1 to Unit25 (is that the 25 you are referring to?), part of the solution is often to normalize the data. What exactly does Unit1 represent that is different from Unit2?

It may be that you should transpose the data so that instead of 25 variables you have two: one that identifies the unit and one that holds the value.

Then most of the summaries might be doable with BY group processing for the variable containing the unit number.
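
For illustration, here is a rough sketch of that approach; it is untested against your data. It assumes work.serotonin holds numeric columns Unit1-Unit25 as in your code, and that no existing variable is called unit or value. The data set names long, hist_long, grid and hist_full are made up for this example. The last steps are one possible "other approach" for the empty bins: build every unit-by-midpoint combination and treat any bin missing from the OUTHISTOGRAM data as 0 percent, on the assumption that this is what you want.

```sas
/* Sketch only: long, hist_long, grid and hist_full are hypothetical names. */

/* 1. Wide to long: one row per (unit, value) pair */
data long;
  set work.serotonin;
  array u{25} Unit1-Unit25;
  do unit = 1 to 25;
    value = u{unit};
    output;
  end;
  keep unit value;
run;

/* BY-group processing needs the data sorted by the BY variable */
proc sort data=long;
  by unit;
run;

/* 2. One histogram per unit, same midpoints as in the original code */
proc univariate data=long noprint;
  by unit;
  histogram value / midpoints = 0 to 5 by .25
                    rtinclude
                    outhistogram = hist_long;
run;

/* 3. Every unit-by-midpoint combination (21 midpoints from 0 to 5) */
data grid;
  do unit = 1 to 25;
    do i = 0 to 20;
      _MIDPT_ = i * 0.25;
      output;
    end;
  end;
  drop i;
run;

proc sort data=hist_long;
  by unit _MIDPT_;
run;

/* 4. Bins absent from the histogram output become explicit zeros */
data hist_full;
  merge grid hist_long;
  by unit _MIDPT_;
  if missing(_OBSPCT_) then _OBSPCT_ = 0;
run;

/* 5. Summarize across units at each midpoint */
proc means data=hist_full n mean stderr;
  class _MIDPT_;
  var _OBSPCT_;
run;
```

The merge relies on the midpoints matching exactly, which should hold here because multiples of 0.25 are stored exactly as floating-point numbers; with a step such as 0.1 you would probably want to round both sides before merging.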
