Hi all,
I have a dataset with approximately 5.5 million observations of 3 numeric variables.
All I need are the means and a few quantiles for all 3 variables stored in a table.
I am currently running the following code and getting the desired result:
proc univariate data=MYDATA outtable=UNIV (keep=_var_ _min_ _p5_ _q1_ _median_ _mean_ _q3_ _p95_ _max_) noprint;
run;
This, however, seems like a waste of resources as it computes a number of statistics which I then drop immediately, and it also generates a warning about the number of observations being too large to calculate Qn, which I do not need here.
The number of nonmissing observations for variable X is too large to compute the robust measure of scale Qn. The statistic Qn is set to missing.
In sum, the code does what I want, but makes a number of unnecessary computations. In order to save time and resources, and also just out of interest, I was wondering whether there was any way to restrict the computations to a list of explicitly requested statistics.
Thanks in advance for your expertise.
Accepted Solutions
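/* Suppress printed output while still capturing the ODS table */
ods select none;
ods output summary=want;
proc means data=sashelp.heart min p5 p25 median mean p75 p95 max stackodsoutput;
var _numeric_;
run;
ods select all;
/* WANT has one row per analysis variable and one column per statistic */
proc print data=want; run;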
@Vogel wrote:
Hi all,
I have a dataset with approximately 5.5 million observations of 3 numeric variables.
All I need are the means and a few quantiles for all 3 variables stored in a table.
I am currently running the following code and getting the desired result:
proc univariate data=MYDATA outtable=UNIV (keep=_var_ _min_ _p5_ _q1_ _median_ _mean_ _q3_ _p95_ _max_) noprint; run;
This, however, seems like a waste of resources as it computes a number of statistics which I then drop immediately, and it also generates a warning about the number of observations being too large to calculate Qn, which I do not need here.
Then use PROC MEANS or PROC SUMMARY and you can control which statistics are computed.
Paige Miller
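A minimal sketch of that suggestion applied to the original question, assuming the dataset is still called MYDATA and that every numeric variable in it should be summarized. The output name STATS is only a placeholder. Because only the listed statistics are requested and PROC SUMMARY has no Qn statistic, the Qn warning should not appear.

```sas
/* Request only the statistics that are needed.                          */
/* MYDATA comes from the question; STATS is a hypothetical output name.  */
/* PROC SUMMARY prints nothing by default, so no NOPRINT is required.    */
proc summary data=MYDATA;
  var _numeric_;
  /* AUTONAME builds output names such as X_Min, X_P5, X_Median */
  output out=STATS
    min= p5= q1= median= mean= q3= p95= max= / autoname;
run;
```

Note that STATS holds a single observation with one column per variable/statistic pair, not one row per variable.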
Hi again,
First of all thanks, the layout of the output of your example code is exactly what I want!
The only remaining issue is that it's hard to reproduce the same layout in an output dataset, i.e. one variable that contains the name of the input variable, plus one variable for each of the computed statistics.
Is there any way to achieve this directly from a proc means / summary, or do I basically need to transform the output data myself to replicate that layout?
Thanks in advance.
@Vogel wrote:
Hi again,
First of all thanks, the layout of the output of your example code is exactly what I want!
The only remaining issue is that it's hard to reproduce the same layout in an output dataset, i.e. one variable that contains the name of the input variable, plus one variable for each of the computed statistics.
Is there any way to achieve this directly from a proc means / summary, or do I basically need to transform the output data myself to replicate that layout?
Thanks in advance.
This is not clear. What do you mean by "same layout"? Please explain further, or better yet, show us an example of what you want.
Paige Miller
OK, sorry about that.
I'm working in Enterprise Guide 7.15 HF3. When I run the following:
proc means data=sashelp.heart min p5 p25 median mean p75 p95 max;
var _numeric_;
run;
The report shows the output I'm attaching to this post in an Excel file.
My question is whether, using an output statement to create an output dataset, I can get the results in a similar layout directly from proc means.
So the resulting dataset would have one observation per numeric variable in SASHELP.HEART, character variables for the variable names and labels, and numeric variables for the requested statistics (the same set for every variable).
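For reference, one possible way to get that layout from an OUTPUT statement is to write the statistics out wide and then reshape them. This is only a sketch under some assumptions: the dataset names WIDE, LONG and WANT are placeholders, the AUTONAME suffix after the last underscore is taken to be the statistic name, and variable labels are not carried over.

```sas
/* Step 1: one wide row, columns named <variable>_<statistic> by AUTONAME */
proc summary data=sashelp.heart;
  var _numeric_;
  output out=wide(drop=_type_ _freq_)
    min= p5= q1= median= mean= q3= p95= max= / autoname;
run;

/* Step 2: turn the single row into one row per column (_NAME_, COL1) */
proc transpose data=wide out=long;
run;

/* Step 3: split each column name into variable name and statistic name */
data long;
  set long;
  length variable $32 stat $8;
  stat     = scan(_name_, -1, '_');   /* text after the last underscore */
  variable = substr(_name_, 1, length(_name_) - length(stat) - 1);
  drop _name_;
run;

proc sort data=long;
  by variable;
run;

/* Step 4: one row per variable, one column per statistic */
proc transpose data=long out=want(drop=_name_);
  by variable;
  id stat;
  var col1;
run;
```

If the ODS OUTPUT approach with STACKODSOUTPUT is acceptable, it is the shorter route, since it produces this shape (including labels) in one step.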
Some of us will not (or cannot) open Microsoft Office documents because they are a security risk.
Paste a portion of the output from PROC MEANS into the window that appears when you click on the {i} icon.
My question is whether, using an output statement to create an output dataset, I can get the results in a similar layout directly from proc means.
Please show us what you want.
Paige Miller
Did you open the table WANT? Is that what you want?