UNIVARIATE - restrict computed statistics to what is needed?


Posted 03-20-2019 07:04 AM
(839 views)

Hi all,

I have a dataset with approximately 5.5 million observations of 3 numeric variables.

All I need are the means and a few quantiles for all 3 variables stored in a table.

I am currently running the following code and getting the desired result:

```
proc univariate data=MYDATA outtable=UNIV (keep=_var_ _min_ _p5_ _q1_ _median_ _mean_ _q3_ _p95_ _max_) noprint;
run;
```

This, however, seems like a waste of resources as it computes a number of statistics which I then drop immediately, and it also generates a warning about the number of observations being too large to calculate *Qn*, which I do not need here.

`The number of nonmissing observations for variable X is too large to compute the robust measure of scale Qn. The statistic Qn is set to missing.`

In sum, the code does what I want, but makes a number of unnecessary computations. In order to save time and resources, and also just out of interest, I was wondering whether there was any way to restrict the computations to a list of explicitly requested statistics.

Thanks in advance for your expertise.


8 REPLIES


@Vogel wrote:

Hi all,

I have a dataset with approximately 5.5 million observations of 3 numeric variables.

All I need are the means and a few quantiles for all 3 variables stored in a table.

I am currently running the following code and getting the desired result:

`proc univariate data=MYDATA outtable=UNIV (keep=_var_ _min_ _p5_ _q1_ _median_ _mean_ _q3_ _p95_ _max_) noprint; run;`

This, however, seems like a waste of resources as it computes a number of statistics which I then drop immediately, and it also generates a warning about the number of observations being too large to calculate Qn, which I do not need here.

Then use PROC MEANS or PROC SUMMARY and you can control which statistics are computed.

--

Paige Miller
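As a minimal sketch of that suggestion (not taken from the original reply), PROC SUMMARY with an OUTPUT statement might look like the following. `MYDATA` is the dataset named in the question; the output name `stats_wide` is made up for illustration, and `var _numeric_` assumes all numeric variables are wanted, as in the original PROC UNIVARIATE call.

```sas
/* Sketch only: MYDATA comes from the question; STATS_WIDE is a hypothetical name. */
proc summary data=MYDATA;
  var _numeric_;
  /* Request only the listed statistics. AUTONAME builds output names
     of the form <variable>_<statistic>, e.g. <variable>_Mean. */
  output out=stats_wide (drop=_type_ _freq_)
    min= p5= q1= median= mean= q3= p95= max= / autoname;
run;
```

Without a CLASS statement this produces a single observation, with one column per combination of variable and statistic, rather than one row per variable.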



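Accepted solution:

```
/* Suppress printed output; ODS OUTPUT still captures the Summary table. */
ods select none;
ods output summary=want;
proc means data=sashelp.heart min p5 p25 median mean p75 p95 max stackodsoutput;
  var _numeric_;
run;
ods select all;

/* WANT has one row per analysis variable and one column per statistic. */
proc print data=want;
run;
```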


Hi again,

First of all thanks, the layout of the output of your example code is exactly what I want!

The only remaining issue is that it is hard to reproduce the same layout in an output dataset, i.e. one variable that contains the input data variable name, plus one variable for each of the computed statistics.

Is there any way to achieve this directly from a proc means / summary, or do I basically need to transform the output data myself to replicate that layout?

Thanks in advance.


@Vogel wrote:

Hi again,

First of all thanks, the layout of the output of your example code is exactly what I want!

The only remaining issue is that it is hard to reproduce the same layout in an output dataset, i.e. one variable that contains the input data variable name, plus one variable for each of the computed statistics.

Is there any way to achieve this directly from a proc means / summary, or do I basically need to transform the output data myself to replicate that layout?

Thanks in advance.

This is not clear. What do you mean by "same layout"? Please explain further, or better yet, show us an example of what you want.

--

Paige Miller



OK, sorry about that.

I'm working in Enterprise Guide 7.15 HF3. When I run the following:

```
proc means data=sashelp.heart min p5 p25 median mean p75 p95 max;
var _numeric_;
run;
```

The report shows the output I'm attaching to this post in an Excel file.

My question is whether, using an output statement to create an output dataset, I can get the results in a similar layout directly from proc means.

So the resulting dataset would have one observation per numeric variable in SASHELP.HEART, with character variables for the variable names and labels, and numeric variables for the requested statistics (which are the same for all variables).
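For comparison, here is a rough sketch of what reshaping the OUTPUT-statement result by hand might look like. It is only an illustration under stated assumptions: the output variables follow the AUTONAME pattern `<variable>_<statistic>`, no generated name is long enough to be truncated, and the dataset names (`stats_wide`, `stats_t`, `stats_long`, `want_manual`) are invented for this example.

```sas
/* Step 1: one wide row, with columns named <variable>_<statistic>. */
proc summary data=sashelp.heart;
  var _numeric_;
  output out=stats_wide (drop=_type_ _freq_)
    min= p5= q1= median= mean= q3= p95= max= / autoname;
run;

/* Step 2: turn every column into a row (_NAME_ = column name, COL1 = value). */
proc transpose data=stats_wide out=stats_t;
run;

/* Step 3: split each column name at its last underscore. */
data stats_long (keep=variable stat col1);
  set stats_t;
  length variable stat $32;
  stat     = scan(_name_, -1, '_');
  variable = substr(_name_, 1, length(_name_) - length(stat) - 1);
run;

/* Step 4: one row per variable, one column per statistic. */
proc sort data=stats_long;
  by variable;
run;

proc transpose data=stats_long out=want_manual (drop=_name_);
  by variable;
  id stat;
  var col1;
run;
```

This takes four steps and does not carry over the variable labels, whereas ODS OUTPUT with STACKODSOUTPUT writes the per-variable layout directly.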


Some of us will not (or cannot) open Microsoft Office documents because they are a security risk.

Paste a portion of the output from PROC MEANS into the window that appears when you click on the {i} icon.

@Vogel wrote:

My question is whether, using an output statement to create an output dataset, I can get the results in a similar layout directly from proc means.

Please show us what you want.

--

Paige Miller



Did you open the table WANT? Is that what you want?


Yes! I don't know how on earth I missed that. Thanks again and sorry for the confusion.
