Proc Univariate help for a newbie
Posted 04-12-2018 02:06 PM

I'm working with a very large dataset that has a MONTH variable from 1-12 and also a VALUE variable, with amounts possible for each month.

I would like to determine mean, median, etc. for a yearly (annual) amount, summing each person's (observation) for 2 months of values.

What's the easiest way for a noob to do this?

Thanks in advance...

1 ACCEPTED SOLUTION


@mikeed wrote:

For each ID, I'd like to add up all the VALUES, from MONTH 1 to MONTH 12, for an annual VALUE.

Then I would like to find summary statistics of annual VALUE for each ID.

Perhaps this explains it better and it can't be accomplished in a single procedure, and that's my problem?

Yes, this is a two-step problem, so you can simply run the procedure twice.

The first time you're summing to get the totals, and the second time you're generating summary statistics of the total value.

You control which statistics you get through the statistics you specify in the PROC MEANS/SUMMARY statements.

You've been provided with multiple samples of how to run it once, so you should be able to expand that to two steps.

But regardless, here's one way:
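
The code that followed this post is not preserved here. As a minimal sketch of the two-step approach described above (not necessarily the poster's original code), the first PROC MEANS sums VALUE per ID and the second summarizes those totals across IDs. The data set names `have` and `annual`, the variable name `annual_value`, and the toy data are assumptions for illustration.

```sas
/* Hypothetical toy data: two IDs, three months each (values are made up) */
data have;
    input id month value;
    datalines;
1 1 10
1 2 20
1 3 30
2 1 5
2 2 15
2 3 25
;
run;

/* Step 1: one row per ID holding the sum of VALUE over all months.
   NWAY keeps only the per-ID rows and drops the overall total row. */
proc means data=have noprint nway;
    class id;
    var value;
    output out=annual (drop=_type_ _freq_) sum=annual_value;
run;

/* Step 2: summary statistics of the annual totals across all IDs */
proc means data=annual n mean median min max;
    var annual_value;
run;
```

With this toy data the per-ID totals are 60 and 45, so step 2 should report a mean and median of 52.5. The statistic keywords on the second PROC MEANS statement can be swapped for whichever ones are needed.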

20 REPLIES


I would like to determine mean, median, etc. for a yearly (annual) amount,

summing each person's (observation) for 2 months of values.

I'm afraid I'm not able to comprehend this part of the request. The top line makes perfect sense, but not in combination with the second line, which is rather cryptic.

Please show us a small amount of this data, and explain what results you'd like from this small amount of data.

--

Paige Miller


Sorry, typo: all 12 months.

Each person has a unique ID and a MONTHCODE variable from 1-12.

I want to sum the VALUEs for MONTHCODE 1-12 for each person, then determine the mean/median annual VALUE.


I tend to use PROC SUMMARY for this. UNIVARIATE would also work, but the code would be different.

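```
/* UNTESTED CODE */
/* Mean and median of VALUE within each level of ID */
proc summary data=have;
    class id;
    var value;
    /* Write the requested statistics to data set WANT */
    output out=want mean=meanvalue median=medianvalue;
run;
```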

--

Paige Miller


thanks,

would you be able to provide an example of the proc univariate code I would use?


As I haven't used UNIVARIATE in years, my answer is that I can't, off the top of my head, provide UNIVARIATE code. It probably isn't much different; however, you can read the documentation for PROC UNIVARIATE and see if you can figure it out.

The other problem is that UNIVARIATE computes a large number of statistics that you haven't requested, and depending on how much data you have, this could slow things down dramatically and produce very long output.

--

Paige Miller
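
For readers looking for the UNIVARIATE form requested above, here is a minimal sketch that mirrors the PROC SUMMARY code earlier in the thread. It assumes the same data set `have` with variables `id` and `value`; the output data set name `want_univ` is hypothetical. NOPRINT suppresses the printed tables, so only the two requested statistics are written out.

```sas
/* Assumes data set HAVE with variables ID and VALUE, as in the
   PROC SUMMARY code earlier in the thread; WANT_UNIV is a made-up name */
proc univariate data=have noprint;
    class id;      /* one output row per ID */
    var value;
    output out=want_univ mean=meanvalue median=medianvalue;
run;
```

Like the PROC SUMMARY version, this summarizes the monthly VALUEs within each ID. To summarize annual totals instead, the same step could be pointed at a data set of per-ID sums.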


PROC MEANS, SUMMARY, and UNIVARIATE use similar processes, so any of them should work for your request.

Why is PROC UNIVARIATE 'required'?

Here's a fully worked example of getting summary statistics using PROC MEANS:

https://github.com/statgeek/SAS-Tutorials/blob/master/proc_means_basic.sas


I don't need to use Univariate, but I'm still having difficulty trying to figure out

how to determine the annual values of the summary statistics

Help still requested.


@mikeed wrote:

I don't need to use Univariate, but I'm still having difficulty trying to figure out

how to determine the annual values of the summary statistics

Help still requested.

If the code I gave is not working properly for you, then please explain what is happening that is wrong, and show us the SASLOG and results. Otherwise, I assume the problem has been solved.

--

Paige Miller


ID MONTH VALUE

1 1 65

1 2 17

. . .

1 11 47

1 12 99

2 1 55

2 2 98

. . .

2 11 45

2 12 18

3

...


@mikeed wrote:

ID MONTH VALUE

1 1 65

1 2 17

. . .

1 11 47

1 12 99

2 1 55

2 2 98

. . .

2 11 45

2 12 18

3

...

NOW provide what the output is supposed to look like for that input.


For each ID, I'd like to add up all the VALUES, from MONTH 1 to MONTH 12, for an annual VALUE.

Then I would like to find summary statistics of annual VALUE for each ID.

Perhaps this explains it better and it can't be accomplished in a single procedure, and that's my problem?


@mikeed wrote:

For each ID, I'd like to add up all the VALUES, from MONTH 1 to MONTH 12, for an annual VALUE.

Then I would like to find summary statistics of annual VALUE for each ID.

Perhaps this explains it better and it can't be accomplished in a single procedure, and that's my problem?

I provided code to do this in a single procedure already in the thread. Why do you discuss this as if there is no such code?

--

Paige Miller


I'm sorry, your code was not detailed enough and did not work, since it did not help me tally the MONTHs that I needed to find the statistics for.


You can't just say "it didn't work". You have to give us details. You have to show us the SASLOG and the data set created, and explain why this is not the proper result.

--

Paige Miller
