Multivariate regression by category or subgroup

Posted 11-18-2020 05:59 AM

I’m using a multivariate regression to model the total Volume as a function of two variables, Units and Price.

However, I’m not sure whether to model the total volume as a whole or by brand subgroups. How would I go about finding this out?

I have performed cluster analysis for the category as a whole and by brand, but I am unsure how to interpret the results in order to answer this question.

1 ACCEPTED SOLUTION

I think you are asking about a classical ANOVA model, which tests whether the mean Volume differs according to brand. All you need to do is use the CLASS statement to specify the brand variable, then include that variable in your model. For example, if you are using PROC GLM, the code looks like this:

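```
proc glm data=Have plots=all;
   class Brand;                       /* treat Brand as a categorical effect */
   model Volume = Units Price Brand;  /* Brand main effect: one intercept shift per brand */
run;
quit;
```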
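
One possible way to check whether the brands need separate models, offered here as a sketch and not as part of the reply above, is to add the `Units*Brand` and `Price*Brand` interactions. If the Type III tests for those interactions are not significant, a single model with common slopes may be adequate; if they are significant, the effect of Units or Price differs by brand. The data set `Have` and the variable names are taken from the example above.

```
proc glm data=Have;
   class Brand;
   /* The interaction terms let the Units and Price slopes vary by brand */
   model Volume = Units Price Brand Units*Brand Price*Brand / solution;
run;
quit;
```

The `solution` option prints the parameter estimates, so the brand-specific slope differences can be read directly from the output.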

5 REPLIES

Fabulous, thank you. And if they do differ, would I then need to model my regression by subgroup, that is, by brand?

I have observed that two of my groups behave the same but one behaves differently, so I decided to use brand as a subgroup.

However, having done this, I’ve been told my R-square value should be close to 1, but it is as low as 0.3 in certain cases. Does this matter?
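
Fitting the regression separately for each brand can be sketched with BY-group processing. This is a minimal sketch: it assumes the data set is called `Have`, as in the accepted solution, and `HaveSorted` is a hypothetical name for the sorted copy.

```
/* BY-group processing requires the data to be sorted by the BY variable */
proc sort data=Have out=HaveSorted;
   by Brand;
run;

/* One regression per brand, each with its own R-square and Root MSE */
proc reg data=HaveSorted;
   by Brand;
   model Volume = Units Price;
run;
quit;
```

Each brand gets its own fit statistics table, which is where R-square values such as 0.3 for an individual brand would appear.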

In an ideal world, R-squared should be close to 1. But different data sets have different amounts of noise, so an R-squared of 0.3 may be the proper value for this data. The real question, in my mind, is to look at the root mean square error reported by SAS and decide whether it is an acceptable level of variation. If, for example, you have some idea of the measurement variability or sample-to-sample variability, and the root mean square error is somewhat close to it, then I'd say that's fine. Or, if the confidence intervals around your predictions or around your parameter estimates are usable, then that's fine as well. All of this is context and problem dependent; there are no rules of thumb. Every data set is different, every application is different, every use is different.

--

Paige Miller
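
The quantities mentioned in this reply can be requested roughly as follows. This is a sketch under assumptions, not the poster's own code: `Have` is the data set name from the accepted solution and `HaveSorted` is a hypothetical sorted copy. PROC REG reports Root MSE in its fit statistics by default; the `clb` option adds confidence limits for the parameter estimates, `clm` adds confidence limits for the predicted mean, and `cli` adds prediction limits for individual observations.

```
proc sort data=Have out=HaveSorted;
   by Brand;
run;

proc reg data=HaveSorted;
   by Brand;
   /* clb: limits for estimates; clm: limits for the mean; cli: limits for individual predictions */
   model Volume = Units Price / clb clm cli;
run;
quit;
```

Whether the resulting Root MSE and interval widths are acceptable remains a judgment about the application, as the reply says.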
