Computing the weights of categorical variables

11-19-2013 12:54 PM

Hi,

I have a continuous dependent variable and 3 independent categorical variables (for instance: day in week (1-7), day in month (1-31), and month (1-12)),

and I want to know, based on historical data, what weight to give to each level of each category so that I can predict the continuous dependent variable from the date.

The weight needs to be a percentage.

I ran PROC GLM on the three variables, and each of them was significant.

How do I determine what percentage to give to each level of each variable? What test or procedure will give me those percentages?

Thanks in advance,

Liat

11-20-2013 10:15 AM

Hi,

A generalized additive model is another option. PROC GAM provides smoothing and other modeling flexibility.

11-20-2013 10:25 AM

Thanks for your answer, but I don't understand. Can you be more specific?

Here is some example data:

volume, day_in_week, day_in_month, month

1200,1,24,11

801,4,31,7

600,7,5,2

When I use PROC GLM, it gives me beta estimates for the dummy variables, so they are already expressed in units of volume.

My fitted line looks like this, where X1 = X2 = X3 = 1:

Y = intercept + b1*X1 + b2*X2 + b3*X3

So I cannot use the beta estimates as percentages.

What procedure will output the betas as percentages rather than as amounts of volume?

Thanks
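
One way to express level effects as percentages rather than in volume units is a ratio-to-mean index: the mean volume at each level divided by the overall mean, times 100. The Python sketch below illustrates that idea only. It is not what PROC GLM reports, it treats each variable on its own (ignoring the other two), and the helper name `level_index` and the last three data rows are made up for illustration.

```python
from collections import defaultdict


def level_index(rows, factor):
    """Return {level: mean volume at that level as a percent of the overall mean}."""
    overall_mean = sum(r["volume"] for r in rows) / len(rows)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r in rows:
        sums[r[factor]] += r["volume"]
        counts[r[factor]] += 1
    return {
        level: 100.0 * (sums[level] / counts[level]) / overall_mean
        for level in sorted(sums)
    }


# First three rows come from the post; the last three are hypothetical.
rows = [
    {"volume": 1200, "day_in_week": 1, "day_in_month": 24, "month": 11},
    {"volume": 801, "day_in_week": 4, "day_in_month": 31, "month": 7},
    {"volume": 600, "day_in_week": 7, "day_in_month": 5, "month": 2},
    {"volume": 1000, "day_in_week": 1, "day_in_month": 17, "month": 11},
    {"volume": 799, "day_in_week": 4, "day_in_month": 10, "month": 7},
    {"volume": 400, "day_in_week": 7, "day_in_month": 12, "month": 2},
]

weekday_index = level_index(rows, "day_in_week")
print(weekday_index)
```

In this toy data an index of 137.5 for day 1 would read as "volume on that weekday runs about 37.5% above the overall average". The same function can be called with `"day_in_month"` or `"month"`.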

11-20-2013 03:11 PM

PROC FREQ will certainly do something to give percentages, although they won't be beta-hats, or weights.
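
As a rough illustration of the kind of percentage meant here, the Python sketch below computes each level's share of total volume. This is an assumption about the intended use: the reply does not say how PROC FREQ would be called, though a WEIGHT statement on volume would give comparable volume-weighted percentages. The function name `percent_of_total` is hypothetical, and the rows are the three from the earlier post.

```python
from collections import Counter


def percent_of_total(rows, factor):
    """Return {level: percent of total volume falling in that level}."""
    totals = Counter()
    for r in rows:
        totals[r[factor]] += r["volume"]
    grand_total = sum(totals.values())
    return {level: 100.0 * totals[level] / grand_total for level in sorted(totals)}


# The three example rows given earlier in the thread.
rows = [
    {"volume": 1200, "day_in_week": 1, "day_in_month": 24, "month": 11},
    {"volume": 801, "day_in_week": 4, "day_in_month": 31, "month": 7},
    {"volume": 600, "day_in_week": 7, "day_in_month": 5, "month": 2},
]

shares = percent_of_total(rows, "day_in_week")
for level, pct in shares.items():
    print(f"day_in_week={level}: {pct:.1f}% of total volume")
```

These shares describe how past volume was distributed across levels. They are descriptive summaries, not regression coefficients.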

I don't know what good this will do, though. What is the dependent variable? Is it volume? Do you wish to predict volume, given a day-month-year? What are you going to do about trends/seasonality/autocorrelation? I think a time series analysis might be of far more utility.

More information on the proposed analysis would be helpful.

Steve Denham

11-20-2013 03:57 PM

Hi Steven, here are the answers to your questions:

What is the dependent variable? Volume

Is it volume? Yes

Do you wish to predict volume, given a day-month-year? Yes, exactly.

What are you going to do about trends/seasonality/autocorrelation? I was thinking of adding indicator variables for fast days and holidays, which should take care of the seasonality; the model will handle the rest.

11-20-2013 04:18 PM

More clarification: I wish to predict volume given a day-month-year plus day in week (Sunday, Monday, ...) and a holiday indicator.

I am looking to find out what weight each component contributes (all are categorical variables).

The volume is not necessarily spread uniformly (linearly) through the month. For example, at the beginning of the month I might get a misleading picture of what the total will look like at the end of the month. So I'm looking for some beta-hats, or weights, for the day, month, and day_in_week.
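
The month-end concern can be sketched in Python as follows: if each day's expected share of monthly volume were known, month-to-date volume could be scaled by the cumulative share instead of by elapsed days. Everything here is hypothetical, including the function name `project_month_total`, the 4-day "month", the share values, and the month-to-date figure; it only shows why uniform extrapolation can mislead.

```python
from itertools import accumulate


def project_month_total(month_to_date, day, daily_share_pct):
    """Scale month-to-date volume by the cumulative share expected through `day`."""
    cumulative = list(accumulate(daily_share_pct))
    return month_to_date * 100.0 / cumulative[day - 1]


# Hypothetical 4-day "month" in which volume is back-loaded (shares in percent).
daily_share_pct = [10.0, 20.0, 30.0, 40.0]

month_to_date = 300.0  # made-up volume observed through day 2

# Weighted projection uses the 30% cumulative share expected by day 2.
weighted = project_month_total(month_to_date, 2, daily_share_pct)

# Naive projection assumes volume is spread uniformly over the 4 days.
naive = month_to_date * len(daily_share_pct) / 2

print(weighted, naive)
```

With these made-up shares, the weighted projection is 1000 while the uniform one is 600, which is the "bad picture" early in the month described above.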

11-21-2013 09:42 AM

This is starting to sound like a combination time-series/data mining problem. I'm going to have to defer to others for an approach. You may wish to post a similar question in the SAS Forecasting and Econometrics forum, where you may get eyes on the problem from those who deal with this sort of data more regularly.

Steve Denham