Programming the statistical procedures from SAS

N-way Anova duration

Occasional Contributor
Posts: 12

Hello,

 

I am trying to use N-way ANOVA to check how my response variable depends on a set of variables. The problem is that the execution has been running for around an hour and still has not finished. The data structure is shown below. Should it really take this long to process the data?

 

Bounces  Exits  Continent  Sourcegroup                 Timeinpage  Uniquepageviews  Visits  BouncesNew
0        0      OC         (direct)                    18          1                0       0
0        0      N.America  (direct)                    4           1                0       0
0        0      N.America  Others                      35          1                0       0
0        0      N.America  public.tableausoftware.com  70          1                0       0
0        0      N.America  public.tableausoftware.com  81          1                0       0
0        0      N.America  public.tableausoftware.com  75          1                0       0
0        0      N.America  public.tableausoftware.com  186         1                0       0
0        0      N.America  (direct)                    710         1                0       0
0        0      OC         (direct)                    712         1                1       0
0        0      AS         Others                      344         1                1       0
0        0      EU         Others                      27          1                1       0
0        0      EU         visualisingdata.com         0           1                1       0
0        0      N.America  Others                      294         1                1       0
0        0      N.America  public.tableausoftware.com  111         1                1       0
0        0      SA         (direct)                    1430        1                1       0
0        0      N.America  (direct)                    29          1                1       0
0        0      N.America  Others                      637         1                1       0

 

I have included only an excerpt here; in total the CSV file has around 32,000 rows.

 

My dependent variable is Exits, and I have added all the remaining variables as categorical variables.

 

Regards,

 

Aditya

All Replies
Grand Advisor
Posts: 16,930

Re: N-way Anova duration

No. I find that SAS Studio will sometimes hang on an error rather than show it.

 

Check your selections. Also, how many unique combinations do you have compared to your N? 

 

I.e., 3 levels x 2 levels x ... x 2 levels = # of combinations
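
As a rough sketch of that count, the Python below multiplies the number of distinct levels of each factor and compares the product with the number of observations. The `count_cells` helper and the sample rows are assumptions for illustration only; they mimic the shape of the posted data and are not the poster's actual file.

```python
import csv
import io
from math import prod

# Hypothetical sample in the shape of the posted CSV (values are illustrative).
SAMPLE = """Continent,Sourcegroup,Timeinpage,Visits
OC,(direct),18,0
N.America,(direct),4,0
N.America,Others,35,0
N.America,public.tableausoftware.com,70,0
EU,Others,27,1
EU,visualisingdata.com,0,1
"""

def count_cells(rows, factors):
    """Return (levels per factor, product of the level counts, number of rows)."""
    levels = {f: len({row[f] for row in rows}) for f in factors}
    return levels, prod(levels.values()), len(rows)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
factors = ["Continent", "Sourcegroup", "Timeinpage", "Visits"]
levels, cells, n = count_cells(rows, factors)

print(levels)
print(f"{cells} possible combinations vs. {n} observations")
```

Note that a numeric column such as Timeinpage, when treated as categorical, contributes one level per distinct value, so it may dominate the product.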

Occasional Contributor
Posts: 12

Re: N-way Anova duration

Hello,

 

To fix the problem I ran a correlation analysis and found that one of the variables had a very weak relationship (0.00132) with the dependent variable. When I removed it from the N-way ANOVA, I got the result in 5 minutes.

 

Is this the right approach?

 

Regards,

 

Aditya

Grand Advisor
Posts: 16,930

Re: N-way Anova duration

Why not run individual ANOVAs first and reduce the # of variables?
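
A minimal sketch of that per-variable screening, written in Python rather than SAS: it computes the one-way ANOVA F statistic for the response against each factor separately. The `one_way_f` helper, the factor names and the toy numbers are all hypothetical. The sketch stops at the F statistic; a p-value would additionally need the F distribution, which a SAS ANOVA procedure reports for you.

```python
from collections import defaultdict

def one_way_f(values, groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA.

    Assumes at least two groups and some within-group variation;
    otherwise the ratio below is undefined.
    """
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)

    n, k = len(values), len(by_group)
    grand_mean = sum(values) / n
    means = {g: sum(vs) / len(vs) for g, vs in by_group.items()}

    # Between-group and within-group sums of squares.
    ss_between = sum(len(vs) * (means[g] - grand_mean) ** 2
                     for g, vs in by_group.items())
    ss_within = sum((v - means[g]) ** 2
                    for g, vs in by_group.items() for v in vs)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Toy data (hypothetical): one response and two candidate factors.
response = [1, 2, 3, 2, 3, 4, 5, 6, 7]
candidates = {
    "factor_a": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "factor_b": ["x", "y", "z", "x", "y", "z", "x", "y", "z"],
}

results = {name: one_way_f(response, groups) for name, groups in candidates.items()}
for name, (f_stat, df1, df2) in results.items():
    print(f"{name}: F({df1}, {df2}) = {f_stat:.3f}")
```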

Occasional Contributor
Posts: 12

Re: N-way Anova duration

Hello,

 

Are you suggesting that I run individual ANOVAs and omit from the final model the variables whose p-values are not significant?

 

Regards,

 

Aditya

Solution
‎09-28-2016 08:36 AM
Grand Advisor
Posts: 16,930

Re: N-way Anova duration

[ Edited ]

Yes, with the caution that this generally isn't recommended. But it sounds like you have more categories than observations, so you're running into dimensionality problems. You can also look at clustering techniques to reduce your input variables. Perhaps PROC VARCLUS, except I'm not sure how that will work when you have a lot of categorical data. Categorical data analysis is a weakness of mine :(

Occasional Contributor
Posts: 12

Re: N-way Anova duration

Thanks

☑ This topic is SOLVED.
