
How do I code categorical/multi-nomial data in GLIMMIX?

Regular Learner
Posts: 1

Hello all,

 

I'm trying to analyze plant measurement data (e.g. % cover, health rating 0-5) in PROC GLIMMIX, but I'm unsure how to go about it. I've had no issues with my soils data. Here is the code for my soils data:

 

data soildata;
  set soildata;
  /* drop the US and LS horizons */
  if horizon in ("US", "LS") then delete;
run;

proc print data=soildata;
run;

proc glimmix data=soildata;
  title "TS EC";
  class rxn; /* rxn is my treatment prescription */
  model ec=rxn / ddfm=satterthwaite;
  lsmeans rxn / adjust=tukey plot=mean lines;
run;

What do I need to add to deal with non-continuous values? I know I need to include dist=multinomial, but is there anything else? I suspect I also need a link function, but I don't understand what those are or what they do. Any help here would be great.

Frequent Contributor
Posts: 98

Re: How do I code categorical/multi-nomial data in GLIMMIX?


JDL523 wrote:

 

What do I need to add to deal with non-continuous values? I know I need to include dist=multinomial, but is there anything else? I suspect I also need a link function, but I don't understand what those are or what they do. Any help here would be great.


Based on this, it sounds like you need to spend some time learning the theory behind generalized linear models before anyone can give you meaningful advice. Look through the "Details" sections of the PROC GLIMMIX and PROC GENMOD documentation as a starting point, and from there follow the citations (or start Googling) for any elements of those descriptions you don't understand.

 

From a practical perspective, you don't necessarily need to specify a link function as long as you specify a distribution. As you can see in the documentation, each distribution has a default link function that is applied when that distribution is used. In most applications, the default link will be the appropriate choice (assuming the distribution is correctly specified, naturally). However, unless you understand the purpose of the link function and its role in a GLM framework, I wouldn't recommend trying to fit the model at all, because it will be difficult for you to interpret the results properly.
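
To make the point about default links concrete, here is a minimal sketch and not a recommended analysis. It assumes a hypothetical data set plantdata with an ordinal response health (the 0-5 rating) and the same rxn treatment variable; neither name comes from your post. For dist=multinomial, GLIMMIX uses the cumulative logit link by default, so the link= option below only spells out what you would get anyway.

```sas
/* Sketch only: plantdata and health are assumed names, not from the original post */
proc glimmix data=plantdata;
  title "Health rating (ordinal, 0-5)";
  class rxn;                      /* rxn is the treatment prescription */
  /* Cumulative logit: models the log odds of the cumulative      */
  /* probabilities of the ordered rating categories as a linear   */
  /* function of rxn, with a separate intercept for each cut      */
  /* point between categories.                                    */
  model health = rxn / dist=multinomial
                       link=cumlogit  /* default for multinomial; shown for clarity */
                       solution;      /* print the parameter estimates */
run;
```

The estimates come out on the cumulative logit scale, not on the scale of the rating itself, which is why the interpretation depends on understanding the link function first.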
