Addu
Calcite | Level 5

Hello!

 

I'm running a logistic regression on the occurrence of a fungal disease of oat under four soil treatments, with four replicates, using PROC GENMOD. The four soil treatments are "no till", "spring cultivation", "fall cultivation" and "till". Fgram is the response variable: the number of grains with the fungal disease out of the 100 grains of yield sampled per plot (Ntot).

 

For some reason the degrees of freedom for replicate equal zero and I get no results for it.

 

I have similar other datasets and the analysis goes just fine.

 

What could be the problem? Can I still use the probabilities from the treatments?

 

My dataset:

 

treatment    plot  replicate  fgram  Ntot
till         101   2          14     100
fall cult    102   4          19     100
no till      103   1          5      100
spring cult  104   3          8      100
fall cult    201   4          13     100
till         202   2          6      100
spring cult  203   3          9      100
no till      204   1          9      100
spring cult  301   3          8      100
no till      302   1          6      100
till         303   2          15     100
fall cult    304   4          8      100
spring cult  401   3          2      100
no till      402   1          10     100
till         403   2          8      100
fall cult    404   4          8      100

 

My code:

proc GENMOD data=WORK.tillageyield;
Title "Logistic regression analysis for F. graminearum occurrence in yield, treatments are compared to till";
	Class treatment (ref="till") replicate/param=ref;
	model fgram/Ntot = treatment replicate/
	dist=binomial
	link=logit
	waldci;
run;

 

PaigeMiller
Diamond | Level 26

Treatment and replicate are perfectly correlated. Replicate adds no new information, and so you cannot estimate its effect (or said another way, you should get 0 degrees of freedom for replicate).

--
Paige Miller
Addu
Calcite | Level 5

Ok. I don't understand how this experiment is different to my other ones. In my other analyses the replicates get 1 degree of freedom.

 

Here's an example.

 

Effect of two fungicide treatments and a control treatment on Fusarium occurrence in cereal yield samples. Ignore the sample and plot columns.

 

Sample  fungicide  Cultivar  Replicate  Plot  Fgram  Ntot
yield   control    Viviana   1          111   10     100
yield   control    Viviana   2          112   8      100
yield   control    Viviana   3          113   9      100
yield   control    Viviana   4          114   15     100
yield   control    Marika    1          121   16     100
yield   control    Marika    2          122   7      100
yield   control    Marika    3          123   8      100
yield   control    Marika    4          124   12     100
yield   control    Peppi     1          131   7      100
yield   control    Peppi     2          132   12     100
yield   control    Peppi     3          133   8      100
yield   control    Peppi     4          134   7      100
yield   control    Voitto    1          141   29     100
yield   control    Voitto    2          142   42     100
yield   control    Voitto    3          143   32     100
yield   control    Voitto    4          144   34     100
yield   control    Anniina   1          151   17     100
yield   control    Anniina   2          152   47     100
yield   control    Anniina   3          153   30     100
yield   control    Anniina   4          154   23     100
yield   Delaro     Viviana   1          211   4      100
yield   Delaro     Viviana   2          212   5      100
yield   Delaro     Viviana   3          213   2      100
yield   Delaro     Viviana   4          214   4      100
yield   Delaro     Marika    1          221   9      100
yield   Delaro     Marika    2          222   14     100
yield   Delaro     Marika    3          223   10     100
yield   Delaro     Marika    4          224   10     100
yield   Delaro     Peppi     1          231   10     100
yield   Delaro     Peppi     2          232   5      100
yield   Delaro     Peppi     3          233   2      100
yield   Delaro     Peppi     4          234   9      100
yield   Delaro     Voitto    1          241   14     100
yield   Delaro     Voitto    2          242   14     100
yield   Delaro     Voitto    3          243   14     100
yield   Delaro     Voitto    4          244   15     100
yield   Delaro     Anniina   1          251   16     100
yield   Delaro     Anniina   2          252   8      100
yield   Delaro     Anniina   3          253   9      100
yield   Delaro     Anniina   4          254   2      100
yield   Proline    Viviana   1          311   3      100
yield   Proline    Viviana   2          312   1      100
yield   Proline    Viviana   3          313   7      100
yield   Proline    Viviana   4          314   7      100
yield   Proline    Marika    1          321   8      100
yield   Proline    Marika    2          322   9      100
yield   Proline    Marika    3          323   8      100
yield   Proline    Marika    4          324   12     100
yield   Proline    Peppi     1          331   4      100
yield   Proline    Peppi     2          332   12     100
yield   Proline    Peppi     3          333   11     100
yield   Proline    Peppi     4          334   11     100
yield   Proline    Voitto    1          341   15     100
yield   Proline    Voitto    2          342   15     100
yield   Proline    Voitto    3          343   18     100
yield   Proline    Voitto    4          344   11     100
yield   Proline    Anniina   1          351   12     100
yield   Proline    Anniina   2          352   6      100
yield   Proline    Anniina   3          353   11     100
yield   Proline    Anniina   4          354   2      100

 

 

Proc GENMOD data=WORK.fungicide; 

Title "Logistic regression analysis for Fusarium occurrence in fungicide treated cereal yield samples, treatments compared to control, cultivars compared to Anniina";
	Class Fungicide (ref="control") Cultivar (ref="Anniina") Replicate/param=ref;
	model fgram/Ntot = Fungicide Cultivar Replicate Fungicide*Cultivar/
	dist=binomial
	link=logit
	waldci;
run;

Results table attached. Notice the 1 df in the replicates.

 

I did run this by my teacher and he checked it ok. He was a SAS genius. He sadly is no longer with us, that's why I'm asking here.

 

-Addu

SteveDenham
Jade | Level 19

Sort your data by replicate.  You will see that for replicate=1, you only have treatment='no till'.  Compare that to the fungicide dataset.  When sorted by replicate, each combination of fungicide and cultivar appears.  This would be expected for a randomized block design.
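A quick way to see this without sorting by hand is a cross-tabulation. The sketch below assumes the dataset and variable names used in the original posts (WORK.tillageyield, WORK.fungicide) and only counts plots per cell; it fits no model.

```sas
/* Tillage data: count plots in each treatment-by-replicate cell.
   With the data as posted, every treatment falls in a single
   replicate column and the other cells are empty. */
proc freq data=WORK.tillageyield;
	tables treatment*replicate / norow nocol nopercent;
run;

/* Fungicide data: the same check for comparison.
   Here every fungicide-by-replicate cell is filled. */
proc freq data=WORK.fungicide;
	tables Fungicide*Replicate / norow nocol nopercent;
run;
```

If one factor's levels each line up with exactly one level of the other, as in the tillage table, the two effects cannot be separated and the second one entered in the model gets zero degrees of freedom.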

 

I want to point out two things to consider.  The test in the solution table is for the individual coefficient=0, not for the effect of cultivar, fungicide or replicate.  Change up the genmod code to add type3 as an option in the MODEL statement to get these more global tests.
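As a minimal sketch of that change, here is the fungicide GENMOD call from the earlier post with TYPE3 added to the MODEL statement. Everything else is copied from that post, and the dataset name WORK.fungicide is assumed to be unchanged.

```sas
proc genmod data=WORK.fungicide;
	class Fungicide (ref="control") Cultivar (ref="Anniina") Replicate / param=ref;
	model fgram/Ntot = Fungicide Cultivar Replicate Fungicide*Cultivar /
		dist=binomial
		link=logit
		type3      /* adds one test per model effect, not per coefficient */
		waldci;
run;
```

By default GENMOD reports likelihood ratio statistics for the Type 3 analysis; adding the WALD option to the MODEL statement requests Wald statistics instead.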

 

Second, if this is indeed a randomized block design, and you want to infer to a broader space than just the replicates in the study, you should change over to a mixed model approach with replicate as a random effect.  For the fungicide data (which is currently the only one that can be analyzed properly) the code would look like:

 

Title "Logistic regression analysis for Fusarium occurrence in fungicide treated cereal yield samples, treatments compared to control, cultivars compared to Anniina";
proc glimmix data=the_data_above;
	class Fungicide Cultivar Replicate;
	/* CL requests confidence limits in GLIMMIX (WALDCI is a GENMOD option) */
	model fgram/Ntot = Fungicide Cultivar Fungicide*Cultivar /
		solution
		cl;
	/* replicate (block) enters as a random intercept */
	random intercept / subject=replicate;
	lsmeans fungicide cultivar fungicide*cultivar / cl;
run;

Now the comparisons of interest should depend on the results of the F tests.  If the interaction is significant, you would need to look at the simple effect using either the SLICE statement, or the slicediff option in the lsmeans statement.  If it is not significant, using the diff=control option for the main effect lsmeans will yield the tests in the title.  These would look like:

 

lsmeans fungicide*cultivar/slicediff=(fungicide cultivar) slicedifftype=control ('control' 'Anniina') ilink;

lsmeans fungicide cultivar/diff=control ('control' 'Anniina') ilink;

Edited to add ilink to the lsmeans to get the results back on the original scale.  Also, recall the differences will not transform back to differences in probabilities using this method.  For that you will need to invoke the NLMeans macro.

 

SteveDenham

 

 

Addu
Calcite | Level 5

Thank you very much for your input on the fungicide data! I'll try out your suggested modifications.

 

-Addu

PaigeMiller
Diamond | Level 26

@Addu wrote:

Ok. I don't understand how this experiment is different to my other ones. In my other analyses the replicates get 1 degree of freedom.

 


Did you actually look at the data in this experiment? Looking at the data is a highly recommended debugging technique. As I said, treatment is completely confounded with replicate. Replicate adds no additional information.

--
Paige Miller
Addu
Calcite | Level 5

Thank you for your help!

 

I had mistaken the block markings for replicate numbers. Embarrassing, but someone else had to point it out. Datasheet blindness.
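The thread does not say which column holds the true replicate, so the following is only a sketch under a loud assumption: that the hundreds digit of the plot number identifies the field replicate. In the posted data each group of plots (101-104, 201-204, 301-304, 401-404) contains every treatment exactly once, which is what a randomized block layout would look like. The names tillage_fixed and block are hypothetical, and plot is assumed to be a numeric variable.

```sas
/* ASSUMPTION: hundreds digit of plot = field replicate (not confirmed in the thread) */
data WORK.tillage_fixed;
	set WORK.tillageyield;
	block = floor(plot / 100);
run;

/* Check the layout first: each treatment-by-block cell should hold one plot */
proc freq data=WORK.tillage_fixed;
	tables treatment*block / norow nocol nopercent;
run;

/* Refit the original model with the derived block in place of replicate */
proc genmod data=WORK.tillage_fixed;
	class treatment (ref="till") block / param=ref;
	model fgram/Ntot = treatment block /
		dist=binomial
		link=logit
		type3
		waldci;
run;
```

If the field notes give a different replicate assignment, substitute that variable for block; the cross-tabulation step is the part worth keeping either way.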

 

As you said a good debug is looking at the data - which I did, many times! Another good one is to ask someone else to also take a look.

 

-Addu

 

 

PaigeMiller
Diamond | Level 26

@Addu wrote:

 

As you said a good debug is looking at the data - which I did, many times! Another good one is to ask someone else to also take a look.


True, true!

--
Paige Miller
