Hi there,
I am trying to reconcile different parts of the output from PROC FMM with a PROBMODEL statement. Looking specifically at Table 43.12 in the example from the SAS documentation here, I understand that the mixing probabilities table shows estimates on the logit scale, which can then be converted to probabilities of belonging to component 1 (the mu-hats shown in the table). However, if I include an OUTPUT statement and look at PRED_1 and PRED_2 for the covariate values shown in the table, they don't seem to align. I may be missing something, but I couldn't find documentation on PRED_1 and PRED_2 that would explain the difference, so I appreciate any suggestions!
Part two of this question is whether there is a way to obtain the odds ratios for the covariates in the mixture model, in the case that there are two components. I can back into it once the above is addressed and I am sure I am looking at the right values for the probabilities, but was curious if there is a more direct way to get this output from PROC FMM. Thanks again.
data ossi;
   length tx $8;
   input tx$ n @@;
   do i=1 to n;
      input y m @@;
      output;
   end;
   drop i;
   datalines;
Control 18 8 8 9 9 7 9 0 5 3 3 5 8 9 10 5 8 5 8 1 6 0 5 8 8 9 10 5 5 4 7 9 10 6 6 3 5
Control 17 8 9 7 10 10 10 1 6 6 6 1 9 8 9 6 7 5 5 7 9 2 5 5 6 2 8 1 8 0 2 7 8 5 7
PHT 19 1 9 4 9 3 7 4 7 0 7 0 4 1 8 1 7 2 7 2 8 1 7 0 2 3 10 3 7 2 7 0 8 0 8 1 10 1 1
TCPO 16 0 5 7 10 4 4 8 11 6 10 6 9 3 4 2 8 0 6 0 9 3 6 2 9 7 9 1 10 8 8 6 9
PHT+TCPO 11 2 2 0 7 1 8 7 8 0 10 0 4 0 6 0 7 6 6 1 6 1 7
;

data ossi;
   set ossi;
   array xx{3} x1-x3;
   do i=1 to 3;
      xx{i}=0;
   end;
   pht = 0;
   tcpo = 0;
   if (tx='TCPO') then do;
      xx{1} = 1;
      tcpo = 100;
   end;
   else if (tx='PHT') then do;
      xx{2} = 1;
      pht = 60;
   end;
   else if (tx='PHT+TCPO') then do;
      pht = 60;
      tcpo = 100;
      xx{1} = 1;
      xx{2} = 1;
      xx{3} = 1;
   end;
run;

proc fmm data=ossi;
   class pht tcpo;
   model y/m = / dist=binomcluster;
   probmodel pht tcpo pht*tcpo;
   output out = chk(keep = pht tcpo pred_:) pred(components);
run;

proc sort data = chk nodupkey;
   by pht tcpo;
run;

proc print data = chk;
run;
I think you are on the right track with that documentation link, and I would recommend re-reading that example very closely. To summarize the documentation: the binomial cluster model you are fitting is a two-component mixture. The first component is binomial with 'n' trials and success probability 'mu_star + mu', the second component is binomial with 'n' trials and success probability 'mu_star', and the mixing probabilities are pi and (1 - pi). Furthermore, mu_star = (1 - mu)*pi, where pi is the same mixing probability.
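Written out, the paragraph above corresponds to the following mixture. This is only my restatement of that description, with y as the number of successes out of n trials; the symbols follow the names used above rather than anything taken from the PROC FMM output itself.

```latex
\begin{aligned}
P(Y = y) &= \pi \binom{n}{y} (\mu^{*} + \mu)^{y} (1 - \mu^{*} - \mu)^{n - y}
          + (1 - \pi) \binom{n}{y} (\mu^{*})^{y} (1 - \mu^{*})^{n - y}, \\
\mu^{*} &= (1 - \mu)\,\pi .
\end{aligned}
```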
They show in the linked example that the estimate for the mu parameter, 'mu_hat', is computed as the inverse link of the intercept parameter in the model for mu (specified via the MODEL statement); with the logit link, that is 1/(1 + exp(-intercept)). In the documentation example, mu_hat is equal to 0.5831. Likewise, in Table 43.11, they show how to compute estimates for the mixing parameter ('pi_hat') based on the coefficients from the model specified in the PROBMODEL statement.
I believe that the PRED(COMPONENTS) option of the OUTPUT statement generates the success probabilities for each of the two components of the mixture, i.e., PRED_1 is mu_star_hat + mu_hat, and PRED_2 is mu_star_hat. So, for example, for PHT = 0 and TCPO = 0, pi_hat = 0.6546, therefore mu_star_hat is (1 - 0.5831)*0.6546 = 0.273 (rounded), and mu_star_hat + mu_hat = 0.5831 + 0.273 = 0.856. Those are the values of PRED_2 and PRED_1 that I get, respectively, when PHT = 0 and TCPO = 0.
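If it helps, the hand calculation can be reproduced in a short DATA step and compared against the row of chk with PHT = 0 and TCPO = 0. This is just a sketch: the two input values are the estimates quoted from the documentation example, and the data set and variable names (hand_check, pred_1_hand, pred_2_hand) are ones I made up for illustration.

```sas
/* Sketch: recompute the component success probabilities by hand.      */
/* mu_hat and pi_hat are typed in from the documentation example;      */
/* hand_check, pred_1_hand and pred_2_hand are hypothetical names.     */
data hand_check;
   mu_hat = 0.5831;                       /* inverse link of the MODEL intercept */
   pi_hat = 0.6546;                       /* mixing probability, PHT=0 TCPO=0    */
   mu_star_hat = (1 - mu_hat) * pi_hat;   /* mu_star = (1 - mu)*pi               */
   pred_1_hand = mu_star_hat + mu_hat;    /* success probability, component 1    */
   pred_2_hand = mu_star_hat;             /* success probability, component 2    */
   put mu_star_hat= pred_1_hand= pred_2_hand=;
run;
```

The log should show roughly 0.273 for pred_2_hand and 0.856 for pred_1_hand. To check another treatment group, substitute the pi_hat for that combination of PHT and TCPO from the mixing probabilities table.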