Hi folks! I am working on my mixed model (random coefficients model). The response variable is Views. I have a few factors that I consider to be both fixed effects and random effects for my model. They are Home_Based, Twitter, OLA, Hulu, Car, Central. My class/subject variable is top_ind, which has two levels, 1 and 2. I've written my code as below to start with only 1 factor, and it does not converge. I'm wondering if anyone can offer insights into why my model is not converging. I've attached the data as well. Thanks!
proc mixed data=MIXED_DATA ITDETAILS LOGNOTE;
class Top_IND;
model Views= ola
/ ddfm=kr2 solution;
random int ola / subject=Top_IND type=un g solution;
run;
I would suspect a major issue with the convergence is the large number of records with OLA=0.
When an independent variable jumps from 0 to relatively large values, as OLA does, while the dependent values stay similar (at least in order of magnitude), things often don't go well.
Are you sure that OLA should be 0 and not "missing"? I would be strongly tempted to create a second data set (or a second OLA variable in the same data set) with the values of 0 set to missing and see what the proc says when you use that variable instead of OLA.
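As a minimal sketch of that check, the DATA step below adds a second variable with the zeroes recoded to missing and refits the original model with it. The names MIXED_DATA2 and ola_m are hypothetical, chosen only for illustration.

```sas
/* Copy the data and add ola_m: same as ola, but 0 becomes missing. */
data MIXED_DATA2;              /* hypothetical output data set name */
    set MIXED_DATA;
    ola_m = ola;               /* hypothetical variable name */
    if ola = 0 then ola_m = .;
run;

/* Refit the original model with ola_m in place of ola. */
proc mixed data=MIXED_DATA2 ITDETAILS LOGNOTE;
    class Top_IND;
    model Views = ola_m / ddfm=kr2 solution;
    random int ola_m / subject=Top_IND type=un g solution;
run;
```

PROC MIXED excludes observations with a missing model variable, so this fit uses only the rows where OLA is nonzero.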
Thanks for your quick reply! Those cells are indeed 0's. Basically, I have time-series data in which views are supported by different channels, and OLA is one of them. It is 0 when the channel is off. I did try running the procedure only where OLA is not 0, but it also did not converge.
I think that the random slopes model you are trying to fit may not be appropriate. The iteration history indicates that the variance for the intercept has ballooned and the variance for slope of ola has gone to zero. This can happen when fitting a continuous variable as both a fixed and random effect. Instead, try moving the slope to an R-side model with a REPEATED statement. Perhaps something like:
proc mixed data=MIXED_DATA ITDETAILS LOGNOTE;
class Top_IND;
model Views= ola
/ ddfm=kr2 solution;
random int/ subject=Top_IND g solution;
repeated ola/subject=Top_IND r solution;
run;
The main problem I see is that the data are zero-inflated, with the zeroes all assigned when ola is zero. Additionally, I suspect that the dependent variable "views" is a count variable, so the assumption of normality for the residuals is likely to be violated. If that is the case, PROC MIXED may not be the best choice for your analysis.
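If Views really is a count, one alternative is a generalized linear mixed model. The following is only a sketch under that assumption, using PROC GLIMMIX with a negative binomial distribution; the choice of distribution, link, and estimation method here is illustrative and has not been checked against these data.

```sas
/* Sketch only: assumes Views is a nonnegative integer count. */
proc glimmix data=MIXED_DATA method=laplace;
    class Top_IND;
    /* Negative binomial response with a log link; */
    /* dist=poisson is the simpler alternative.    */
    model Views = ola / dist=negbin link=log solution;
    /* Random intercept for each level of Top_IND. */
    random intercept / subject=Top_IND;
run;
```

With a log link, a predictor on a very large scale can cause numerical trouble, so ola may need to be rescaled before a model like this is fit.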
One way of forcing the predicted response through zero would be to add the noint option to the MODEL statement. Since there are no other independent variables, this is equivalent to a hurdle model with probability = 1 of a zero when ola = 0, except that the zeroes are given a weight in the regression rather than being excluded. I know you ran the model with the zeroes excluded, but it did not converge (likely because of the misspecification mentioned above). Please try what is in the code box on a data set with the zeroes excluded, or modify what is in the box with the noint option for the full data set.
SteveDenham.
Thanks a lot for your reply. I did give your code a shot and it returned the error message "Only CLASS variables allowed in this effect." Please see the screenshot.
I also tried the noint option, and it didn't work (not sure if I have the right code). I'm also wondering what CovP1 CovP2 CovP3 CovP4 are in the Iteration History.
proc mixed data=MIXED_DATA ITDETAILS LOGNOTE;
class Top_IND;
model Views= ola
/ ddfm=kr2 noint solution;
random int ola/ subject=Top_IND type=un g solution;
run;
Views are the number of website visits, and OLA, Twitter, Home_based are all continuous variables contributing to views. I do think their respective coefficients will be different when top_ind=1 vs when top_ind=0. So if proc mixed is not the appropriate approach, what approach do you think would be more appropriate? Must I run two separate regression models?
Thanks again for your help!
Thanks for sharing that. So you have only 2 subjects for top_IND? I would consider the following model in that case:
proc mixed data=MIXED_DATA ITDETAILS LOGNOTE;
class Top_IND;
model Views= ola top_IND ola*top_IND
/ noint solution;
run;
SteveDenham
I would also try rescaling your data. Iterative methods can get messy when the response and continuous predictors are on a very large scale. Divide OLA and VIEWS by a large constant (like 1e7) and see if that helps stabilize convergence.
Thanks for your reply. I did try removing all rows where OLA=0 and rescaling Views and OLA. However, it still did not converge.
data MIXED_DATA;
set MIXED_DATA;
if OLA ne 0;
views10=views/10000000;
ola10=ola/1000000;
run;
proc mixed data=MIXED_DATA ITDETAILS LOGNOTE;
class Top_IND;
model Views10= ola10
/ ddfm=kr2 solution;
random int ola/ subject=Top_IND type=un g solution;
run;
You are in trouble with type=un. Note that the estimate is essentially a machine zero for UN(2,2) (covp3) and practically zero for UN(1,2) (covp2). Change to type=vc, or remove the type= option completely. Also note that your objective function has stabilized, so you could relax your convergence criterion as a stopgap. However, as long as you have those near zeroes in the covariance matrix, the final G matrix is not positive definite, which indicates that you have too many covariance parameters.
SteveDenham
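For reference, a sketch of both suggestions applied to the rescaled model. A RANDOM statement without TYPE= uses the default structure, TYPE=VC (variance components: a separate variance for each random effect and no covariance between them). The convergence criterion is set on the PROC MIXED statement; the CONVH= and MAXITER= values below are illustrative assumptions, not recommendations.

```sas
/* CONVH= loosens the relative Hessian criterion (default 1E-8).   */
/* MAXITER= raises the iteration limit. Both values are examples.  */
proc mixed data=MIXED_DATA ITDETAILS LOGNOTE convh=1e-6 maxiter=200;
    class Top_IND;
    model Views10 = ola10 / ddfm=kr2 solution;
    /* No TYPE= option, so the default TYPE=VC is used:            */
    /* one variance for the intercept, one for the ola10 slope.    */
    random int ola10 / subject=Top_IND g solution;
run;
```

Loosening the criterion only changes when the iterations stop; it does not repair a G matrix that is not positive definite.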
I tried type=vc, but it didn't work. Then I removed the type= option and it worked (along with rescaling both views and ola). Thanks a lot for your advice! What does it mean when I remove the type= option completely, and what does type=vc mean? What does it mean to change the convergence criterion as a stopgap? What is the syntax to do so, or to change the convergence criterion to a larger number?
Thanks again!
you will need to use ola10 on the random statement in place of ola.
Thanks for pointing it out! Rescaling alone does not fix the issue, but rescaling together with removing the type= option does the job. I'm not sure why it works that way, though.