Could you elaborate on how the data were collected and on the random effects structure of your data? Two things I would be interested in are as follows:
1) Are there multiple levels of random effects, or can the random effects be modeled using a single subject specification? Answering this question well requires some detail about the experimental design.
2) Is the residual bimodality related to between-subject differences? If so, what characterizes subjects? If you can identify a subject-level variable which is related to the bimodality, then you can include this in your modeling efforts and should be able to eliminate the problem of bimodality. Of course, you could end up with a situation in which you determine that there are between-subject differences, but you cannot immediately determine any variable which is related to these differences.
If you cannot determine any reason for the bimodality and if random effects can be modeled through a single subject specification, then it may be possible to write code employing the NLMIXED procedure which accounts for the bimodality. There are two different types of model which might be constructed depending on whether the bimodality is attributable to between-subject differences or whether the bimodality is attributable to within-subject differences.
If the bimodality is attributable to between-subject differences, then we could employ a model of the form
P1*f(y,x,beta,b1) + (1-P1)*f(y,x,beta,b2)
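where P1 is the mixing proportion (the probability that a subject belongs to the first set), f is the conditional density of the response y given the predictors x, the fixed effects beta, and the random effect, and b1 and b2 are random effects with means mu1 and mu2, respectively. The fixed effects are assumed to be the same for the two different sets of subjects.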
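To make the between-subject form concrete, here is a rough sketch of one subject's contribution to the log likelihood. It is written in Python rather than NLMIXED, purely to show the arithmetic, and it rests on assumptions that are mine and not from your problem: a normal response, a single predictor with slope beta, and a random intercept whose mean is mu1 or mu2 (so the intercept is absorbed into mu1 and mu2). Under those assumptions the integral over the random effect has a closed form. The function and argument names are hypothetical.

```python
import math

def marginal_loglik_component(resid, mu, sigma2, tau2):
    """Log density of one subject's residuals (y - beta*x) after integrating
    out a normal random intercept with mean mu and variance tau2.
    The marginal is multivariate normal with covariance sigma2*I + tau2*J."""
    n = len(resid)
    d = [r - mu for r in resid]
    ss = sum(v * v for v in d)          # sum of squared deviations
    s = sum(d)                          # sum of deviations
    total = sigma2 + n * tau2
    # Quadratic form via the Sherman-Morrison inverse of sigma2*I + tau2*J
    quad = (ss - tau2 / total * s * s) / sigma2
    logdet = (n - 1) * math.log(sigma2) + math.log(total)
    return -0.5 * (n * math.log(2 * math.pi) + logdet + quad)

def between_subject_loglik(y, x, beta, p1, mu1, mu2, sigma2, tau2):
    """Log of P1*f(y,x,beta,b1) + (1-P1)*f(y,x,beta,b2) for one subject,
    with the random effect integrated out of each component.
    Assumes normal errors and a random intercept (illustrative only)."""
    resid = [yj - beta * xj for yj, xj in zip(y, x)]
    l1 = math.log(p1) + marginal_loglik_component(resid, mu1, sigma2, tau2)
    l2 = math.log(1 - p1) + marginal_loglik_component(resid, mu2, sigma2, tau2)
    m = max(l1, l2)                     # log-sum-exp for numerical stability
    return m + math.log(math.exp(l1 - m) + math.exp(l2 - m))
```

Summing this quantity over subjects gives the objective to maximize. Note that the whole subject is assigned to one component, which is what makes this a between-subject mixture.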
If the bimodality is attributable to within-subject differences, then we could employ a model of the form
P1*f(y,x,beta1,b) + (1-P1)*f(y,x,beta2,b)
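The assumption of this model is that each subject's observations fall into two latent sets, with P1 the probability that an observation belongs to the first set and with the random effect b shared by both sets. Typically, one might assume only intercept differences between the two within-subject sets, so that beta1 and beta2 differ only in the intercept. However, one could extend the differences to differential effects of predictor variables.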
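A comparable sketch for the within-subject form follows, again in Python rather than NLMIXED and again under assumptions of my own: a normal response, a single predictor with common slope beta, two intercepts a1 and a2 standing in for beta1 and beta2, and a mean-zero normal random intercept b. Here the mixture applies to each observation conditional on b, so the integral over b is approximated numerically (a simple trapezoid rule on a grid, chosen only for transparency). All names are hypothetical.

```python
import math

def normal_pdf(v, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (v - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def within_subject_loglik(y, x, beta, a1, a2, p1, sigma2, tau2, n_grid=2001):
    """Log marginal likelihood for one subject when each observation is drawn
    from P1*f(y,x,beta1,b) + (1-P1)*f(y,x,beta2,b), with b ~ N(0, tau2).
    Assumes the two components differ only in intercept (a1 versus a2)."""
    tau = math.sqrt(tau2)
    lo, hi = -8.0 * tau, 8.0 * tau      # grid covering +/- 8 SDs of b
    h = (hi - lo) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        b = lo + k * h
        cond = 1.0                      # conditional likelihood given b
        for yj, xj in zip(y, x):
            base = beta * xj + b
            cond *= (p1 * normal_pdf(yj, a1 + base, sigma2)
                     + (1 - p1) * normal_pdf(yj, a2 + base, sigma2))
        w = 0.5 if k in (0, n_grid - 1) else 1.0   # trapezoid end weights
        total += w * cond * normal_pdf(b, 0.0, tau2)
    return math.log(total * h)
```

With many observations per subject the product of densities can underflow, so a production version would work on the log scale; the sketch is only meant to show where the mixture sits relative to the random effect.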
Mixture distributions are really quite intriguing. They offer the opportunity to identify - or at least speculate on - some as yet unknown source of significant variation.
In order to provide more specific assistance, it would help to know more about the problem.