11-21-2013 11:25 AM
I have the following problem:
I fitted a model using proc genmod and stored it in an item store using the STORE statement. I now intend to use this model to score a different dataset using proc plm. However, I would like to modify the stored model before using it for scoring. Is it possible to do this? My modification is something like (a bit more complicated, actually) adding 1 to each parameter. Naively, I tried to access the stored model in a data step, which doesn't seem to be possible.
Many thanks for your help
11-22-2013 10:54 AM
Add 1 to each parameter? That seems, well, unusual. I could understand adding 1 to each independent value (say, if you had a log link), but adding a constant to the estimates doesn't seem like a very good idea, let alone doing something more complex. The values would no longer be the MLEs, and hence scoring would be biased. Could you give the motivating reasons for the process? And maybe say what the modification you want to apply will actually be?
11-23-2013 03:31 AM
Well, what I intend to do is actually more like adding 1 to each variable. However, I thought it would be neat to do this via the stored model, and this got me generally interested in whether I can modify these item store objects.
I essentially want to do what Joyce et al. did in this paper: http://www.ncbi.nlm.nih.gov/pubmed/12407470
They calculate adjusted mortality rates. Their correction is done by subtracting the observed value of a confounder from the mean value of this confounder in the dataset, multiplying this difference by the confounder's regression coefficient b, and adding the resulting term to the observed outcome (mortality rate). So for one independent variable the adjusted rate is given by
y_adj = y_obs + b (x_mean - x_obs)
11-25-2013 11:24 AM
I think you will have to use a data step solution. The real key here is what you use for x_mean: the mean of the x values from the original dataset (where the coefficients were derived), the mean of the x values from the new data you want to score, or the mean of the x values for the combined datasets. If we believe that the coefficients are unbiased estimators of the population parameters, then we probably want to use the mean of the x values from the new data. It becomes a matter of outputting the coefficients to one dataset, calculating x_mean (new) and saving it to a dataset, merging the mean back against the data being scored, calculating the difference, multiplying by the coefficient, and adding the result to the new y value. Scale this up across all of the x's.
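To make the arithmetic of those steps concrete, here is a minimal sketch in Python rather than SAS. It is only an illustration: the coefficient values, the variable names (y, x1, x2) and the three data rows are all made up, and it assumes the correction is applied directly on the scale of the outcome, as in the one-variable formula y_adj = y_obs + b (x_mean - x_obs).

```python
# Illustrative sketch only: every name and number below is hypothetical.

def column_means(rows, covariates):
    """Mean of each covariate over the rows being scored (x_mean from the new data)."""
    n = len(rows)
    return {x: sum(r[x] for r in rows) / n for x in covariates}

def adjust(rows, coefs, outcome="y"):
    """Return y_adj = y_obs + sum over k of b_k * (x_mean_k - x_obs_k) for each row."""
    means = column_means(rows, coefs)
    adjusted = []
    for r in rows:
        # One correction term per covariate, summed ("scale this up across all x's").
        correction = sum(b * (means[x] - r[x]) for x, b in coefs.items())
        adjusted.append(r[outcome] + correction)
    return adjusted

# Hypothetical coefficients, standing in for the estimates output by the fitted model.
coefs = {"x1": 0.5, "x2": -2.0}

# Hypothetical new data to be scored.
new_data = [
    {"y": 10.0, "x1": 2.0, "x2": 1.0},
    {"y": 12.0, "x1": 4.0, "x2": 3.0},
    {"y": 14.0, "x1": 6.0, "x2": 2.0},
]

y_adj = adjust(new_data, coefs)
print(y_adj)  # [9.0, 14.0, 13.0]
```

Because x_mean is taken from the same rows that are being adjusted, the correction terms sum to zero over the dataset, so the mean of y_adj equals the mean of the observed y. That would not hold if x_mean came from the original or the combined data.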
11-25-2013 07:55 PM
Which values to use for the calculations of x_mean was also giving me a bit of a headache. Thanks for your thoughts, very useful.
Thanks also for the data step solution to my problem. It's not that easy, I think, which is why I wondered whether I could fiddle with a model in the item store. Anyway, it looks like nobody has an answer to my question (or the experts are too busy), so I'll have a try at your data step solution.