Is there a way to score data based on a store from a previously executed model but exclude some parameters? For example, let's say my model is
model y = x1 x2 x3 x4;
When I store the parameters and score a new dataset based on this store, Y would be calculated as the sum of an intercept and X1 through X4. I only want to score using intercept and X1 through X3--I want to skip X4.
Thanks,
Haris
Can you use a value of zero for x4?
Yes, I could do that but I don't want to lose the information in X4. Renaming X4 to X4_1 and setting X4 to zero would do exactly what I need. I am looking for an option to drop X4 from scoring without manipulating the original data.
Well, then, mark me down as confused.
I don't understand this restriction:
Renaming X4 to X4_1 and setting X4 to zero would do exactly what I need. I am looking for an option to drop X4 from scoring without manipulating the original data.
You don't have to destroy the original data set. You can copy it and rename X4 and the original data set remains unchanged.
Yes, you are correct: I can manipulate the dataset to achieve what I need. I am lazy! I am hoping there is an option that goes along with the PROC PLM SCORE statement, something like 'SKIP X4' or 'SET X4=0', that scores without using X4. Do you think that's too much to ask?
Alternatively, is there a way to edit the STORE and change the parameter for X4 to zero? Store files are small and I would have no problem creating copies of the STORE. My datasets are massive in size.
You don't have to actually create a new dataset. You can define a view that will filter your data only when it is accessed by PLM.
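A minimal sketch of the view approach, in case it helps. The names work.new (the large scoring data set), work.mystore (the item store saved earlier with a STORE statement), x4_orig, and pred_nox4 are all hypothetical, and the sketch assumes X4 is a continuous regressor. If X4 were a CLASS variable, a value of 0 would probably not match any level in the store, so this would not carry over as written.

```sas
/* Hypothetical names: work.new, work.mystore, x4_orig, pred_nox4.   */
/* The view stores no data; it is evaluated each time it is read.    */
data work.new_nox4 / view=work.new_nox4;
   set work.new(rename=(x4=x4_orig));  /* keep the original X4 values */
   x4 = 0;                             /* X4 contributes 0 to X*beta  */
run;

/* Score through the view; the original data set is never modified.  */
proc plm restore=work.mystore;
   score data=work.new_nox4 out=work.scored predicted=pred_nox4;
run;
```

Because the original values are carried along as x4_orig, the scored output should still contain the X4 information, and only the small view definition is stored rather than a copy of the data.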
Another good alternative, PGStats, but it is still focused on manipulating the original data--non-destructively, in your case.
But, is it the case that there is no option in PROC PLM to skip or change parameters in the STORE when scoring?
If you want the model Y = x1 + x2 + x3, you need to go back to the data and fit the new model. Setting x4=0 in the scoring data is a valid operation, but "dropping x4" from the fitted model does not make statistical sense because the other parameter estimates will change when you drop x4 from the model. This is the reason that you can't edit the item store that is read by PROC PLM. It prevents people from getting wrong answers.
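For comparison, refitting the reduced model might look like the sketch below. PROC GLM and the names work.train, work.new, work.store_reduced, and pred_reduced are assumptions for illustration only; the thread does not say which procedure created the original store.

```sas
/* Hypothetical names throughout; PROC GLM is only an example of a   */
/* procedure that supports the STORE statement.                      */
proc glm data=work.train;
   model y = x1 x2 x3;           /* X4 omitted, so estimates are refit */
   store work.store_reduced;
run;
quit;

/* Score new data with the reduced model's own parameter estimates.  */
proc plm restore=work.store_reduced;
   score data=work.new out=work.scored_reduced predicted=pred_reduced;
run;
```

The estimates for x1 through x3 in this store will generally differ from those in the four-variable store, which is why the two kinds of scores are not interchangeable.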
Thanks, Rick. I am not exactly clear what you mean by:
Setting x4=0 in the scoring data is a valid operation
Can I do this when scoring or is this something I could do when creating the STORE?
With respect to 'does not make statistical sense', I am trying to do something similar to estimating BLUP vs NOBLUP predicted values, just with fixed rather than random effects. X4 is the effect of a geographical region. But the statistical validity of dropping fixed parameters during a scoring operation is a whole different topic. I am only interested in the capabilities of PROC PLM at this point.
Setting x4=0 was @PaigeMiller 's original suggestion. You do it by using a DATA step view, as @PGStats suggested.
Since you mentioned mixed models and EBLUPs, I will mention that the SCORE statement in PROC PLM produces the same scores as the OUTPM= data set in PROC MIXED. That is, the predicted values do not incorporate the EBLUP values. The predicted values are simply X*beta_hat. Thus if x4 is a random effect, I think the SCORE statement does what you want.
If x4 is a fixed effect in your model, I do not believe there is a way to force the SCORE statement to exclude the x4 effect.
Thanks @Rick_SAS.
My X4 is a FIXED effect so it seems like I will need to find a way to exclude it from the score outside of the native PROC PLM functionality.
And since you mentioned it, the ability to score RANDOM as well as FIXED effects in PROC PLM would be very helpful in my work. Right now the only way I know how to work out combinations of FIXED and RANDOM effects in mixed models where linear formulas for variance don't exist is the ESTIMATE statement in GLIMMIX. Given how long some of the models take to estimate, the lack of ability to calculate inferential tests for combinations of FIXED and RANDOM effects in PROC PLM is limiting.
I hope I am wrong and there is a way to do that that I am not aware of.