alexgouv
Obsidian | Level 7

Hi,

I am trying to build a logistic regression model using statistics from the past 3 years in a college baseball conference. I am running into issues because many of the players have no data for key variables in my model. Is there a way to have SAS ignore a missing value for an observation without dropping that player entirely?

For example, I have one column per year, freshman through senior, for each statistic for each player. However, some of these players missed a year due to injury, have not yet reached their senior year, etc. Right now SAS throws out the player's entire observation, including all of the other data that player has, but I want SAS to use whatever data the player has and only eliminate the player if all columns are blank.

I am really looking for a way to do this without imputation. If there is a way to do this in R that would work too.

3 REPLIES
PaigeMiller
Diamond | Level 26

You can tell PROC LOGISTIC to consider missing values in a CLASS variable to be legitimate values, but this doesn't work for continuous variables.
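
As a minimal sketch of that option: MISSING goes on the CLASS statement. The data set work.players and the variables drafted, position, and batting_avg are hypothetical names used only for illustration, not taken from the original post.

```sas
/* Hypothetical data set and variable names, for illustration only */
proc logistic data=work.players;
   /* MISSING keeps a missing POSITION as its own level instead of dropping the row */
   class position / missing param=ref;
   /* A row with a missing BATTING_AVG (continuous) is still excluded from the fit */
   model drafted(event='1') = position batting_avg;
run;
```

The "Number of Observations Used" line in the output should show how many players were still dropped because of missing continuous values.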

https://documentation.sas.com/?cdcId=pgmmvacdc&cdcVersion=9.4&docsetId=statug&docsetTarget=statug_lo...

I think you'd have to impute the values, or come up with some other scheme to handle this type of data.
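
If imputation turns out to be acceptable, one common pattern is multiple imputation with PROC MI, then PROC LOGISTIC on each imputed data set, then PROC MIANALYZE to combine the results. This is a rough sketch only. The data set work.players, the outcome drafted, and the yearly columns fr_avg through sr_avg are all assumed names, and the default PROC MI settings may not suit this data.

```sas
/* Hypothetical names throughout; a sketch, not a tuned analysis */

/* Step 1: create 5 imputed copies of the data (default imputation method) */
proc mi data=work.players nimpute=5 seed=12345 out=work.players_mi;
   var drafted fr_avg so_avg jr_avg sr_avg;
run;

/* Step 2: fit the same logistic model within each imputed copy */
proc logistic data=work.players_mi;
   by _Imputation_;
   model drafted(event='1') = fr_avg so_avg jr_avg sr_avg;
   ods output ParameterEstimates=work.lgs_parms;
run;

/* Step 3: pool the per-imputation estimates into one set of results */
proc mianalyze parms=work.lgs_parms;
   modeleffects Intercept fr_avg so_avg jr_avg sr_avg;
run;
```

Whether filling in a season a player has not yet played makes sense is a modeling question that this sketch does not answer.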

--
Paige Miller
alexgouv
Obsidian | Level 7
Thanks, but I was looking for a way to handle continuous variables.
Ksharp
Super User

Yeah. I also think you should impute the missing values. One way is to use PROC PLS with the MISSING=EM option:

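/* Assumes a WORK.CLASS data set, e.g. a copy of SASHELP.CLASS with some values set to missing */
/* MISSING=EM imputes missing values with the EM algorithm before the model is fit */
proc pls data=class missing=em nfac=4 plots=(ParmProfiles VIP) details;
   /* optional cross validation: add cv=split cvtest(seed=12345) to the PROC PLS statement */
   class sex;
   model age = weight height sex;
   /* optional: output out=x predicted=p; */
run;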