Hi,
I am confused about why I have to convert 3 of my 4 categorical variables with missing values, which were cast as $Char4., into numeric variables for the PROC MI statement with FCS to work. I thought the whole point of PROC MI > FCS > LOGISTIC was to impute categorical data. In fact, I've used it before to impute a "Job" field with missing values in an insurance dataset, so I know it works, though that field was cast as Text13. Does that matter?
I don't understand why PROC MI is failing for the Memory_Technology field, which is a $char18. categorical field.
Also, if I use the LOGISTIC model, which I understood to be for categorical variables, it returns values that are not valid categories (e.g., 1022.7814, when 1000 is a possible category, or the next nearest one, 1024).
Please see the attached data file and the code I am using.
I got the right results using this for 3 of the 4 fields (categories were appropriately generated but not for Memory_Technology):
PROC MI DATA=mydata1.comp2 SEED=123 NIMPUTE=15 OUT=impRSLTS;
    CLASS Memory_Technology Max_Horizontal_Resolution Installed_Memory
          Processor_Speed Processor Manufacturer Operating_System;
    FCS NBITER=5 DISCRIM(Memory_Technology Max_Horizontal_Resolution Installed_Memory Processor_Speed / DETAILS); * Use 5 burn-in iterations;
    VAR Memory_Technology Max_Horizontal_Resolution Installed_Memory Processor_Speed
        Processor Manufacturer Warranty_days
        dv_Infrared dv_Bluetooth dv_DockStnPrtRep_yy dv_DockStnPrtRep_yn
        dv_DockStnPrtRep_ny dv_DockStnPrtRep_nn dv_Fingerprint dv_Subwoofer
        SQRTprice dv_CDMA Operating_System dv_Ext_Battery;
RUN;
The missing data patterns table shows an 'X' for Memory_Technology in every pattern, meaning it is never missing, but it is actually missing in 63 cases.
*UGH!*
Hi,
The problem lies with the data. Run the following syntax:
proc freq data = laptops_dataset_raw;
tables Memory_Technology;
run;
It shows that MEMORY_TECHNOLOGY has no missing values, but it does have values that are coded as '?'. A question mark is not recognized as a missing value in SAS. Recode the question marks to SAS missing values. Note that this should also be done for other variables that use question marks (e.g., MAX_HORIZONTAL_RESOLUTION). After the data are cleaned up, try running PROC MI again.
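As a rough sketch of that recoding step, the DATA step below blanks out every character value that is exactly '?' and then re-runs the frequency check. It assumes the question marks are stored in character variables; the output name work.laptops_clean and the loop index i are made up for illustration, so adjust them to your own libraries and variables.

```sas
/* Sketch only: work.laptops_clean is a hypothetical output data set name. */
data work.laptops_clean;
    set laptops_dataset_raw;

    /* Loop over all character variables in the data set. */
    array chars {*} _character_;
    do i = 1 to dim(chars);
        /* A blank string is the SAS missing value for character variables. */
        if strip(chars{i}) = '?' then chars{i} = ' ';
    end;
    drop i;
run;

/* Confirm the recode: the MISSING option lists missing as its own level. */
proc freq data=work.laptops_clean;
    tables Memory_Technology / missing;
run;
```

If the frequency table now shows a missing level for Memory_Technology, PROC MI should treat those rows as missing as well.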
Best,
Steve
Yes...that was done prior to running the PROC. That's how I got the three
values generated that were numerical. It is the categorical variable that
is failing, even with a missing value replacing the question mark.