CloudcroftS_NWU
Fluorite | Level 6

Hi,

I am confused about why I have to convert 3 of my 4 categorical variables with missing values, which were cast as $Char4., into numeric variables in order for PROC MI with the FCS statement to work. I thought the whole point of PROC MI > FCS > LOGISTIC was to impute categorical data. In fact, I've used it in the past to impute a "Job" field with missing values on an insurance dataset, so I know it works, though that field was cast as Text13. Does that matter?

 

I don't get why PROC MI is failing for the Memory_Technology field, which is a $char18. categorical field.

Also, if I use the LOGISTIC method, which I understood to be meant for categorical variables, it returns values that are not valid categories (e.g., 1022.7814, when 1000 is a possible value and the nearest category is 1024).

 

Please see the attached data file and the code I am using.

Using the code below, I got the right results for 3 of the 4 fields (categories were generated appropriately, but not for Memory_Technology):

 

PROC MI DATA=mydata1.comp2 SEED=123 NIMPUTE=15 OUT=impRSLTS;
    CLASS Memory_Technology Max_Horizontal_Resolution Installed_Memory
          Processor_Speed Processor Manufacturer Operating_System;
    /* Use 5 burn-in iterations */
    FCS NBITER=5 DISCRIM(Memory_Technology Max_Horizontal_Resolution
                         Installed_Memory Processor_Speed / DETAILS);
    VAR Memory_Technology Max_Horizontal_Resolution Installed_Memory Processor_Speed
        Processor Manufacturer Warranty_days
        dv_Infrared dv_Bluetooth dv_DockStnPrtRep_yy dv_DockStnPrtRep_yn
        dv_DockStnPrtRep_ny dv_DockStnPrtRep_nn dv_Fingerprint dv_Subwoofer
        SQRTprice dv_CDMA Operating_System dv_Ext_Battery;
RUN;

 

The Missing Data Patterns table shows an 'X' for Memory_Technology in every pattern, as if it were never missing, but it is missing in 63 cases.

 

*UGH!*

StatsGeek
SAS Employee

Hi,

The problem lies with the data. Run the following code:

 

proc freq data = laptops_dataset_raw;
   tables Memory_Technology;
run;

 

The output shows that MEMORY_TECHNOLOGY has no missing values, but it does have values coded as '?'. SAS does not recognize a question mark as a missing value. Recode the question marks to SAS missing values. Note that this should also be done for any other variables that use question marks (e.g., MAX_HORIZONTAL_RESOLUTION). After the data are cleaned up, try running PROC MI again.
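
As a rough sketch of that recode, assuming the affected columns were read in as character variables: the output data set name laptops_clean, the array name cvars, and the index _i are placeholders and do not come from the attached file.

```sas
/* Sketch only: laptops_clean, cvars and _i are hypothetical names. */
data laptops_clean;
   set laptops_dataset_raw;
   /* _character_ covers every character variable read by SET */
   array cvars {*} _character_;
   do _i = 1 to dim(cvars);
      /* strip() removes leading/trailing blanks before the comparison */
      if strip(cvars{_i}) = '?' then call missing(cvars{_i});
   end;
   drop _i;
run;

/* The MISSING option makes PROC FREQ list missing values as their own level */
proc freq data = laptops_clean;
   tables Memory_Technology / missing;
run;
```

If the recode worked, the PROC FREQ table should no longer show a '?' level for Memory_Technology, and the missing level should carry those observations instead.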

 

Best,

Steve

 

CloudcroftS_NWU
Fluorite | Level 6

Yes... that was done prior to running the PROC. That's how I got the three values generated that were numerical. It is the categorical variable that is failing, even with a missing value replacing the question mark.


peeeeekaaaaa
SAS Employee
Hi,
Well, although you tried to replace the question marks, you did not actually... SAS distinguishes between numeric and character missing values. For a character variable, '.' is an ordinary character value, not a missing value; a character missing value is a blank. You have several options, e.g., use IF(Memory_Technology='?') THEN Memory_Technology=''; or IF(Memory_Technology='?') THEN call missing(Memory_Technology); in the DATA step.
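
To make the difference concrete, here is a small self-contained check using the MISSING function. The variable names (cval, dot_is_missing, blank_is_missing) are made up for illustration and have nothing to do with the laptop data.

```sas
/* Sketch only: all variable names below are hypothetical. */
data _null_;
   length cval $18;
   cval = '.';                          /* a literal period stored as text */
   dot_is_missing = missing(cval);      /* 0: '.' is a regular character value */
   cval = ' ';                          /* blank is the character missing value */
   blank_is_missing = missing(cval);    /* 1: blank counts as missing */
   put dot_is_missing= blank_is_missing=;
run;
```

The log should show dot_is_missing=0 blank_is_missing=1, which is why a '.' written into Memory_Technology still looks like an observed value to PROC MI.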

Note also that comparisons like IF(Processor_Speed='?') do not make much sense for numeric variables since, well, they are numeric and '?' is a character...
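
If a column that should be numeric was instead read as character because of the '?' entries, one possible approach is to convert it with INPUT rather than compare it to '?'. This is only a sketch under that assumption; speed_txt and laptops_num are hypothetical names.

```sas
/* Sketch only: assumes Processor_Speed arrived as a character column.
   speed_txt and laptops_num are hypothetical names. */
data laptops_num;
   set laptops_dataset_raw (rename = (Processor_Speed = speed_txt));
   /* ?? suppresses invalid-data messages; text such as '?' that cannot
      be read as a number becomes numeric missing (.) */
   Processor_Speed = input(speed_txt, ?? best32.);
   drop speed_txt;
run;
```

With this, no explicit comparison against '?' is needed for the numeric columns, since anything that is not a number ends up as a numeric missing value.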

Please consider taking some of the free SAS tutorials at http://support.sas.com/training/tutorial/ (you should find the "Free SAS Programming 1 e-Course" particularly useful).

Best,
PK
