I am creating indicator variables from character variables in a dataset imported from Excel. The dataset contains multiple misspellings, extra spaces, etc., so I copied and pasted each value of the variable from a PROC FREQ to get something like the below:

IF oldvar='oldvaluex' OR oldvar='olddvaluex' OR oldvar='oldvaluex ' ...etc. THEN newvar=0;
ELSE IF oldvar='oldvaluey' OR oldvar='old valuey' OR oldvar=' oldvaluey '... THEN newvar=1;
ELSE newvar=.;

When comparing oldvar with newvar using PROC FREQ, I noticed that some variations of oldvar were being coded as missing. Searching through these forums, I found that it was likely a spacing problem, and after applying oldvar=compress(oldvar,,'C') before creating my indicator variable, those values were coded correctly in newvar. However, I am now working on another variable that still produces missing values after using oldvar2=compress(oldvar2,,'C') and also after trying oldvar2=compress(oldvar2,'0D0A'x). Any ideas on what else I could try?
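
One check that might help narrow this down is to print the raw bytes of the values that still come out missing, since compress(oldvar2,,'C') only strips control characters and other invisible bytes could survive it. The following is a minimal sketch, not a confirmed fix: the dataset names have and want, the variables hexview, clean, and newvar2, and the literal values are all placeholders, and whether the 'kw' modifiers drop a particular byte may depend on the session encoding.

```sas
/* Sketch only: have, want, hexview, clean, newvar2 and the literals are placeholders. */
data want;
    set have;
    length hexview $40;

    /* Write the first 20 characters as hex pairs so hidden bytes become visible. */
    hexview = put(oldvar2, $hex40.);

    /* 'k' keeps instead of removes, 'w' selects printable characters;
       compbl collapses repeated blanks, strip trims both ends,
       lowcase makes the comparison case-insensitive. */
    clean = lowcase(strip(compbl(compress(oldvar2, , 'kw'))));

    if clean in ('oldvaluex', 'olddvaluex') then newvar2 = 0;
    else if clean in ('oldvaluey', 'old valuey') then newvar2 = 1;
    else newvar2 = .;
run;

/* List the hex view against the indicator, including missing values. */
proc freq data=want;
    tables hexview*newvar2 / list missing;
run;
```

Any rows where newvar2 is still missing can then be read off by their hex values, which should show exactly which byte differs from the expected spelling.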