Hi,
The point of imputing is mainly to keep the observations, in other words to preserve the model universe. The price is the amount of distortion you are willing to accept.
The least distortive approach is to find out the reason behind the missing values. It is very rare that a model universe has all its data sourced from just 1 or 2 tables. It is almost always assembled from various sources, easily in the range of >10 tables. One top source of missing values is what I sometimes call 'left join syndrome'. The left side table is your master table of 160,000 IDs, but the right hand side table may only have matches for 52% of those IDs, so you end up with ~48% missing on all the variables you append from the right hand table. Now the nature of the right hand table is key to your imputation. It is not really a technique question; it is business knowledge. The first question is whether a table with 48% missing is useful at all. If the answer is yes, then you can dive into the individual variables. It is good practice to keep a 'missing lineage' as you merge throughout the universe preparation process.
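Here is a minimal sketch of that 'missing lineage' idea in pandas (the table and column names are made up for illustration):

```python
import pandas as pd

# Hypothetical master table (left hand side) and one source table
# (right hand side) that only covers some of the IDs.
master = pd.DataFrame({"id": [1, 2, 3, 4]})
source = pd.DataFrame({"id": [1, 3], "balance": [100.0, 250.0]})

# indicator= records, per row, whether the match came from both
# tables or only the left one -- a simple per-merge missing lineage.
universe = master.merge(source, on="id", how="left", indicator="balance_src")

# Share of IDs with no match on the right hand side.
miss_rate = (universe["balance_src"] == "left_only").mean()
```

Keeping one such indicator column per merge lets you trace, variable by variable, which source table each block of missing values came from.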
After you go through the business background, here are some rules of thumb. For categorical/nominal variables with >50% missing, I would drop them, regardless of whether this is clustering or a supervised model. If you have many categories and you group the missing portion with one of them, you have no ground for promoting that non-missing group to dominate the variable. The artificial impact of this practice is more severe in clustering than in supervised models. In, say, decision tree modeling, carrying missing values as they are can add value to the model with little distortion; clustering has no such mechanism. If you assign a unique value to replace the missing portion, you create a dominating but artificial value. This is where it becomes tricky, depending on how your clustering solution parametrizes the categorical variable. If the categorical variable has >>50% non-missing, I am comfortable grouping the missing portion with one of the non-missing groups. In SAS clustering, there is a random option that allows you to impute according to the distribution of the non-missing values. This is actually available for both categorical and interval variables.
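Outside SAS, that random-imputation idea can be sketched in a few lines of pandas/NumPy (the variable and its values are made up): replacements are drawn from the observed values, so the imputed portion follows the non-missing category frequencies rather than inflating any single group.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical categorical variable with some missing values.
s = pd.Series(["A", "B", None, "A", None, "A", "B", None])

# Draw replacements from the non-missing values, so the imputed
# portion follows the observed category distribution.
observed = s.dropna().to_numpy()
imputed = s.copy()
imputed[imputed.isna()] = rng.choice(observed, size=s.isna().sum(), replace=True)
```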
As for interval variables, if you have many input variables to spend, you can afford to raise the non-missing requirement bar and drop more variables. If you don't have many variables, you may have to tolerate variables with many missing values. In playing with the requirement %, you need to closely consult the definition of each variable. Some 'important' variables with a large % of missing may have to stay. In other words, you need to balance. One recommendation is to try different imputation methods (mean, median, random) and assess their respective impact on your clustering solution. When there are many variables, you may consider variable clustering with different imputation methods and assess the impact accordingly. There is no hard-and-fast rule about which way is better. This is where packages like SAS EM provide a huge productivity edge, in that they document and compare more efficiently than hand-written code.
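A quick NumPy sketch of comparing imputation methods on one interval variable (the data are made up). It shows one concrete impact worth watching: mean/median imputation pulls the variable's spread in toward the center, which in turn compresses distances in that dimension of the clustering space, while random imputation from the observed values distorts the spread much less.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical interval variable with ~30% missing.
x = np.array([1.0, 2.0, np.nan, 4.0, np.nan, 6.0, 7.0, np.nan, 9.0, 10.0])
obs = x[~np.isnan(x)]

def impute(x, method):
    out = x.copy()
    mask = np.isnan(out)
    if method == "mean":
        out[mask] = obs.mean()
    elif method == "median":
        out[mask] = np.median(obs)
    else:  # "random": draw from the non-missing values
        out[mask] = rng.choice(obs, size=mask.sum(), replace=True)
    return out

# Standard deviation after each imputation method, versus the
# non-missing values alone.
spreads = {m: impute(x, m).std() for m in ("mean", "median", "random")}
```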
One thing special about clustering is scale. It is necessary to scale all the input variables together for clustering. Whether you should impute before or after re-scaling/standardization is another layer of complexity, and there are further aspects related to the distance measure you use in your clustering. I will leave those for another day.
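To make the ordering question concrete, here is a tiny NumPy sketch (data made up). Mean-imputing first and then standardizing yields a column with exactly unit spread; standardizing on the non-missing statistics and then filling the gaps with 0 (the mean on the standardized scale) leaves a column with a compressed spread, so the two orders weight that variable differently in the distance calculation.

```python
import numpy as np

x = np.array([1.0, 2.0, np.nan, 4.0, 10.0])
mask = np.isnan(x)
obs = x[~mask]

# Option A: mean-impute first, then standardize the full column.
a = x.copy()
a[mask] = obs.mean()
a = (a - a.mean()) / a.std()

# Option B: standardize using the non-missing statistics, then fill
# the missing slots with 0 (the mean on the standardized scale).
b = (x - obs.mean()) / obs.std()
b[mask] = 0.0
```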
Hope this helps. Thanks for using SAS.

Best Regards,
Jason Xin