Hi everyone,

Everything below refers to SAS Enterprise Miner.

While building different models, I found that KNN is by far the most time-consuming method. For example, where the Neural Network takes a few minutes to complete, KNN can take several hours to finish on the same training and validation samples. I have decided not to include KNN as a workable solution for our project, but in order to at least "get some results", I intend to apply it to smaller subsamples of both the EXISTING training and validation datasets.

To be specific:

(1) I have already run Partition, Metadata (role assignment), Replacement, Impute, and PCA before KNN, and I want to use the results of PCA (i.e. the generated principal components) as the inputs to the KNN method.

(2) Between the PCA node and the KNN node, I would like to resample both the training and validation datasets at the same proportion, e.g. 10% of the exported PCA training dataset and 10% of the exported PCA validation dataset.

(3) The SAS Code node is not an option, since it is not easy for other users to work with.

This is a really difficult problem for me. If you have any ideas, please share them; I would really appreciate it.

Thanks very much!