I have a large dataset with about 10,000 clusters (each with about 5-10 data points) and about 30 variables. I need to aggregate by cluster, and the variables aggregate differently (mostly count or mean). I do not want to retain any duplicate, non-aggregated data, just one data point for each cluster. What is the simplest way to do this? I know I could do this with proc means and create all new variables, but I thought there might be a better way.