Hi, I have a dataset as below. There are different number of responses nested within each cluster, each response came from a subject with same or different job levels, I would like to aggregate the information of V1 and V2 to the cluster level using the following criteria: 1. Use the values of V1 and V2 from the subject with the highest job level (1>2>3). 2. If there are more than one responses from the highest job level, the cluster level variable will be 1 if any response from the highest job level is 1. 3. If there are more than one responses from the highest job level and some of them are missing, the cluster level variable will use the response from the nonmissing values if it is 0. 4. If V1 or V2 from the highest job level are all missing, then move to the next job level. data a; input clusterid joblevel v1 v2; datalines; 1 1 . 1 1 1 . 1 1 3 0 0 1 3 0 . 1 3 1 0 2 1 1 1 2 2 0 . 2 3 . 1 2 3 0 0 2 3 1 . 3 1 . 1 3 1 0 0 3 2 1 1 3 2 1 0 3 3 0 0 3 3 . . 3 3 0 1 ; run; I am hoping to get a data like this: clusterid V1 V2 1 1 1 2 1 1 3 0 1 Thanks A lot~
... View more