I really like Howles' solution, it's elegant and simple, but in talking with a coworker he prefers not to use proc sql in his code. Ksharp's solution is good for him but was problematic as the sorting direction needed to change placing the nulls first. He had to change the temporary value in his code to correct this. The example was a dataset containing multiple sub-actions to a case_id. He was asked which sub-action is the most impacting for each case_id. To accomplish, he sorted the data on multiple columns with case_id as the first criteria. Then he sorted the data again with proc sort nodupkey by case_id to return the top record for each case_id. If his original sorting criteria is correct, he will return the most impacting sub-action for each case_id. However, when he needed to update the sorting direction of a column it placed weight on the nulls which caused the wrong case_id's to be returned. Our solution was to expand on Ksharp's idea by adding columns using the missing() function. Then we could still use proc sort and not worry if he had to change the sort direction. data test; ismissingcol1 = missing(col1); ismissingcol2 = missing(col2); run; proc sort data = test; by case_id ismissingcol1 col1 ismissingcol2 descending col2; run; proc sort nodupkey data = test out = test_deduped (drop=ismissingcol1 ismissingcol2); by case_id; run;
... View more