I work with a government agency's SAS program that ranks a series of choices. Everything works as expected when the choices differ. However, when two choices have identical values for every variable in the BY statement, the sort/rank still breaks the tie consistently, using some mechanism that appears to go beyond the variables listed in the BY statement.

The mechanism seems to involve a variable defined as the absolute value of a difference: SAS appears to favor the observation whose underlying difference was positive over the one whose underlying difference was negative (i.e., the value before ABS wipes away the sign). For example, take a pair of choices whose only difference lies in the signed value behind LOTDIFF, a variable defined as the absolute value of the difference between two values. The LOTDIFF values of the two choices are equal (in my case, both are 1), but if you remove the ABS from the calculation, the signed values would be +1 and -1. SAS consistently sorts/ranks the +1 observation ahead of the -1 observation.

Does this ring a bell, and is this tie-breaking mechanism documented anywhere?
/* How LOTDIFF is defined: */
LOTDIFF = ABS(USLOT - HMLOT);

PROC SORT DATA = P2PMODELS OUT = P2PMODELS;
    BY &USMANF &USPRIM USLOT &US_TIME_PERIOD &YEARMONTHU &USMON
       &USCONNUM NVMATCH &DIFCHAR LOTDIFF WNDORDER COSTDIFF;
RUN;

DATA P2PMODS P2PTOP5;
    SET P2PMODELS;
    BY &USMANF &USPRIM USLOT &US_TIME_PERIOD &YEARMONTHU &USMON
       &USCONNUM NVMATCH &DIFCHAR LOTDIFF WNDORDER COSTDIFF;
    /* Restart the rank counter at the start of each &USCONNUM group */
    IF FIRST.&USCONNUM THEN
        CHOICE = 0;
    CHOICE + 1;
    /* Top-ranked choice goes to P2PMODS; the top five go to P2PTOP5 */
    IF CHOICE = 1 THEN
        OUTPUT P2PMODS;
    IF CHOICE LE 5 THEN
        OUTPUT P2PTOP5;
RUN;
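
One thing that may be worth checking is whether the tie-break is simply input order. PROC SORT's EQUALS option, which is the default, keeps observations with identical BY values in the same relative order they had in the input data set, so the +1 rows may just happen to be created before the -1 rows upstream. That is only a possible explanation; nothing in the code above confirms it. Below is a minimal sketch on toy data. The data set names TIES, SORTED, and SORTED_EXPLICIT and the variables ID and RAWDIFF are hypothetical, made up for illustration.

```sas
/* Hypothetical toy data: two candidates tie on LOTDIFF after ABS() */
data ties;
    input id $ uslot hmlot;
    lotdiff = abs(uslot - hmlot);   /* sign is discarded here            */
    rawdiff = uslot - hmlot;        /* signed value, kept for inspection */
    datalines;
A 5 4
B 5 6
;
run;

/* EQUALS (the default) leaves tied observations in their input order. */
/* Swap the two data lines above and rerun to see whether the order    */
/* of the sorted output follows the input order.                       */
proc sort data=ties out=sorted equals;
    by lotdiff;
run;

/* To make the tie-break explicit instead of order-dependent,          */
/* add the signed difference as a final BY variable.                   */
proc sort data=ties out=sorted_explicit;
    by lotdiff descending rawdiff;
run;
```

If swapping the input rows also swaps the sorted output, the tie-break is coming from input order rather than from anything to do with the sign. In that case, adding an explicit final BY variable (as in the last step) would make the ranking reproducible regardless of how the input happens to be ordered.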