☑ This topic is solved.
texasmfp
Lapis Lazuli | Level 10

I work with a government agency's SAS program that ranks a series of choices. Everything works fine when the choices differ. However, when two choices have identical values for every variable in the sort/rank BY statement, SAS nevertheless breaks the tie using some unknown mechanism that appears to go outside the BY variables. The mechanism seems to look at a variable defined as the absolute value of a difference, and to prefer the observation whose difference was positive before the sign was removed over the one whose difference was negative. For example, take a pair of choices where the only difference involves LOTDIFF, a variable defined as the absolute value of the difference between two values. The absolute values for the two choices are equal (in my case, both are 1), but if you remove the ABS from the calculation, the raw values would be +1 and -1. SAS consistently sorts/ranks the +1 observation ahead of the -1 observation. Does this ring a bell, and is this mystery sorting mechanism documented anywhere?

 

/* Assignment taken from an earlier DATA step */
LOTDIFF = ABS(USLOT - HMLOT);

/* Rank the candidate matches within each &USCONNUM group */
PROC SORT DATA = P2PMODELS OUT = P2PMODELS;
    BY &USMANF &USPRIM USLOT &US_TIME_PERIOD &YEARMONTHU &USMON
       &USCONNUM NVMATCH &DIFCHAR LOTDIFF WNDORDER COSTDIFF;
RUN;

DATA P2PMODS P2PTOP5;
    SET P2PMODELS;
    BY &USMANF &USPRIM USLOT &US_TIME_PERIOD &YEARMONTHU &USMON
       &USCONNUM NVMATCH &DIFCHAR LOTDIFF WNDORDER COSTDIFF;

    /* Restart the counter at the start of each &USCONNUM group */
    IF FIRST.&USCONNUM THEN
        CHOICE = 0;

    CHOICE + 1;

    /* Keep the top choice and the top five choices */
    IF CHOICE = 1 THEN
        OUTPUT P2PMODS;
    IF CHOICE LE 5 THEN
        OUTPUT P2PTOP5;
RUN;

 

texasmfp
Lapis Lazuli | Level 10

I suppose another way to view this hidden mechanism is to look at the value of the one term that is changing in the calculated difference.

 

For example, with USLOT=2 for both observations, HMLOT=1 for one and HMLOT=3 for the other, SAS ranks the HMLOT=1 observation ahead of the HMLOT=3 observation (without the ABS, the differences would be +1 and -1, respectively).

 

And yes, I have manually swapped the HMLOT values on the observation pairs, and it always picks the observation with +1 over the one with -1 (or, equivalently, the observation with HMLOT=1 over the one with HMLOT=3).
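
One way to isolate this kind of behavior is a two-row test case. The sketch below is only an illustration: the dataset names TIES and TIES_SORTED and the ID values are hypothetical, and only USLOT, HMLOT and LOTDIFF come from the program above. Both rows tie on LOTDIFF, so whatever order PROC SORT returns them in cannot come from the BY variable itself.

```sas
/* Hypothetical test data: two candidates that tie on LOTDIFF */
DATA TIES;
    INPUT ID $ USLOT HMLOT;
    LOTDIFF = ABS(USLOT - HMLOT);   /* 1 for both rows */
    DATALINES;
A 2 1
B 2 3
;
RUN;

/* Sort on LOTDIFF only, so the two rows have identical BY values */
PROC SORT DATA = TIES OUT = TIES_SORTED;
    BY LOTDIFF;
RUN;

/* Inspect which row comes first */
PROC PRINT DATA = TIES_SORTED;
RUN;
```

Swapping the two data lines and rerunning shows whether the output order tracks the input order. By default PROC SORT uses the EQUALS option, which keeps observations with identical BY values in their incoming relative order.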

texasmfp
Lapis Lazuli | Level 10

Thanks Kurt, but in my case USLOT is the same for every observation.  So it is a non-factor in terms of sorting.

texasmfp
Lapis Lazuli | Level 10

Kurt: thanks. That is what happened. About five DATA steps back, the dataset was sorted by HMLOT. Since no later sort or merge involved HMLOT, the tied observations retained that relative order in subsequent steps. I tested this by placing DESCENDING in front of HMLOT in that initial sort, and the ranking flipped. Thanks.
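
Given that diagnosis, one way to avoid depending on an earlier sort is to name the tie-breaker explicitly in the BY statement. This is a minimal sketch, assuming a simplified stand-in for P2PMODELS: the dataset names CANDIDATES, RANKED and TOP1 and the sample values are hypothetical, and the macro-variable BY list is reduced to USCONNUM for brevity.

```sas
/* Hypothetical candidates: same USCONNUM, tied on LOTDIFF */
DATA CANDIDATES;
    INPUT USCONNUM $ USLOT HMLOT;
    LOTDIFF = ABS(USLOT - HMLOT);
    DATALINES;
C1 2 3
C1 2 1
;
RUN;

/* HMLOT is appended as the last BY variable, so ties on LOTDIFF
   are broken by a stated rule instead of the incoming row order */
PROC SORT DATA = CANDIDATES OUT = RANKED;
    BY USCONNUM LOTDIFF HMLOT;
RUN;

/* Same counter logic as the original step: keep the top choice */
DATA TOP1;
    SET RANKED;
    BY USCONNUM LOTDIFF HMLOT;

    IF FIRST.USCONNUM THEN
        CHOICE = 0;

    CHOICE + 1;

    IF CHOICE = 1 THEN
        OUTPUT;
RUN;
```

With this BY list the HMLOT=1 row ranks first regardless of how the input was ordered; writing DESCENDING HMLOT instead would prefer the HMLOT=3 row. Which direction is appropriate depends on the ranking rule the program is meant to implement.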
