SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
UcheOkoro
Lapis Lazuli | Level 10

Hello,

Please I need help with SAS code for sorting a character variable in alphabetical order from A-Z.

14 REPLIES 14
PaigeMiller
Diamond | Level 26

PROC SORT will do this

 

I'm guessing that there is more to your question, as I am sure you knew this already. So please, instead of saying as little as possible on this subject, give a full complete explanation, with emphasis on FULL and emphasis on COMPLETE.

--
Paige Miller
UcheOkoro
Lapis Lazuli | Level 10

I have a character variable with drug names. However, there are levels with non-drug names eg w10001. I used the proc sort statement but it did not work. The data is not sorted in alphabetical order.

 

proc sort data=long_finder;
by DRUGID ;run;

 

ballardw
Super User

Look at the output from this and see if the values are "sorted";

Proc print data=long_finder;
   var drugid;
   format drugid;
run;

This will print the values without any custom format, which I am making a guess from some of your other posts that the Drugid may have a custom format applied.

 

The SORT order would not use the formatted value of the variable, only the actual value.

You do not say why you need the order. It might be that for display purposes you could use Proc Report and Define the order for the variable as Formatted using that variable as a Group or Order variable.

Tom
Super User Tom
Super User

PROC SORT will sort by the ASCII codes of the strings. So all lowercase letters come after all uppercase letters.

So a value like 'w10001' will come before 'z' but it will be after 'Z'.

 

If you want to sort without regard to case then make a new variable using UPCASE() function and sort on that variable instead.

PaigeMiller
Diamond | Level 26

@UcheOkoro wrote:

I have a character variable with drug names. However, there are levels with non-drug names eg w10001. I used the proc sort statement but it did not work. The data is not sorted in alphabetical order.

 

proc sort data=long_finder;
by DRUGID ;run;

 


Saying it did not work ... is not a FULL COMPLETE explanation, as we don't have a clue what didn't work. Saying you have non-drug names like w10001 is not a FULL COMPLETE explanation because I don't see how that is even relevant to sorting. Please, show us the data (or a portion of it) and then explain (FULL COMPLETE) what results you would like to see from this sort.

--
Paige Miller
Reeza
Super User

Yes it did, but it's likely not the order you want

 

Show what you want versus what proc sort did output. 

 


@UcheOkoro wrote:

I have a character variable with drug names. However, there are levels with non-drug names eg w10001. I used the proc sort statement but it did not work. The data is not sorted in alphabetical order.

 

proc sort data=long_finder;
by DRUGID ;run;

 


 

 

Reeza
Super User

That is default sort order in proc sort on a character variable.

 

proc sort data=sashelp.class out=class_sorted_by_name;
by name;
run;

@UcheOkoro wrote:

Hello,

Please I help with SAS code for sorting a character variable in alphabetical order from A-Z.


 

UcheOkoro
Lapis Lazuli | Level 10

Thank you for you response . I tried the proc sort statement but it did not work.

ballardw
Super User

@UcheOkoro wrote:

Thank you for you response . I tried the proc sort statement but it did not work.


Did not work is awful vague.

Are there errors in the log?: Post the code and log in a code box opened with the "</>" to maintain formatting of error messages.

No output? Post any log in a code box.

Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.

UcheOkoro
Lapis Lazuli | Level 10

Thank you for your response. The following are my code, the log output and portions of the data.

 

proc sort data=long_finder out=long_finder2;
by drugid;
run;

UcheOkoro_0-1655322502606.png

UcheOkoro_1-1655322551307.png

UcheOkoro_3-1655322664513.png

 

UcheOkoro_2-1655322576264.png

 


341215  proc sort data=long_finder out=long_finder2;
341216  by drugid;
341217  run;

NOTE: There were 6185010 observations read from the data set WORK.LONG_FINDER.
NOTE: The data set WORK.LONG_FINDER2 has 6185010 observations and 3 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           2.49 seconds
      cpu time            3.25 seconds

 

Reeza
Super User

Can you show those same screenshots but formatting the variable as $HEX32 format?

Do they match the hex codes below for the text?

 

Obs	drugID	                                drug
1	414D4C4F444950494E45202020202020	AMLODIPINE
2	4359434C4F42454E5A415052494E4520	CYCLOBENZAPRINE
3	49425550524F46454E20202020202020	IBUPROFEN
4	4E41202D204E6F2064727567206D656E	NA - No drug mentioned
ballardw
Super User

Did you try the Proc Print I suggested that suppresses the FORMAT for the variable?

If that output is too much than try:

 

Proc freq data=long_finder;
   tables drugid;
   format drugid;
run;

And show the output from that.

 

What exactly is the FORMAT assigned to the variable Drugid?

If it is not something like $40. , i.e. any name other than $ followed by a number, then there is a CUSTOM format assigned. Proc Sort will not use the formatted value to sort the data.

 

There are some options related to national language sort preferences and how values incorporating digits may be sorted for Proc Sort but they all work on the actual value of the variable.

FreelanceReinh
Jade | Level 19

Hello @UcheOkoro,

 

I think the output strongly suggests that ballardw nailed it: You are looking at formatted values which are sorted by the underlying internal DRUGID values (perhaps codes from a drug dictionary).

 

Then you may want to execute these two steps to obtain the alphabetical sort order by formatted values:

 

  1. Find out the relevant format name from PROC CONTENTS output.
    proc contents data=long_finder(keep=drugid);
    run;
  2. Replace "$drugfmt." in the code below by the real format name determined in step 1 (omit the leading dollar sign if DRUGID turns out to be a numeric variable) and run the code.
    data _tmp / view=_tmp;
    set long_finder;
    _seqno=_n_;
    run;
    
    proc sql;
    create table long_finder2(drop=_seqno) as
    select * from _tmp
    order by put(drugid, $drugfmt.), _seqno;
    drop view _tmp;
    quit;

 

Edit: To overcome the uppercase/lowercase problem mentioned by Tom you can add the UPCASE or UPPER function to the ORDER BY clause:

order by upper(put(drugid, $drugfmt.)), _seqno;

 

ballardw
Super User

ballardw_0-1655327114623.png

Remember this from your question about "Coverting character variables from wide to long"? If this LONG_FINDER data set is the result of that process it is almost certain that the values are still the codes that use this $MEDCODF. format and the underlying values are the same here:

ballardw_1-1655327284140.png

With -9 being the sorted value that means "NA - No drug mentioned".

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 8018 views
  • 4 likes
  • 6 in conversation