Hello,
Please I need help with SAS code for sorting a character variable in alphabetical order from A-Z.
PROC SORT will do this
I'm guessing that there is more to your question, as I am sure you knew this already. So please, instead of saying as little as possible on this subject, give a full complete explanation, with emphasis on FULL and emphasis on COMPLETE.
I have a character variable with drug names. However, there are levels with non-drug names eg w10001. I used the proc sort statement but it did not work. The data is not sorted in alphabetical order.
proc sort data=long_finder;
by DRUGID ;run;
Look at the output from this and see if the values are "sorted";
Proc print data=long_finder; var drugid; format drugid; run;
This will print the values without any custom format, which I am making a guess from some of your other posts that the Drugid may have a custom format applied.
The SORT order would not use the formatted value of the variable, only the actual value.
You do not say why you need the order. It might be that for display purposes you could use Proc Report and Define the order for the variable as Formatted using that variable as a Group or Order variable.
PROC SORT will sort by the ASCII codes of the strings. So all lowercase letters come after all uppercase letters.
So a value like 'w10001' will come before 'z' but it will be after 'Z'.
If you want to sort without regard to case then make a new variable using UPCASE() function and sort on that variable instead.
@UcheOkoro wrote:
I have a character variable with drug names. However, there are levels with non-drug names eg w10001. I used the proc sort statement but it did not work. The data is not sorted in alphabetical order.
proc sort data=long_finder; by DRUGID ;run;
Saying it did not work ... is not a FULL COMPLETE explanation, as we don't have a clue what didn't work. Saying you have non-drug names like w10001 is not a FULL COMPLETE explanation because I don't see how that is even relevant to sorting. Please, show us the data (or a portion of it) and then explain (FULL COMPLETE) what results you would like to see from this sort.
Yes it did, but it's likely not the order you want.
Show what you want versus what proc sort did output.
@UcheOkoro wrote:
I have a character variable with drug names. However, there are levels with non-drug names eg w10001. I used the proc sort statement but it did not work. The data is not sorted in alphabetical order.
proc sort data=long_finder; by DRUGID ;run;
That is default sort order in proc sort on a character variable.
proc sort data=sashelp.class out=class_sorted_by_name;
by name;
run;
@UcheOkoro wrote:
Hello,
Please I help with SAS code for sorting a character variable in alphabetical order from A-Z.
Thank you for you response . I tried the proc sort statement but it did not work.
@UcheOkoro wrote:
Thank you for you response . I tried the proc sort statement but it did not work.
Did not work is awful vague.
Are there errors in the log?: Post the code and log in a code box opened with the "</>" to maintain formatting of error messages.
No output? Post any log in a code box.
Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.
Thank you for your response. The following are my code, the log output and portions of the data.
proc sort data=long_finder out=long_finder2;
by drugid;
run;
341215 proc sort data=long_finder out=long_finder2;
341216 by drugid;
341217 run;
NOTE: There were 6185010 observations read from the data set WORK.LONG_FINDER.
NOTE: The data set WORK.LONG_FINDER2 has 6185010 observations and 3 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 2.49 seconds
cpu time 3.25 seconds
Can you show those same screenshots but formatting the variable as $HEX32 format?
Do they match the hex codes below for the text?
Obs drugID drug 1 414D4C4F444950494E45202020202020 AMLODIPINE 2 4359434C4F42454E5A415052494E4520 CYCLOBENZAPRINE 3 49425550524F46454E20202020202020 IBUPROFEN 4 4E41202D204E6F2064727567206D656E NA - No drug mentioned
Did you try the Proc Print I suggested that suppresses the FORMAT for the variable?
If that output is too much than try:
Proc freq data=long_finder; tables drugid; format drugid; run;
And show the output from that.
What exactly is the FORMAT assigned to the variable Drugid?
If it is not something like $40. , i.e. any name other than $ followed by a number, then there is a CUSTOM format assigned. Proc Sort will not use the formatted value to sort the data.
There are some options related to national language sort preferences and how values incorporating digits may be sorted for Proc Sort but they all work on the actual value of the variable.
Hello @UcheOkoro,
I think the output strongly suggests that ballardw nailed it: You are looking at formatted values which are sorted by the underlying internal DRUGID values (perhaps codes from a drug dictionary).
Then you may want to execute these two steps to obtain the alphabetical sort order by formatted values:
proc contents data=long_finder(keep=drugid);
run;
data _tmp / view=_tmp; set long_finder; _seqno=_n_; run; proc sql; create table long_finder2(drop=_seqno) as select * from _tmp order by put(drugid, $drugfmt.), _seqno; drop view _tmp; quit;
Edit: To overcome the uppercase/lowercase problem mentioned by Tom you can add the UPCASE or UPPER function to the ORDER BY clause:
order by upper(put(drugid, $drugfmt.)), _seqno;
Remember this from your question about "Coverting character variables from wide to long"? If this LONG_FINDER data set is the result of that process it is almost certain that the values are still the codes that use this $MEDCODF. format and the underlying values are the same here:
With -9 being the sorted value that means "NA - No drug mentioned".
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.