I have saw many questions similar but after read those posts I still need to ask.
My question:
there is a variable has around 100 values, from 8, 9, 111, 23, 456,457, ab1, ac2, abc9, d388, q34, s789, etc
I want to sort it to a way that I feel comfortable with, that is, it should sfirst start with, 8, then 9, then 23... then ab1, ac2, then d388...
the rule would be:
if all are numbers, the order should be from small to big, that is, 8 should be listed before 111;
those with letters will be after the numbers, and in the order with alphabit order, that is, after these numbers, ab1 should be first,
but even I sort it, it still lists 111 before 8. I then realize it might be because sas considers 1 is small then 8.
I thought I could add leading zero, I do not know how to do this becasue it is a mixture of numeric and alpha.
Thank you in advance for your advice and time!
Proc Sort has built-in features just what you want:
data have;
infile cards dlm=',';
input var$ @@;
cards;
8, 9, 111, 23, 456,457, ab1, ac2, abc9, d388, q34, s789,etc
;
proc sort data=have out=want sortseq=linguistic(NUMERIC_COLLATION=on) ;
by var;
run;
You could use the ANYALPHA function to determine if the text has alphabetic characters or not. If there are no alphabetic characters, then you can concatenate zeros in front of the numeric.
Proc Sort has built-in features just what you want:
data have;
infile cards dlm=',';
input var$ @@;
cards;
8, 9, 111, 23, 456,457, ab1, ac2, abc9, d388, q34, s789,etc
;
proc sort data=have out=want sortseq=linguistic(NUMERIC_COLLATION=on) ;
by var;
run;
If it's acceptable to add a variable just for sorting purposes, this might be easier than adding leading zeros:
if input(varname, ??8.) > . then sorting_var = right(varname);
else sorting_var = varname;
Then sort by SORTING_VAR instead of by the original variable name.
Sorry the formatting was so bad.
data have;
input value $char8.;
datalines;
ab1
8
111
d388
9
456
23
abc9
457
ac2
;
data wrk; set have;
length c $8 n $8;
c = prxchange('s/(\D*)(\d*)/$1/',-1,trim(value));
n = input(prxchange('s/(\D*)(\d*)/$2/',-1,trim(value)),8.);
run;
proc sort data=wrk out=want(drop=c n); by c n;
run;
data _null_; set want; put value;
run;
Result:
8
9
23
111
456
457
ab1
abc9
ac2
d388
Look up the PROC Sort options SORTSEQ with LINGUISTIC options. Learning that proc sort has these options may save you many problems in the future so I'm not providing an example.
Note that AFAIK only Proc Sort will create the order properly (though if you have mixed case letters there are still issues). Other procedures that produce output may or maynot maintain your desired order.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.