BookmarkSubscribeRSS Feed
tomcmacdonald
Quartz | Level 8

I would like to sort a dataset using PROC SORT based on a character column.  Something equivalent to:

 

PROC SQL;
  SELECT *
  FROM foo
  ORDER BY
    CASE some_text_column
    WHEN 'category a' THEN 1
    WHEN 'category b' THEN 2
    WHEN 'category c' THEN 3
  END;
QUIT;
13 REPLIES 13
PaigeMiller
Diamond | Level 26

Since you mention PROC SORT, have you tried it? It should get the job done very easily. Does it work? Why or why not?

 

Is the real question not about sorting but assigning values 1 2 3??

--
Paige Miller
tomcmacdonald
Quartz | Level 8

I'm just using dummy values.  The values won't be alphabetical.  I have tried it and it sorts alphabetically which is incorrect.  I'm guessing I'm going to have make a temporary column in a previous DATA STEP, sort it, then remove the column in another DATA STEP.

PaigeMiller
Diamond | Level 26

@tomcmacdonald wrote:

I'm just using dummy values.  The values won't be alphabetical.  I have tried it and it sorts alphabetically which is incorrect. 


There is not enough information here for me to understand what is not correct, and what result you want.

--
Paige Miller
tomcmacdonald
Quartz | Level 8

Just seeing if there's a less wordy way of doing the equivalent PROC SQL operation.  I think I have a solution.  This is the test dataset:

 

 

data foo;
	input a b $;
	datalines;
1 apple
2 banana
3 pear
4 peach
5 orange
;
run;

 

This is the PROC SQL sort solution:

 

 

proc sql;
	select *
	from foo 
	order by
		case b
		when 'peach' then 1
		when 'orange' then 2
		when 'apple' then 3
		when 'pear' then 4
		when 'banana' then 5;
quit;

 

This is the wordy DATA STEP and PROC SORT solution:

 

 

data _1;
	set foo;
	if b = 'peach' then dummy = 1;
	if b = 'orange' then dummy = 2;
	if b = 'apple' then dummy = 3;
	if b = 'pear' then dummy = 4;
	if b = 'banana' then dummy = 5;
run;

proc sort data=_1; by dummy; run;

data foo(keep=a b);
	set _1;
run;

proc delete data=_1; run;
PaigeMiller
Diamond | Level 26
proc sort data=_1 out=foo(drop=dummy); by dummy; run;
--
Paige Miller
ballardw
Super User

@tomcmacdonald wrote:

I'm just using dummy values.  The values won't be alphabetical.  I have tried it and it sorts alphabetically which is incorrect.  I'm guessing I'm going to have make a temporary column in a previous DATA STEP, sort it, then remove the column in another DATA STEP.


Show the code you used that resulted in an incorrect sort order for numeric variables.

 

Also indicate what format the numeric variable(s) that sort incorrectly may have. Note that if you have a format permanently applied to the variables the displayed value may not match the underlying numeric value.

Astounding
PROC Star

This will take at least two steps (just can't be done in one step).  To get a grasp on the problem ...

 

Do you know what 1, 2, and 3 are, or does the program have to figure out positionally what the names of the fields are?

 

Is FOO the name of the data set that contains both the data to be sorted as well as SOME_TEXT_COLUMN?

 

Wherever it comes from, is SOME_TEXT_COLUMN constant across records (or is there only one record to begin with)?

 

 

*********** EDITED:

 

Your example makes things much clearer.  You could combine a couple of steps:

 

proc sort data=_1 out=foo (keep=a b);

by dummy;

run;

 

But the DATA step approach will likely take longer to run than the original.

 

 

novinosrin
Tourmaline | Level 20

may be a hash data step equivalent?

 

data foo;
	input a b $;
	datalines;
1 apple
2 banana
3 pear
4 peach
5 orange
;
run;

data _null_;
if _n_=1 then do;
  dcl hash H (ordered: "A") ;
   h.definekey  ("dummy") ;
   h.definedata ("a", "b") ;
   h.definedone () ;
end;
set foo end=last;
if b = 'peach' then dummy = 1;
else if b = 'orange' then dummy = 2;
else if b = 'apple' then dummy = 3;
else if b = 'pear' then dummy = 4;
else if b = 'banana' then dummy = 5;
h.replace();
if last then h.output(dataset:'want');
run;
Reeza
Super User

Can you not do this in the first place?

Replace the categorical variables with a number that would sort accordingly and have a format that shows the names instead. 

 

I'm not aware of another method to do this in any language tbh. 

tomcmacdonald
Quartz | Level 8
Wouldn't that be more complicated because now you need to go into PROC FORMAT and define a format that will only be used once?
Reeza
Super User

It speaks to the structure of your data in the first place. 

 

Doing it from the start means your data is smaller and easier to manage.

It means when I'm coding, I'm typing less characters. 

PROC FORMAT can be driven from a data set so the code is centralized. 

 

In many ways, it's just better programming approach overall. 

 

PROC FORMAT is also very fast for lookups and I have more control over my reports. 

 

Obviously my opinion, but I work with SAS/R/Python and this is one of the best approaches in SAS, IMO

ballardw
Super User

In addition to @Reeza's format comments it may be one of the easier ways to reorder data depending on what you are actually using the data for. Note that reporting procedures Report and Tabulate explicitly have order options for variables and include Formatted as one of the options.

Ksharp
Super User

Since ORDER BY is executed at last. 

Maybe you should try .

 

PROC SQL;
  SELECT *,
CASE some_text_column WHEN 'category a' THEN 1 WHEN 'category b' THEN 2 WHEN 'category c' THEN 3 END AS XXXX
FROM foo ORDER BY XXXX ; QUIT;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 2893 views
  • 0 likes
  • 7 in conversation