BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
archibald
Obsidian | Level 7

Hi all, 

I ran a proc freq on a character variable  "country_name" and I got  frequencies  in alphabatical order. 

I would have been satisfy with this result, but I need to have "other" as the last category in my output.

Thank you, 

 

Here is an abbreviated  list
Afghanistan
Albania
Bahrain
Barbados
Belarus

Cambodia

Denmark
Dominican Republic
Ecuador
Egypt

Fiji
Finland

Oman

Other 

.....

Zambia

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

If you want to stick to your existing character variable, you could apply the ORDER= option of the PROC FREQ statement:

 

  1. One option is to use ORDER=DATA in conjunction with a sorted input dataset where observations with COUNTRY_NAME='Other' are sorted last. Example:
    proc sql;
    create view temp as
    select * from have
    order by country_name='Other', country_name;
    quit;
    
    proc freq data=temp order=data;
    tables country_name;
    run;
  2. Another option is to define a format for COUNTRY_NAME which sorts "Other" last and then use ORDER=FORMATTED. Unfortunately, PROC FREQ does not support the PRELOADFMT option that is available in PROC MEANS and PROC TABULATE. (With this you can specify the order of format categories in the PROC FORMAT statement.) So, to apply ORDER=FORMATTED, "Other" must be made the alphabetically last formatted value. Most easily, you could use the lower-case label "other". Otherwise, you could achieve the desired sort order by indenting all country names by a blank and "Other" by 'A0'x (non-breakable space). To avoid the indentation, you could try to prefix "Other" with an invisible character (with an ASCII code >90 [dec]), but this is platform and device dependent. On my Windows workstation the below format works with listing output (not HTML, ...):
    proc format;
    value $country
    'Other'='81'x 'Other' '01'x
    other=[$40.];
    run;
    
    proc freq data=have order=formatted;
    format country_name $country.;
    tables country_name;
    run;
    (The '01'x compensates the "outdentation" caused by the invisible '81'x character.) I would not recommend this for production purposes.

View solution in original post

10 REPLIES 10
Reeza
Super User

I would capture the output to a table, sort as desired and then print the results. I feel like there should be a shorter way, but can't think of it at the moment. 

 

proc freq data=have;
table countries/out=country_freq;
run;

data country_freq;
set country_freq;
if country='Other' then order=9999999;
else order=_n_;
run;

proc sort data=country_freq;
by order;
run;

proc print data=country_freq;
run;
archibald
Obsidian | Level 7

Thank you for your quick reply  Reeza.  I will give it a try.

 

archibald
Obsidian | Level 7

@Reeza thank you!

FreelanceReinh
Jade | Level 19

Hi @archibald,

 

Wouldn't it be acceptable to write "other" in lower case? This should move it to the end of the alphabetical list (provided that all country names start with capital letters).

archibald
Obsidian | Level 7

As a matter fact, this is pretty straigthfoward and interesting way of solving the issue, thanks! but the only problem is to format  "other' afterward so that it is in capital letter. 

archibald
Obsidian | Level 7

@FreelanceReinh, As a matter fact, this is pretty straigthfoward and interesting way of solving the issue, thanks! but the only problem is to format  "other' afterward so that it is in capital letter. 

FreelanceReinh
Jade | Level 19

@archibald wrote:

... but the only problem is to format  "other' afterward so that it is in capital letter. 


But this is an easy "problem":

proc format;
value $country (default=40)
'other'='Other';
run;

proc freq data=have;
format country_name $country.;
tables country_name;
run;

(The option "default=40" specifies the length of the format. You could adapt the value 40 to the length of the longest country name.)

 

 

FreelanceReinh
Jade | Level 19

I think, the cleanest way to control the order is to define a variable, say, COUNTRY which does not contain the country name, but a numeric or character code. If this variable replaced COUNTRY_NAME, you would even save disk space, because the code could be very short (possibly only 1 byte, definitely not more than 3!). Make sure that the code for "Other" is sorted last.

 

Then you would define a format assigning country names (and "Other") to the codes and assign it to variable COUNTRY. By default, the internal values are used to sort PROC FREQ output. So, with tables country; you can get an output showing the country names, sorted by the code you defined, hence with "Other" sorted last.

FreelanceReinh
Jade | Level 19

If you want to stick to your existing character variable, you could apply the ORDER= option of the PROC FREQ statement:

 

  1. One option is to use ORDER=DATA in conjunction with a sorted input dataset where observations with COUNTRY_NAME='Other' are sorted last. Example:
    proc sql;
    create view temp as
    select * from have
    order by country_name='Other', country_name;
    quit;
    
    proc freq data=temp order=data;
    tables country_name;
    run;
  2. Another option is to define a format for COUNTRY_NAME which sorts "Other" last and then use ORDER=FORMATTED. Unfortunately, PROC FREQ does not support the PRELOADFMT option that is available in PROC MEANS and PROC TABULATE. (With this you can specify the order of format categories in the PROC FORMAT statement.) So, to apply ORDER=FORMATTED, "Other" must be made the alphabetically last formatted value. Most easily, you could use the lower-case label "other". Otherwise, you could achieve the desired sort order by indenting all country names by a blank and "Other" by 'A0'x (non-breakable space). To avoid the indentation, you could try to prefix "Other" with an invisible character (with an ASCII code >90 [dec]), but this is platform and device dependent. On my Windows workstation the below format works with listing output (not HTML, ...):
    proc format;
    value $country
    'Other'='81'x 'Other' '01'x
    other=[$40.];
    run;
    
    proc freq data=have order=formatted;
    format country_name $country.;
    tables country_name;
    run;
    (The '01'x compensates the "outdentation" caused by the invisible '81'x character.) I would not recommend this for production purposes.
archibald
Obsidian | Level 7

@FreelanceReinh the first option is perfect! thanks so much!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1476 views
  • 0 likes
  • 3 in conversation