BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Giovani
Obsidian | Level 7

In a specific column I have many repeated informations like:

 

red

blue

yellow

white

red

blue

yellow

red

blue

yellow

 

Any thought how could proceed to rename a cell when it has a low frequency? 

In the sample above I would like to rename  white as other since it has frequency 1.

 

Thanks,

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

It helps if you post directly into the forum, not everyone can download files. You'll get answers faster. 

 

*Sample data;
data Colours;
input colour$;
cards;
Blue
Yellow
Red
Yellow
Orange
Orange
Yellow
Orange
Blue
Gray
Orange
Yellow
White
Orange
Blue
White
White
Blue
Orange
White
Orange
Yellow
White
Blue
Orange
Yellow
Yellow
Orange
Orange
Yellow
;;;;
run;

*create table to create format;
proc freq data=Colours order=freq noprint;
    table colour / out=format_base;
run;

*create rules for format;
data colour_fmt;
    set format_base;

    length start label $20.;

    fmtname = 'colours_fmt';
    type='C';
    start=colour;

    if count <5 then label='Other';
        else label=colour;
run;

*create format;
proc format cntlin=colour_fmt;
run;

*Apply format;
proc freq data=colours order=freq;
table colour;
format colour $colours_fmt.;
run;

View solution in original post

5 REPLIES 5
Reeza
Super User

 

Try a FORMAT with Other category. 

 

Anything not in the main list will fall into the OTHER category, similar to a default ELSE. 

 

 

Giovani
Obsidian | Level 7

The spreadsheet has more then 100.000 lines and over 200 different variables. Any thought how could build the format code?

Reeza
Super User

Your question is vague, so you’re going to get vague responses. If you have a fairly specific question with sample input/output data it’s easier to help you out. 

 

You don’t have to type out a format manually, it can be data driven and you can use a CNTLIN dataset to specify the formats. 

 

 

Giovani
Obsidian | Level 7

I put a sample in the attached .sas file.

I think is more clear. If not please let me know

 

Reeza
Super User

It helps if you post directly into the forum, not everyone can download files. You'll get answers faster. 

 

*Sample data;
data Colours;
input colour$;
cards;
Blue
Yellow
Red
Yellow
Orange
Orange
Yellow
Orange
Blue
Gray
Orange
Yellow
White
Orange
Blue
White
White
Blue
Orange
White
Orange
Yellow
White
Blue
Orange
Yellow
Yellow
Orange
Orange
Yellow
;;;;
run;

*create table to create format;
proc freq data=Colours order=freq noprint;
    table colour / out=format_base;
run;

*create rules for format;
data colour_fmt;
    set format_base;

    length start label $20.;

    fmtname = 'colours_fmt';
    type='C';
    start=colour;

    if count <5 then label='Other';
        else label=colour;
run;

*create format;
proc format cntlin=colour_fmt;
run;

*Apply format;
proc freq data=colours order=freq;
table colour;
format colour $colours_fmt.;
run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 731 views
  • 0 likes
  • 2 in conversation