DATA Step, Macro, Functions and more

How to rename a cell if it has a low frequency

Accepted Solution Solved
Reply
Contributor
Posts: 35
Accepted Solution

How to rename a cell if it has a low frequency

In a specific column I have many repeated informations like:

 

red

blue

yellow

white

red

blue

yellow

red

blue

yellow

 

Any thought how could proceed to rename a cell when it has a low frequency? 

In the sample above I would like to rename  white as other since it has frequency 1.

 

Thanks,

 


Accepted Solutions
Solution
‎11-27-2017 09:10 AM
Super User
Posts: 24,010

Re: How to rename a cell if it has a low frequency

It helps if you post directly into the forum, not everyone can download files. You'll get answers faster. 

 

*Sample data;
data Colours;
input colour$;
cards;
Blue
Yellow
Red
Yellow
Orange
Orange
Yellow
Orange
Blue
Gray
Orange
Yellow
White
Orange
Blue
White
White
Blue
Orange
White
Orange
Yellow
White
Blue
Orange
Yellow
Yellow
Orange
Orange
Yellow
;;;;
run;

*create table to create format;
proc freq data=Colours order=freq noprint;
    table colour / out=format_base;
run;

*create rules for format;
data colour_fmt;
    set format_base;

    length start label $20.;

    fmtname = 'colours_fmt';
    type='C';
    start=colour;

    if count <5 then label='Other';
        else label=colour;
run;

*create format;
proc format cntlin=colour_fmt;
run;

*Apply format;
proc freq data=colours order=freq;
table colour;
format colour $colours_fmt.;
run;

View solution in original post


All Replies
Super User
Posts: 24,010

Re: How to rename a cell if it has a low frequency

 

Try a FORMAT with Other category. 

 

Anything not in the main list will fall into the OTHER category, similar to a default ELSE. 

 

 

Contributor
Posts: 35

Re: How to rename a cell if it has a low frequency

The spreadsheet has more then 100.000 lines and over 200 different variables. Any thought how could build the format code?

Super User
Posts: 24,010

Re: How to rename a cell if it has a low frequency

Your question is vague, so you’re going to get vague responses. If you have a fairly specific question with sample input/output data it’s easier to help you out. 

 

You don’t have to type out a format manually, it can be data driven and you can use a CNTLIN dataset to specify the formats. 

 

 

Contributor
Posts: 35

Re: How to rename a cell if it has a low frequency

I put a sample in the attached .sas file.

I think is more clear. If not please let me know

 

Attachment
Solution
‎11-27-2017 09:10 AM
Super User
Posts: 24,010

Re: How to rename a cell if it has a low frequency

It helps if you post directly into the forum, not everyone can download files. You'll get answers faster. 

 

*Sample data;
data Colours;
input colour$;
cards;
Blue
Yellow
Red
Yellow
Orange
Orange
Yellow
Orange
Blue
Gray
Orange
Yellow
White
Orange
Blue
White
White
Blue
Orange
White
Orange
Yellow
White
Blue
Orange
Yellow
Yellow
Orange
Orange
Yellow
;;;;
run;

*create table to create format;
proc freq data=Colours order=freq noprint;
    table colour / out=format_base;
run;

*create rules for format;
data colour_fmt;
    set format_base;

    length start label $20.;

    fmtname = 'colours_fmt';
    type='C';
    start=colour;

    if count <5 then label='Other';
        else label=colour;
run;

*create format;
proc format cntlin=colour_fmt;
run;

*Apply format;
proc freq data=colours order=freq;
table colour;
format colour $colours_fmt.;
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 129 views
  • 0 likes
  • 2 in conversation