BookmarkSubscribeRSS Feed
Ronein
Meteorite | Level 14

Hello,

I would like to do the following:

Create a frequency report that count number of customers in each group.

I would like to add the following condtion :

1-I want that the most frequent group will be displayed first and so on (Desending order).

2-I want to display only 6 groups (First 5 groups and the 6th group will get the total count for all groups that are ranked above 5th group)

For example:

The frequency for each group is:

c 10
b 8
a 6
d 6
e 5

f 3 

g 2

h 1

i 1

j 1

k 1

l 1 

m 2

n 1

o 2

So wanted data set should be :

c 10
b 8
a 6
d 6
e 5

Other 15

 

What is the way to do it please? (Get 6 groups)?


Data have;
Input ID  group $;
cards;
1 c
2 c
3 c
4 c
5 c
6 c
7 c
8 c
9 c
10 c
11 b
12 b
13 b
14 b
15 b
16 b
17 b
18 b
19 a
20 a
21 a
22 a
23 a
24 a
25 d
26 d
27 d
28 d
29 d
30 d
31 e
32 e
33 e
34 e
35 e
36 f
37 f
38 f
39 g
40 g
41 h
42  i
43 j
44 k
45 l
46 m
47 m
48 n
49 o
50 o
;
Run;

Data wanted;
input group $ count;
cards;
c 10
b 8
a 6
d 6
e 5
Other 15
;
Run;

 

7 REPLIES 7
Ronein
Meteorite | Level 14

I think that this solution can work well but maybe there is a better way (shorter?)

proc sql;
create table t1 AS
select group,
       count(*) as nr
from have
group by  group
order by nr desc
;
quit;

data t2;
Serial + 1 ;
set t1;
run ;

proc sql;
create table t3 AS
select 'Other' as group,
       count(*) as nr
from t2
where Serial >5
;
quit;

Data wanted;
set t2(where=(Serial <=5)  t3 ;
Run;

Reeza
Super User
  1. Do a proc freq and get summaries into a data set
  2. Sort by frequency to get top 6 versus others
  3. Create a custom format that groups the top N into individual categories and everything else goes to other. Use PROC FORMAT + CNTLIN to do this step
  4. Re-run Step1 applying custom format from #3

@Ronein wrote:

Hello,

I would like to do the following:

Create a frequency report that count number of customers in each group.

I would like to add the following condtion :

1-I want that the most frequent group will be displayed first and so on (Desending order).

2-I want to display only 6 groups (First 5 groups and the 6th group will get the total count for all groups that are ranked above 5th group)

For example:

The frequency for each group is:

c 10
b 8
a 6
d 6
e 5

f 3 

g 2

h 1

i 1

j 1

k 1

l 1 

m 2

n 1

o 2

So wanted data set should be :

c 10
b 8
a 6
d 6
e 5

Other 15

 

What is the way to do it please? (Get 6 groups)?


Data have;
Input ID  group $;
cards;
1 c
2 c
3 c
4 c
5 c
6 c
7 c
8 c
9 c
10 c
11 b
12 b
13 b
14 b
15 b
16 b
17 b
18 b
19 a
20 a
21 a
22 a
23 a
24 a
25 d
26 d
27 d
28 d
29 d
30 d
31 e
32 e
33 e
34 e
35 e
36 f
37 f
38 f
39 g
40 g
41 h
42  i
43 j
44 k
45 l
46 m
47 m
48 n
49 o
50 o
;
Run;

Data wanted;
input group $ count;
cards;
c 10
b 8
a 6
d 6
e 5
Other 15
;
Run;

 


 

Ronein
Meteorite | Level 14
Thanks, may you please show code for your solution?
Reeza
Super User
I'm fairly certain you can figure out the code 🙂
If you need help post your code and issues.

Not sure it's any more efficient than your current solution though.
Ronein
Meteorite | Level 14
May you please show step 3?
Reeza
Super User
data counts_fmt;
set counts (OBS=3) end=eof;
length label $10.;

fmtname = 'Age_fmt';
start = age;
label = put(age, 8. -l);

output;
if eof then do;
hlo = 'O';
label = 'Other';
output;
end;

run;

proc format cntlin=counts_fmt;
run;

Assuming I was doing this for an age variable and taking the top 3 instead of 6.

PaigeMiller
Diamond | Level 26

PROC PARETO does this for you, if you have SAS/QC licensed. Here is an example: https://documentation.sas.com/doc/en/pgmmvacdc/9.4/qcug/qcug_pareto_examples01.htm

 

You would want to use the MAXNCAT=5 option to get counts for the highest 5 categories, and then the remaining categories would be set to whatever name is specified by the OTHER= option.

--
Paige Miller

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 458 views
  • 3 likes
  • 3 in conversation