Hello,
I would like to do the following:
Create a frequency report that count number of customers in each group.
I would like to add the following condtion :
1-I want that the most frequent group will be displayed first and so on (Desending order).
2-I want to display only 6 groups (First 5 groups and the 6th group will get the total count for all groups that are ranked above 5th group)
For example:
The frequency for each group is:
c 10
b 8
a 6
d 6
e 5
f 3
g 2
h 1
i 1
j 1
k 1
l 1
m 2
n 1
o 2
So wanted data set should be :
c 10
b 8
a 6
d 6
e 5
Other 15
What is the way to do it please? (Get 6 groups)?
Data have;
Input ID group $;
cards;
1 c
2 c
3 c
4 c
5 c
6 c
7 c
8 c
9 c
10 c
11 b
12 b
13 b
14 b
15 b
16 b
17 b
18 b
19 a
20 a
21 a
22 a
23 a
24 a
25 d
26 d
27 d
28 d
29 d
30 d
31 e
32 e
33 e
34 e
35 e
36 f
37 f
38 f
39 g
40 g
41 h
42 i
43 j
44 k
45 l
46 m
47 m
48 n
49 o
50 o
;
Run;
Data wanted;
input group $ count;
cards;
c 10
b 8
a 6
d 6
e 5
Other 15
;
Run;
I think that this solution can work well but maybe there is a better way (shorter?)
proc sql;
create table t1 AS
select group,
count(*) as nr
from have
group by group
order by nr desc
;
quit;
data t2;
Serial + 1 ;
set t1;
run ;
proc sql;
create table t3 AS
select 'Other' as group,
count(*) as nr
from t2
where Serial >5
;
quit;
Data wanted;
set t2(where=(Serial <=5) t3 ;
Run;
@Ronein wrote:
Hello,
I would like to do the following:
Create a frequency report that count number of customers in each group.
I would like to add the following condtion :
1-I want that the most frequent group will be displayed first and so on (Desending order).
2-I want to display only 6 groups (First 5 groups and the 6th group will get the total count for all groups that are ranked above 5th group)
For example:
The frequency for each group is:
c 10
b 8
a 6
d 6
e 5f 3
g 2
h 1
i 1
j 1
k 1
l 1
m 2
n 1
o 2
So wanted data set should be :
c 10
b 8
a 6
d 6
e 5Other 15
What is the way to do it please? (Get 6 groups)?
Data have; Input ID group $; cards; 1 c 2 c 3 c 4 c 5 c 6 c 7 c 8 c 9 c 10 c 11 b 12 b 13 b 14 b 15 b 16 b 17 b 18 b 19 a 20 a 21 a 22 a 23 a 24 a 25 d 26 d 27 d 28 d 29 d 30 d 31 e 32 e 33 e 34 e 35 e 36 f 37 f 38 f 39 g 40 g 41 h 42 i 43 j 44 k 45 l 46 m 47 m 48 n 49 o 50 o ; Run; Data wanted; input group $ count; cards; c 10 b 8 a 6 d 6 e 5 Other 15 ; Run;
data counts_fmt;
set counts (OBS=3) end=eof;
length label $10.;
fmtname = 'Age_fmt';
start = age;
label = put(age, 8. -l);
output;
if eof then do;
hlo = 'O';
label = 'Other';
output;
end;
run;
proc format cntlin=counts_fmt;
run;
Assuming I was doing this for an age variable and taking the top 3 instead of 6.
PROC PARETO does this for you, if you have SAS/QC licensed. Here is an example: https://documentation.sas.com/doc/en/pgmmvacdc/9.4/qcug/qcug_pareto_examples01.htm
You would want to use the MAXNCAT=5 option to get counts for the highest 5 categories, and then the remaining categories would be set to whatever name is specified by the OTHER= option.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: