Count most Repeated observation
N/A
Posts: 0

Hi, I have 5 variables, and for each observation I want the value that occurs most often, together with how many times it occurs.

a b c d e
1 1 1 2 3
5 5 6 5 2
9 9 2 0 8


output

a b c d e most_obs count
1 1 1 2 3 1 3
5 5 6 5 2 5 2
9 9 2 0 8 2 9

In the first observation the most repeated value is 1, so the variable most_obs is 1 and count is 3. I have 20,000 observations like this.
Super User
Posts: 5,434

Re: Count most Repeated observation

Posted in reply to deleted_user
Let me check that I understand you correctly: I think you want the value (not the observation) that occurs most often on each row/observation, and its number of occurrences.
If so, your sample output looks a bit incorrect. Shouldn't it rather be:

a b c d e most_obs count
1 1 1 2 3 1 3
5 5 6 5 2 5 3
9 9 2 0 8 9 2

Maybe you can solve this via some data step programming. But a more generic way is to transpose the data, then use SQL with COUNT and HAVING to find out your new columns, and then join back to the original table. This means you will need an id column. If you don't have one, you can easily create it using _n_ in a data step.
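
The transpose-then-SQL route described above could look roughly like the following. This is only a sketch: the data set names (have, tall, counts, top_val, want) and the column names id, val and freq are made up for illustration, and the sample rows are the ones from the question. Note that when two values tie for the maximum, the HAVING clause keeps both, so such an id would appear twice in the result.

```sas
/* Sample data from the question; id is created from _n_ */
data have;
    input a b c d e;
    id = _n_;
    datalines;
1 1 1 2 3
5 5 6 5 2
9 9 2 0 8
;
run;

/* One row per (id, value); PROC TRANSPOSE puts the values in COL1 */
proc transpose data=have out=tall(rename=(col1=val) drop=_name_);
    by id;
    var a b c d e;
run;

proc sql;
    /* How often each value occurs within each original row */
    create table counts as
    select id, val, count(*) as freq
    from tall
    group by id, val;

    /* Keep the value(s) whose frequency equals the maximum for that id.
       PROC SQL remerges the summary statistic to evaluate this. */
    create table top_val as
    select id, val as most_obs, freq
    from counts
    group by id
    having freq = max(freq);

    /* Join back to the original columns */
    create table want as
    select h.id, h.a, h.b, h.c, h.d, h.e,
           t.most_obs, t.freq as count
    from have as h
    inner join top_val as t
    on h.id = t.id
    order by h.id;
quit;
```

For the three sample rows this should give most_obs/count pairs of 1/3, 5/3 and 9/2, matching the corrected table above.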

/Linus
Data never sleeps
PROC Star
Posts: 1,760

Re: Count most Repeated observation

I agree with Linus, transposing would help and be valid with any number of values you could have. I reckon you need 3 steps:

1) transpose to:
ID VAL
1 1
1 1
1 1
1 2
1 3
2 5
..

then 2) count observations by ID and VAL (PROC SQL or PROC MEANS, output sorted by frequency),
and 3) keep the maximum count for each ID (data step using the LAST. variable).
N/A
Posts: 0

Re: Count most Repeated observation

Posted in reply to deleted_user
Following what Linus and Chris suggested, here is a solution to your problem:

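/* Number the rows so each original observation has an id */
data test;
    set test;
    seq + 1;
run;

/* One row per (seq, value); the values end up in COL1 */
proc transpose data=test out=test1;
    by seq;
    var a b c d e;
run;

/* Count how often each value occurs within each original row */
proc freq data=test1 noprint;
    by seq;
    tables col1 / out=test2;
run;

/* Put the most frequent value first within each seq */
proc sort data=test2;
    by seq descending count;
run;

/* Keep only the most frequent value per seq */
data test3;
    set test2;
    by seq;
    if first.seq;
run;

/* Attach most_obs and count to the original columns */
data test4(drop=seq rename=(col1=most_obs));
    merge test test3(keep=seq col1 count);
    by seq;
run;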
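
As an alternative to the transpose pipeline above, the data-step route mentioned earlier in the thread can be sketched with an array, which avoids the id column and the intermediate data sets. This is an assumption-laden sketch rather than a tested solution: the data set names have and want and the helper variables i, j and n are made up for illustration, all five variables are assumed numeric, and on a tie the value that appears first in variable order wins.

```sas
/* Sample data from the question */
data have;
    input a b c d e;
    datalines;
1 1 1 2 3
5 5 6 5 2
9 9 2 0 8
;
run;

data want;
    set have;
    array v{*} a b c d e;

    count = 0;                          /* best frequency seen so far on this row */
    do i = 1 to dim(v);
        n = 0;                          /* occurrences of v{i} on this row */
        do j = 1 to dim(v);
            if v{j} = v{i} then n = n + 1;
        end;
        if n > count then do;           /* strictly greater: first value wins ties */
            count = n;
            most_obs = v{i};
        end;
    end;

    drop i j n;
run;
```

With only five variables per row, the nested loop does 25 comparisons per observation, so 20,000 observations should not be a concern.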

~ Sukanya E
N/A
Posts: 0

Re: Count most Repeated observation

Posted in reply to deleted_user
Thanks, it worked.