Solved
Contributor
Posts: 27

# The frequencies of the array values

Hi,

I have written a problem before.(compare multiple columns and concat )

Thank you very much rw9 for finding a solution .

```data have;
informat id v1-v12 \$8.;
input id v1-v12;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;
run;

data want;
set have;
length concat_compare \$2000;
array vals{*} v:;
do i=1 to dim(vals)-1;
concat_compare=catx(';',concat_compare,ifc(vals{i}=vals{i+1},"0","1"));
end;
run;```

But I could not do the other things I wanted to do.

Is it possible to find the frequency of the values in array?

Is it possible to find the distinct of values in array?

The output I want to create is down.

Best regards

Accepted Solutions
Solution
‎05-04-2017 02:27 AM
Super User
Posts: 9,423

## Re: The frequencies of the array values

[ Edited ]

What you will likely find is that changing your data structure will be the simplest method.  This shows how to get unique values:

```data have;
informat id v1-v12 \$8.;
input id v1-v12;
/* Add a row identifier */
id=_n_;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;
run;
proc transpose data=have out=inter;
by id;
var v:;
run;
proc sort data=inter out=unique_values;
by id col1;
run;
data unique_values;
set unique_values;
length res \$200;
retain res;
by id;
res=ifc(first.id,col1,catx(',',res,col1));
if last.id then output;
run;```

With the inter dataset you can select counts of each and so on, then merge that data back to the one you already have.  In fact, if you look at it once you have done that you will likely find all the information can be done in one pass of a datastep, but start by doing each bit separate.

Oh, and mark a post as correct if you have your answer (from the other post!).

All Replies
Solution
‎05-04-2017 02:27 AM
Super User
Posts: 9,423

## Re: The frequencies of the array values

[ Edited ]

What you will likely find is that changing your data structure will be the simplest method.  This shows how to get unique values:

```data have;
informat id v1-v12 \$8.;
input id v1-v12;
/* Add a row identifier */
id=_n_;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;
run;
proc transpose data=have out=inter;
by id;
var v:;
run;
proc sort data=inter out=unique_values;
by id col1;
run;
data unique_values;
set unique_values;
length res \$200;
retain res;
by id;
res=ifc(first.id,col1,catx(',',res,col1));
if last.id then output;
run;```

With the inter dataset you can select counts of each and so on, then merge that data back to the one you already have.  In fact, if you look at it once you have done that you will likely find all the information can be done in one pass of a datastep, but start by doing each bit separate.

Oh, and mark a post as correct if you have your answer (from the other post!).

Contributor
Posts: 27

## Re: The frequencies of the array values

Result when I run the code you wrote for unique_values

But the necessary values are as follows

best regards.

Super User
Posts: 9,423

## Re: The frequencies of the array values

Yep, sorry, missed nodupkey:

```proc sort data=inter out=unique_values nodupkey;
by id col1;
run;```

Contributor
Posts: 27

## Re: The frequencies of the array values

Do we have a chance to find repeat records and repeat times? like as

best regards..

Super User
Posts: 13,321

## Re: The frequencies of the array values

Please display results in a form that is actually readable. The fonts in the image don't render well (at least on my monitors)

Contributor
Posts: 27

## Re: The frequencies of the array values

Hi,

I'm copying it down.I also add it as excel.

 id v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 concat_compare same_variables Distinct_Values Cnt_of_Distinct_Values dist_1 cnt_of_1 dist_2 cnt_of_2 dist_3 cnt_of_3 dist_4 cnt_of_4 dist_5 cnt_of_5 … 1 1 1 1 3 a b 4 2 2 0;0;1;1;1;1;1;1;0;1;0 v1_v2;v2_v3;v9_v10;v11_v12 1;3;a;b;4;2;missing 6 1 3 2 2 missing 2 3;a;b;4 1 … 2 4 5 h k L L A a 4 6 7 c 1;1;1;1;0;1;1;1;1;1;1 v5_v6 4;5;h;k;L;A;a;6;7;c 10 4 2 L 2 5;h;k;A;a;6;7;c 1 … 3 3 4 5 6 7 3 6 8 8 a w d 1;1;1;1;1;1;1;0;1;1;1 v8_v9 3;4;5;6;7;8;a;w;d 9 3 2 6 2 8 2 4;5;7;a;w;d 1 … 4 6 7 2 1 1 2 3 4 5 6 7 s 1;1;1;0;1;1;1;1;1;1;1 v4_v5 6;7;2;1;3;4;5;s 8 1 2 2 2 6 2 7 2 3;4;5;s 1 …
Contributor
Posts: 27

## Re: The frequencies of the array values

[ Edited ]

I could not find any solution to my problem.I found a code that was written in this forum before.But it only calculates one the maximum value.It does not calculate the frequencies of the others. Does not calculate if there is more than one maximum.

I copy down to give you an idea.

data have;
input id 1 x1 \$ 4-5 x2 \$ 7-8 x3 \$ 10-11 x4 \$ 13-14 x5 \$ 16-17;
cards;
1 07 04 07 07 07
2 04 05 04 04 05
3 02 02 03
4 01 01 02 02
5 01
;
run;

data want;
set have;
length MostFreq \$2;
array x x:;
array _t[10] _temporary_;
call missing(of _t[*]);
do _n_=1 to dim(x);
if x[_n_] ne ' ' then _t[input(x[_n_],2.)]+1;
end;
Count=max(of _t[*]);
MostFreq=put(whichn(Count, of _t[*]),z2.);
run;

How can I get the output of my example as below..

```data have;
informat id v1-v12 \$8.;
input id v1-v12;
/* Add a row identifier */
id=_n_;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;
run;```

 id v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 concat_compare same_variables Distinct_Values Cnt_of_Distinct_Values dist_1 cnt_of_1 dist_2 cnt_of_2 dist_3 cnt_of_3 dist_4 cnt_of_4 dist_5 cnt_of_5 … 1 1 1 1 3 a b 4 2 2 0;0;1;1;1;1;1;1;0;1;0 v1_v2;v2_v3;v9_v10;v11_v12 1;3;a;b;4;2;missing 7 1 3 2 2 missing 2 3;a;b;4 1 … 2 4 5 h k L L A a 4 6 7 c 1;1;1;1;0;1;1;1;1;1;1 v5_v6 4;5;h;k;L;A;a;6;7;c 10 4 2 L 2 5;h;k;A;a;6;7;c 1 … 3 3 4 5 6 7 3 6 8 8 a w d 1;1;1;1;1;1;1;0;1;1;1 v8_v9 3;4;5;6;7;8;a;w;d 9 3 2 6 2 8 2 4;5;7;a;w;d 1 … 4 6 7 2 1 1 2 3 4 5 6 7 s 1;1;1;0;1;1;1;1;1;1;1 v4_v5 6;7;2;1;3;4;5;s 8 1 2 2 2 6 2 7 2 3;4;5;s 1 …

best regards

☑ This topic is solved.