Hi,
I have written a problem before.(compare multiple columns and concat )
Thank you very much rw9 for finding a solution .
data have; informat id v1-v12 $8.; input id v1-v12; cards; 1 1 1 1 3 a b . 4 x . 2 2 2 4 5 h k L L A a 4 6 7 c 3 3 4 5 6 7 3 6 8 8 a w d 4 6 7 2 1 1 2 3 4 5 6 7 s ; run; data want; set have; length concat_compare $2000; array vals{*} v:; do i=1 to dim(vals)-1; concat_compare=catx(';',concat_compare,ifc(vals{i}=vals{i+1},"0","1")); end; run;
But I could not do the other things I wanted to do.
Is it possible to find the frequency of the values in array?
Is it possible to find the distinct of values in array?
I would ask your assistance in this matter.
The output I want to create is down.
Best regards
What you will likely find is that changing your data structure will be the simplest method. This shows how to get unique values:
data have; informat id v1-v12 $8.; input id v1-v12; /* Add a row identifier */ id=_n_; cards; 1 1 1 1 3 a b . 4 x . 2 2 2 4 5 h k L L A a 4 6 7 c 3 3 4 5 6 7 3 6 8 8 a w d 4 6 7 2 1 1 2 3 4 5 6 7 s ; run; proc transpose data=have out=inter; by id; var v:; run; proc sort data=inter out=unique_values; by id col1; run; data unique_values; set unique_values; length res $200; retain res; by id; res=ifc(first.id,col1,catx(',',res,col1)); if last.id then output; run;
With the inter dataset you can select counts of each and so on, then merge that data back to the one you already have. In fact, if you look at it once you have done that you will likely find all the information can be done in one pass of a datastep, but start by doing each bit separate.
Oh, and mark a post as correct if you have your answer (from the other post!).
What you will likely find is that changing your data structure will be the simplest method. This shows how to get unique values:
data have; informat id v1-v12 $8.; input id v1-v12; /* Add a row identifier */ id=_n_; cards; 1 1 1 1 3 a b . 4 x . 2 2 2 4 5 h k L L A a 4 6 7 c 3 3 4 5 6 7 3 6 8 8 a w d 4 6 7 2 1 1 2 3 4 5 6 7 s ; run; proc transpose data=have out=inter; by id; var v:; run; proc sort data=inter out=unique_values; by id col1; run; data unique_values; set unique_values; length res $200; retain res; by id; res=ifc(first.id,col1,catx(',',res,col1)); if last.id then output; run;
With the inter dataset you can select counts of each and so on, then merge that data back to the one you already have. In fact, if you look at it once you have done that you will likely find all the information can be done in one pass of a datastep, but start by doing each bit separate.
Oh, and mark a post as correct if you have your answer (from the other post!).
thanks for your reply.
Result when I run the code you wrote for unique_values
But the necessary values are as follows
best regards.
Yep, sorry, missed nodupkey:
proc sort data=inter out=unique_values nodupkey; by id col1; run;
Do we have a chance to find repeat records and repeat times? like as
best regards..
Please display results in a form that is actually readable. The fonts in the image don't render well (at least on my monitors)
Hi,
I'm copying it down.I also add it as excel.
id | v1 | v2 | v3 | v4 | v5 | v6 | v7 | v8 | v9 | v10 | v11 | v12 | concat_compare | same_variables | Distinct_Values | Cnt_of_Distinct_Values | dist_1 | cnt_of_1 | dist_2 | cnt_of_2 | dist_3 | cnt_of_3 | dist_4 | cnt_of_4 | dist_5 | cnt_of_5 | … |
1 | 1 | 1 | 1 | 3 | a | b | 4 | 2 | 2 | 0;0;1;1;1;1;1;1;0;1;0 | v1_v2;v2_v3;v9_v10;v11_v12 | 1;3;a;b;4;2;missing | 6 | 1 | 3 | 2 | 2 | missing | 2 | 3;a;b;4 | 1 | … | |||||
2 | 4 | 5 | h | k | L | L | A | a | 4 | 6 | 7 | c | 1;1;1;1;0;1;1;1;1;1;1 | v5_v6 | 4;5;h;k;L;A;a;6;7;c | 10 | 4 | 2 | L | 2 | 5;h;k;A;a;6;7;c | 1 | … | ||||
3 | 3 | 4 | 5 | 6 | 7 | 3 | 6 | 8 | 8 | a | w | d | 1;1;1;1;1;1;1;0;1;1;1 | v8_v9 | 3;4;5;6;7;8;a;w;d | 9 | 3 | 2 | 6 | 2 | 8 | 2 | 4;5;7;a;w;d | 1 | … | ||
4 | 6 | 7 | 2 | 1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | s | 1;1;1;0;1;1;1;1;1;1;1 | v4_v5 | 6;7;2;1;3;4;5;s | 8 | 1 | 2 | 2 | 2 | 6 | 2 | 7 | 2 | 3;4;5;s | 1 | … |
Hi,
I could not find any solution to my problem.I found a code that was written in this forum before.But it only calculates one the maximum value.It does not calculate the frequencies of the others. Does not calculate if there is more than one maximum.
I copy down to give you an idea.
data have;
input id 1 x1 $ 4-5 x2 $ 7-8 x3 $ 10-11 x4 $ 13-14 x5 $ 16-17;
cards;
1 07 04 07 07 07
2 04 05 04 04 05
3 02 02 03
4 01 01 02 02
5 01
;
run;
data want;
set have;
length MostFreq $2;
array x x:;
array _t[10] _temporary_;
call missing(of _t[*]);
do _n_=1 to dim(x);
if x[_n_] ne ' ' then _t[input(x[_n_],2.)]+1;
end;
Count=max(of _t[*]);
MostFreq=put(whichn(Count, of _t[*]),z2.);
run;
How can I get the output of my example as below..
I ask for your help
data have; informat id v1-v12 $8.; input id v1-v12; /* Add a row identifier */ id=_n_; cards; 1 1 1 1 3 a b . 4 x . 2 2 2 4 5 h k L L A a 4 6 7 c 3 3 4 5 6 7 3 6 8 8 a w d 4 6 7 2 1 1 2 3 4 5 6 7 s ; run;
id | v1 | v2 | v3 | v4 | v5 | v6 | v7 | v8 | v9 | v10 | v11 | v12 | concat_compare | same_variables | Distinct_Values | Cnt_of_Distinct_Values | dist_1 | cnt_of_1 | dist_2 | cnt_of_2 | dist_3 | cnt_of_3 | dist_4 | cnt_of_4 | dist_5 | cnt_of_5 | … |
1 | 1 | 1 | 1 | 3 | a | b | 4 | 2 | 2 | 0;0;1;1;1;1;1;1;0;1;0 | v1_v2;v2_v3;v9_v10;v11_v12 | 1;3;a;b;4;2;missing | 7 | 1 | 3 | 2 | 2 | missing | 2 | 3;a;b;4 | 1 | … | |||||
2 | 4 | 5 | h | k | L | L | A | a | 4 | 6 | 7 | c | 1;1;1;1;0;1;1;1;1;1;1 | v5_v6 | 4;5;h;k;L;A;a;6;7;c | 10 | 4 | 2 | L | 2 | 5;h;k;A;a;6;7;c | 1 | … | ||||
3 | 3 | 4 | 5 | 6 | 7 | 3 | 6 | 8 | 8 | a | w | d | 1;1;1;1;1;1;1;0;1;1;1 | v8_v9 | 3;4;5;6;7;8;a;w;d | 9 | 3 | 2 | 6 | 2 | 8 | 2 | 4;5;7;a;w;d | 1 | … | ||
4 | 6 | 7 | 2 | 1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | s | 1;1;1;0;1;1;1;1;1;1;1 | v4_v5 | 6;7;2;1;3;4;5;s | 8 | 1 | 2 | 2 | 2 | 6 | 2 | 7 | 2 | 3;4;5;s | 1 | … |
best regards
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.