BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
erdem_ustun
Obsidian | Level 7

Hi,

 

I have written a problem before.(compare multiple columns and concat )

 

Thank you very much rw9 for finding a solution .

data have;
informat id v1-v12 $8.;
input id v1-v12;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;
run;

data want;
  set have;
  length concat_compare $2000;
  array vals{*} v:;
  do i=1 to dim(vals)-1;
    concat_compare=catx(';',concat_compare,ifc(vals{i}=vals{i+1},"0","1")); 
  end;
run;

rw9 

But I could not do the other things I wanted to do.

 

Is it possible to find the frequency of the values in array?

Is it possible to find the distinct of values in array?

I would ask your assistance in this matter.

The output I want to create is down.

Untitled.jpg

 

Best regards

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

What you will likely find is that changing your data structure will be the simplest method.  This shows how to get unique values:

data have;
  informat id v1-v12 $8.;
  input id v1-v12;
  /* Add a row identifier */
  id=_n_;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;
run;
proc transpose data=have out=inter;
  by id;
  var v:;
run;
proc sort data=inter out=unique_values;
  by id col1;
run;
data unique_values;
  set unique_values;
  length res $200;
  retain res;
  by id;
  res=ifc(first.id,col1,catx(',',res,col1));
  if last.id then output;
run;

With the inter dataset you can select counts of each and so on, then merge that data back to the one you already have.  In fact, if you look at it once you have done that you will likely find all the information can be done in one pass of a datastep, but start by doing each bit separate. 

 

Oh, and mark a post as correct if you have your answer (from the other post!).

View solution in original post

7 REPLIES 7
RW9
Diamond | Level 26 RW9
Diamond | Level 26

What you will likely find is that changing your data structure will be the simplest method.  This shows how to get unique values:

data have;
  informat id v1-v12 $8.;
  input id v1-v12;
  /* Add a row identifier */
  id=_n_;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;
run;
proc transpose data=have out=inter;
  by id;
  var v:;
run;
proc sort data=inter out=unique_values;
  by id col1;
run;
data unique_values;
  set unique_values;
  length res $200;
  retain res;
  by id;
  res=ifc(first.id,col1,catx(',',res,col1));
  if last.id then output;
run;

With the inter dataset you can select counts of each and so on, then merge that data back to the one you already have.  In fact, if you look at it once you have done that you will likely find all the information can be done in one pass of a datastep, but start by doing each bit separate. 

 

Oh, and mark a post as correct if you have your answer (from the other post!).

erdem_ustun
Obsidian | Level 7

thanks for your reply.

 

Result when I run the code you wrote for unique_values

new_code.jpg

 

But the necessary values are as follows

required.jpg

 

best regards.

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Yep, sorry, missed nodupkey:

proc sort data=inter out=unique_values nodupkey;
  by id col1;
run;

 

erdem_ustun
Obsidian | Level 7

Do we have a chance to find repeat records and repeat times? like as

freq_dup.jpg

best regards..

ballardw
Super User

Please display results in a form that is actually readable. The fonts in the image don't render well (at least on my monitors)

erdem_ustun
Obsidian | Level 7

Hi,

I'm copying it down.I also add it as excel.

 

id v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 concat_compare same_variables Distinct_Values Cnt_of_Distinct_Values dist_1 cnt_of_1 dist_2 cnt_of_2 dist_3 cnt_of_3 dist_4 cnt_of_4 dist_5 cnt_of_5
1 1 1 1 3 a b   4     2 2 0;0;1;1;1;1;1;1;0;1;0 v1_v2;v2_v3;v9_v10;v11_v12 1;3;a;b;4;2;missing 6 1 3 2 2 missing 2 3;a;b;4 1    
2 4 5 h k L L A a 4 6 7 c 1;1;1;1;0;1;1;1;1;1;1 v5_v6 4;5;h;k;L;A;a;6;7;c 10 4 2 L 2 5;h;k;A;a;6;7;c 1        
3 3 4 5 6 7 3 6 8 8 a w d 1;1;1;1;1;1;1;0;1;1;1 v8_v9 3;4;5;6;7;8;a;w;d 9 3 2 6 2 8 2 4;5;7;a;w;d 1    
4 6 7 2 1 1 2 3 4 5 6 7 s 1;1;1;0;1;1;1;1;1;1;1 v4_v5 6;7;2;1;3;4;5;s 8 1 2 2 2 6 2 7 2 3;4;5;s 1
erdem_ustun
Obsidian | Level 7

 

I could not find any solution to my problem.I found a code that was written in this forum before.But it only calculates one the maximum value.It does not calculate the frequencies of the others. Does not calculate if there is more than one maximum.

I copy down to give you an idea.

 

data have;
input id 1 x1 $ 4-5 x2 $ 7-8 x3 $ 10-11 x4 $ 13-14 x5 $ 16-17;
cards;
1 07 04 07 07 07
2 04 05 04 04 05
3 02 02 03
4 01 01 02 02
5 01
;
run;


data want;
set have;
length MostFreq $2;
array x x:;
array _t[10] _temporary_;
call missing(of _t[*]);
do _n_=1 to dim(x);
if x[_n_] ne ' ' then _t[input(x[_n_],2.)]+1;
end;
Count=max(of _t[*]);
MostFreq=put(whichn(Count, of _t[*]),z2.);
run;

 

How can I get the output of my example as below..

I ask for your help

 

data have;
  informat id v1-v12 $8.;
  input id v1-v12;
  /* Add a row identifier */
  id=_n_;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;
run;

 

id v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 concat_compare same_variables Distinct_Values Cnt_of_Distinct_Values dist_1 cnt_of_1 dist_2 cnt_of_2 dist_3 cnt_of_3 dist_4 cnt_of_4 dist_5 cnt_of_5
1 1 1 1 3 a b   4     2 2 0;0;1;1;1;1;1;1;0;1;0 v1_v2;v2_v3;v9_v10;v11_v12 1;3;a;b;4;2;missing 7 1 3 2 2 missing 2 3;a;b;4 1    
2 4 5 h k L L A a 4 6 7 c 1;1;1;1;0;1;1;1;1;1;1 v5_v6 4;5;h;k;L;A;a;6;7;c 10 4 2 L 2 5;h;k;A;a;6;7;c 1        
3 3 4 5 6 7 3 6 8 8 a w d 1;1;1;1;1;1;1;0;1;1;1 v8_v9 3;4;5;6;7;8;a;w;d 9 3 2 6 2 8 2 4;5;7;a;w;d 1    
4 6 7 2 1 1 2 3 4 5 6 7 s 1;1;1;0;1;1;1;1;1;1;1 v4_v5 6;7;2;1;3;4;5;s 8 1 2 2 2 6 2 7 2 3;4;5;s 1

best regards

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1259 views
  • 0 likes
  • 3 in conversation