BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
erdem_ustun
Obsidian | Level 7

Hi,

 

I have a sample dataset with the following columns.
To compare all the columns
I want to do comparison with the data step.
Values can be empty or string or numeric

 

data have;
informat id v1-v12 $8.;
input id v1-v12;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;

 

There are data for 12 months.I want to compare them in order.

 

cats(ifc(v1=v2,'0','1')
cats(ifc(v2=v3,'0','1')
cats(ifc(v3=v4,'0','1')
cats(ifc(v4=v5,'0','1')
cats(ifc(v5=v6,'0','1')
cats(ifc(v6=v7,'0','1')
cats(ifc(v7=v8,'0','1')
cats(ifc(v8=v9,'0','1')
cats(ifc(v9=v10,'0','1')
cats(ifc(v10=v11,'0','1')
cats(ifc(v11=v12,'0','1')

 

At the end, a concatenated string will be formed.

 

For id = 1 concat_compare='0;0;1;1;1;1;1;1;1;1;0' and

 

same_count=7 /* I have 3 item for 1, 2 item for 2  and 2 item  for missing ,sum(3,2,2) */
not_same_count=5 

 

same_values=1;2; .; /*distinct values, Although it is not the same but more than one / If empty record exists it must be listed  */

 

not_same_values=3;a;b;4;x /*distinct values only have one */

 

I found a code to find the number of the same on the forum.

 

data want (drop=i j);
set have;
array vars $8. v1-v12;
same=0;
do i=1 to dim(vars)-1;
do j=i+1 to dim(vars);
same=sum (same, vars(i) eq vars(j));
end;
end;
run;

But I could not do the others.

I need the following fields on a record-based. 

concat_compare   same_count    not_same_count   same_values    not_same_values

 

Is it possible to do this for each record/id?

 

Best regards

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, your concat problem is simple and well explained, so code below.  For your other four variables however, I didn't follow at all - please clarify.

data have;
informat id v1-v12 $8.;
input id v1-v12;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;
run;

data want;
  set have;
  length concat_compare $2000;
  array vals{*} v:;
  do i=1 to dim(vals)-1;
    concat_compare=catx(';',concat_compare,ifc(vals{i}=vals{i+1},"1","0")); 
  end;
run;

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, your concat problem is simple and well explained, so code below.  For your other four variables however, I didn't follow at all - please clarify.

data have;
informat id v1-v12 $8.;
input id v1-v12;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;
run;

data want;
  set have;
  length concat_compare $2000;
  array vals{*} v:;
  do i=1 to dim(vals)-1;
    concat_compare=catx(';',concat_compare,ifc(vals{i}=vals{i+1},"1","0")); 
  end;
run;
erdem_ustun
Obsidian | Level 7

Hi 

 

thanks for your reply.I will try to explain the other four variables.

 

 mean of  same_values :It is a list of multiple recurring values when we accept each record as an array. so for id=1 ,

The same_values will be "1;2; . "

same_values="1;2; . " (missing is include)

 

mean of not_same_values:A list of values that do not repeat more than once when we accept each record as an array.

so for id=1 , The not_same_values will be "3;a;b;4;x"

 

 not_same_values="3;a;b;4;x"

 

same_count:The number of times that multiple repeating values are repeated. It is the sum of the number of repeats

for example ;

for id = 1 , Repeated 1 to 3 times,Repeated missing value(.) to 2 times  Repeated 2 to 2 times and Their sum is 7 (3+2+2)

same_count=7

 

not_same_count:The number of variables that do not repeat more than once

for id = 1,  Repeated 3 to 1 times,

                 Repeated a to 1 times,

                 Repeated b to 1 times,

                Repeated 4 to 1 times,

                Repeated x to 1 times 

and Their sum is 5 (1+1+1+1+1) 

not_same_count=5

 

I hope,It is more descriptive.

Best regards

ballardw
Super User

for Id 1 V3=1 and v4=3. So wouldn't the result be

0;0;1;1;1;1;1;1;1;1;0

and not your example

0;0;0;1;1;1;1;1;1;1;0

erdem_ustun
Obsidian | Level 7

Yes, Yes, I made a mistake.Just like you wrote it.

 

0;0;1;1;1;1;1;1;1;1;0

thanks

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2283 views
  • 1 like
  • 3 in conversation