BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
erdem_ustun
Obsidian | Level 7

Hi,

 

I have a sample dataset with the following columns.
To compare all the columns
I want to do comparison with the data step.
Values can be empty or string or numeric

 

data have;
informat id v1-v12 $8.;
input id v1-v12;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;

 

There are data for 12 months.I want to compare them in order.

 

cats(ifc(v1=v2,'0','1')
cats(ifc(v2=v3,'0','1')
cats(ifc(v3=v4,'0','1')
cats(ifc(v4=v5,'0','1')
cats(ifc(v5=v6,'0','1')
cats(ifc(v6=v7,'0','1')
cats(ifc(v7=v8,'0','1')
cats(ifc(v8=v9,'0','1')
cats(ifc(v9=v10,'0','1')
cats(ifc(v10=v11,'0','1')
cats(ifc(v11=v12,'0','1')

 

At the end, a concatenated string will be formed.

 

For id = 1 concat_compare='0;0;1;1;1;1;1;1;1;1;0' and

 

same_count=7 /* I have 3 item for 1, 2 item for 2  and 2 item  for missing ,sum(3,2,2) */
not_same_count=5 

 

same_values=1;2; .; /*distinct values, Although it is not the same but more than one / If empty record exists it must be listed  */

 

not_same_values=3;a;b;4;x /*distinct values only have one */

 

I found a code to find the number of the same on the forum.

 

data want (drop=i j);
set have;
array vars $8. v1-v12;
same=0;
do i=1 to dim(vars)-1;
do j=i+1 to dim(vars);
same=sum (same, vars(i) eq vars(j));
end;
end;
run;

But I could not do the others.

I need the following fields on a record-based. 

concat_compare   same_count    not_same_count   same_values    not_same_values

 

Is it possible to do this for each record/id?

 

Best regards

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, your concat problem is simple and well explained, so code below.  For your other four variables however, I didn't follow at all - please clarify.

data have;
informat id v1-v12 $8.;
input id v1-v12;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;
run;

data want;
  set have;
  length concat_compare $2000;
  array vals{*} v:;
  do i=1 to dim(vals)-1;
    concat_compare=catx(';',concat_compare,ifc(vals{i}=vals{i+1},"1","0")); 
  end;
run;

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, your concat problem is simple and well explained, so code below.  For your other four variables however, I didn't follow at all - please clarify.

data have;
informat id v1-v12 $8.;
input id v1-v12;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;
run;

data want;
  set have;
  length concat_compare $2000;
  array vals{*} v:;
  do i=1 to dim(vals)-1;
    concat_compare=catx(';',concat_compare,ifc(vals{i}=vals{i+1},"1","0")); 
  end;
run;
erdem_ustun
Obsidian | Level 7

Hi 

 

thanks for your reply.I will try to explain the other four variables.

 

 mean of  same_values :It is a list of multiple recurring values when we accept each record as an array. so for id=1 ,

The same_values will be "1;2; . "

same_values="1;2; . " (missing is include)

 

mean of not_same_values:A list of values that do not repeat more than once when we accept each record as an array.

so for id=1 , The not_same_values will be "3;a;b;4;x"

 

 not_same_values="3;a;b;4;x"

 

same_count:The number of times that multiple repeating values are repeated. It is the sum of the number of repeats

for example ;

for id = 1 , Repeated 1 to 3 times,Repeated missing value(.) to 2 times  Repeated 2 to 2 times and Their sum is 7 (3+2+2)

same_count=7

 

not_same_count:The number of variables that do not repeat more than once

for id = 1,  Repeated 3 to 1 times,

                 Repeated a to 1 times,

                 Repeated b to 1 times,

                Repeated 4 to 1 times,

                Repeated x to 1 times 

and Their sum is 5 (1+1+1+1+1) 

not_same_count=5

 

I hope,It is more descriptive.

Best regards

ballardw
Super User

for Id 1 V3=1 and v4=3. So wouldn't the result be

0;0;1;1;1;1;1;1;1;1;0

and not your example

0;0;0;1;1;1;1;1;1;1;0

erdem_ustun
Obsidian | Level 7

Yes, Yes, I made a mistake.Just like you wrote it.

 

0;0;1;1;1;1;1;1;1;1;0

thanks

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1872 views
  • 1 like
  • 3 in conversation