Hi,
I have a sample dataset with the following columns.
To compare all the columns
I want to do comparison with the data step.
Values can be empty or string or numeric
data have;
informat id v1-v12 $8.;
input id v1-v12;
cards;
1 1 1 1 3 a b . 4 x . 2 2
2 4 5 h k L L A a 4 6 7 c
3 3 4 5 6 7 3 6 8 8 a w d
4 6 7 2 1 1 2 3 4 5 6 7 s
;
There are data for 12 months.I want to compare them in order.
cats(ifc(v1=v2,'0','1')
cats(ifc(v2=v3,'0','1')
cats(ifc(v3=v4,'0','1')
cats(ifc(v4=v5,'0','1')
cats(ifc(v5=v6,'0','1')
cats(ifc(v6=v7,'0','1')
cats(ifc(v7=v8,'0','1')
cats(ifc(v8=v9,'0','1')
cats(ifc(v9=v10,'0','1')
cats(ifc(v10=v11,'0','1')
cats(ifc(v11=v12,'0','1')
At the end, a concatenated string will be formed.
For id = 1 concat_compare='0;0;1;1;1;1;1;1;1;1;0' and
same_count=7 /* I have 3 item for 1, 2 item for 2 and 2 item for missing ,sum(3,2,2) */
not_same_count=5
same_values=1;2; .; /*distinct values, Although it is not the same but more than one / If empty record exists it must be listed */
not_same_values=3;a;b;4;x /*distinct values only have one */
I found a code to find the number of the same on the forum.
data want (drop=i j);
set have;
array vars $8. v1-v12;
same=0;
do i=1 to dim(vars)-1;
do j=i+1 to dim(vars);
same=sum (same, vars(i) eq vars(j));
end;
end;
run;
But I could not do the others.
I need the following fields on a record-based.
concat_compare same_count not_same_count same_values not_same_values
Is it possible to do this for each record/id?
Best regards
Well, your concat problem is simple and well explained, so code below. For your other four variables however, I didn't follow at all - please clarify.
data have; informat id v1-v12 $8.; input id v1-v12; cards; 1 1 1 1 3 a b . 4 x . 2 2 2 4 5 h k L L A a 4 6 7 c 3 3 4 5 6 7 3 6 8 8 a w d 4 6 7 2 1 1 2 3 4 5 6 7 s ; run; data want; set have; length concat_compare $2000; array vals{*} v:; do i=1 to dim(vals)-1; concat_compare=catx(';',concat_compare,ifc(vals{i}=vals{i+1},"1","0")); end; run;
Well, your concat problem is simple and well explained, so code below. For your other four variables however, I didn't follow at all - please clarify.
data have; informat id v1-v12 $8.; input id v1-v12; cards; 1 1 1 1 3 a b . 4 x . 2 2 2 4 5 h k L L A a 4 6 7 c 3 3 4 5 6 7 3 6 8 8 a w d 4 6 7 2 1 1 2 3 4 5 6 7 s ; run; data want; set have; length concat_compare $2000; array vals{*} v:; do i=1 to dim(vals)-1; concat_compare=catx(';',concat_compare,ifc(vals{i}=vals{i+1},"1","0")); end; run;
Hi RW9;
I made the mistake of including id in my first post.
thanks for your reply.I will try to explain the other four variables.
mean of same_values :It is a list of multiple recurring values when we accept each record as an array. so for id=1 ,
The same_values will be "1;2; . "
same_values="1;2; . " (missing is include)
mean of not_same_values:A list of values that do not repeat more than once when we accept each record as an array.
so for id=1 , The not_same_values will be "3;a;b;4;x"
not_same_values="3;a;b;4;x"
same_count:The number of times that multiple repeating values are repeated. It is the sum of the number of repeats
for example ;
for id = 1 , Repeated 1 to 3 times,Repeated missing value(.) to 2 times Repeated 2 to 2 times and Their sum is 7 (3+2+2)
same_count=7
not_same_count:The number of variables that do not repeat more than once
for id = 1, Repeated 3 to 1 times,
Repeated a to 1 times,
Repeated b to 1 times,
Repeated 4 to 1 times,
Repeated x to 1 times
and Their sum is 5 (1+1+1+1+1)
not_same_count=5
I hope,It is more descriptive.
Best regards
for Id 1 V3=1 and v4=3. So wouldn't the result be
0;0;1;1;1;1;1;1;1;1;0
and not your example
0;0;0;1;1;1;1;1;1;1;0
Yes, Yes, I made a mistake.Just like you wrote it.
0;0;1;1;1;1;1;1;1;1;0
thanks
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.