Hello,
I have the following dataset.
I need to find out how many times does "a"(or b,c,d,e,f) appear jointly in var_1, var_2 and var_3.
Please advise.
data temp;
input _ID var_1 $ var_2 $ var_3 $;
datalines;
1 a b c
2 d e f
3 b d e
4 c d b
5 e f a
6 a c f
7 b d e
8 c a d
9 d b c
10 e f b
11 f e a
12 a b c
13 d e f
14 b d e
15 c d b
16 e f a
17 a c f
18 b d e
19 c a d
20 d b c
21 e f b
22 f e a
23 a d b
24 b e c
25 c f d
;
run;
It's possible you need to explain more about what is in the data and what you want. Looking at the post, I'm going to assume that each observation never repeats the same letter. In that case:
proc freq data=have noprint;
tables var_1 / out=var1_counts (rename=(var_1=value) drop=percent);
tables var_2 / out=var2_counts(rename=(var_2=value) drop=percent);
tables var_3 / out=var3_counts (rename=(var_3=value) drop=percent);
run;
data want;
merge var1_counts (rename=(count=count1))
var2_counts (rename=(count=count2))
var3_counts (rename=(count=count3));
by value;
total_count = sum(count1, count2, count3);
run;
It's untested code at this point, so you'll need to test it to see how close it comes.
It's possible you need to explain more about what is in the data and what you want. Looking at the post, I'm going to assume that each observation never repeats the same letter. In that case:
proc freq data=have noprint;
tables var_1 / out=var1_counts (rename=(var_1=value) drop=percent);
tables var_2 / out=var2_counts(rename=(var_2=value) drop=percent);
tables var_3 / out=var3_counts (rename=(var_3=value) drop=percent);
run;
data want;
merge var1_counts (rename=(count=count1))
var2_counts (rename=(count=count2))
var3_counts (rename=(count=count3));
by value;
total_count = sum(count1, count2, count3);
run;
It's untested code at this point, so you'll need to test it to see how close it comes.
Thank You @Astounding.
I totally see your point here. I am wondering if there is any alternate way to doing these frequencies together rather than treating them separately and then joining them.
The output is what i want though. So thank you again.
@david27 wrote:
Hello,
I have the following dataset.
I need to find out how many times does "a"(or b,c,d,e,f) appear jointly in var_1, var_2 and var_3.
How many times does an "a" appear jointly?
Meaning if var_1='a' and var_2='a' and var_3='a' then this counts as 'a' appears one time?
data temp;
input _ID var_1 $ var_2 $ var_3 $;
jointly=cats(var_1,var_2,var_3);
count_a=find(jointly,'a')>0;
count_b=find(jointly,'b')>0;
count_c=find(jointly,'c')>0;
count_d=find(jointly,'d')>0;
count_e=find(jointly,'e')>0;
count_f=find(jointly,'f')>0;
datalines;
1 a b c
2 d e f
3 b d e
4 c d b
5 e f a
6 a c f
7 b d e
8 c a d
9 d b c
10 e f b
11 f e a
12 a b c
13 d e f
14 b d e
15 c d b
16 e f a
17 a c f
18 b d e
19 c a d
20 d b c
21 e f b
22 f e a
23 a d b
24 b e c
25 c f d
;
run;
proc freq data=temp;
tables count_a--count_f;
run;
Can you clearly define what "jointly" means in this context?
Also if your actual values are more complex than single letters now is the time to provide such data. There are solutions that may work for single letters that may not work for more complex string values.
"jointly" meaning how many times does value "a" appear in var_1 and Var_2 and Var_3.
I don't mean to concatenate any variables.
I would transpose the data and then do a PROC FREQ. It's more dynamic so it doesn't matter how many columns or values you have.
data have;
input _ID var_1 $ var_2 $ var_3 $;
datalines;
1 a b c
2 d e f
3 b d e
4 c d b
5 e f a
6 a c f
7 b d e
8 c a d
9 d b c
10 e f b
11 f e a
12 a b c
13 d e f
14 b d e
15 c d b
16 e f a
17 a c f
18 b d e
19 c a d
20 d b c
21 e f b
22 f e a
23 a d b
24 b e c
25 c f d
;
run;
proc transpose data=have out=long;
by _ID;
var var_1-var_3;
run;
proc freq data=long;
table col1*_name_ / out=want;
run;
@david27 wrote:
Hello,
I have the following dataset.
I need to find out how many times does "a"(or b,c,d,e,f) appear jointly in var_1, var_2 and var_3.
Please advise.
data temp; input _ID var_1 $ var_2 $ var_3 $; datalines; 1 a b c 2 d e f 3 b d e 4 c d b 5 e f a 6 a c f 7 b d e 8 c a d 9 d b c 10 e f b 11 f e a 12 a b c 13 d e f 14 b d e 15 c d b 16 e f a 17 a c f 18 b d e 19 c a d 20 d b c 21 e f b 22 f e a 23 a d b 24 b e c 25 c f d ; run;
Hi @david27
proc transpose data=temp out=w;
by _id;
var var_:;
run;
proc sql;
create table w1 as
select col1, count(col1) as c
from w
group by _name_,col1
order by col1;
quit;
data want;
do _n_=1 by 1 until(last.col1);
set w1;
by col1;
array var_count(3);
var_count(_n_)=c;
end;
total_count=sum(of var_count(*));
drop c;
run;
It is easy for SAS/IML .
data temp;
input _ID var_1 $ var_2 $ var_3 $;
datalines;
1 a b c
2 d e f
3 b d e
4 c d b
5 e f a
6 a c f
7 b d e
8 c a d
9 d b c
10 e f b
11 f e a
12 a b c
13 d e f
14 b d e
15 c d b
16 e f a
17 a c f
18 b d e
19 c a d
20 d b c
21 e f b
22 f e a
23 a d b
24 b e c
25 c f d
;
run;
proc iml;
use temp(keep=var_:);
read all var _char_ into x;
call tabulate(level,freq,x);
create want var {level freq};
append;
close;
quit;
proc print;run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.