Solved: Re: Frequency on each value in multiple variables.

david27 · Posted 01-18-2019 11:21 AM

Hello,

I have the following dataset.

I need to find out how many times does "a"(or b,c,d,e,f) appear jointly in var_1, var_2 and var_3.

Please advise.

data temp;
input _ID var_1 $ var_2 $ var_3 $;
datalines;
1 a b c
2 d e f
3 b d e
4 c d b
5 e f a
6 a c f
7 b d e
8 c a d
9 d b c
10 e f b
11 f e a
12 a b c
13 d e f
14 b d e
15 c d b
16 e f a
17 a c f
18 b d e
19 c a d
20 d b c
21 e f b
22 f e a
23 a d b
24 b e c
25 c f d
;
run;

Astounding · Posted 01-18-2019 11:32 AM

It's possible you need to explain more about what is in the data and what you want. Looking at the post, I'm going to assume that each observation never repeats the same letter. In that case:

proc freq data=have noprint;

tables var_1 / out=var1_counts (rename=(var_1=value) drop=percent);

tables var_2 / out=var2_counts(rename=(var_2=value) drop=percent);

tables var_3 / out=var3_counts (rename=(var_3=value) drop=percent);

run;

data want;

merge var1_counts (rename=(count=count1))

var2_counts (rename=(count=count2))

var3_counts (rename=(count=count3));

by value;

total_count = sum(count1, count2, count3);

run;

It's untested code at this point, so you'll need to test it to see how close it comes.

View solution in original post

Reeza · Posted 01-18-2019 11:26 AM

Show what your expected output would be please.

Astounding · Posted 01-18-2019 11:32 AM

It's possible you need to explain more about what is in the data and what you want. Looking at the post, I'm going to assume that each observation never repeats the same letter. In that case:

proc freq data=have noprint;

tables var_1 / out=var1_counts (rename=(var_1=value) drop=percent);

tables var_2 / out=var2_counts(rename=(var_2=value) drop=percent);

tables var_3 / out=var3_counts (rename=(var_3=value) drop=percent);

run;

data want;

merge var1_counts (rename=(count=count1))

var2_counts (rename=(count=count2))

var3_counts (rename=(count=count3));

by value;

total_count = sum(count1, count2, count3);

run;

It's untested code at this point, so you'll need to test it to see how close it comes.

david27 · Posted 01-18-2019 12:02 PM

Thank You @Astounding.

I totally see your point here. I am wondering if there is any alternate way to doing these frequencies together rather than treating them separately and then joining them.

The output is what i want though. So thank you again.

PaigeMiller · Posted 01-18-2019 11:32 AM

@david27 wrote:

Hello,

I have the following dataset.

I need to find out how many times does "a"(or b,c,d,e,f) appear jointly in var_1, var_2 and var_3.

How many times does an "a" appear jointly?

Meaning if var_1='a' and var_2='a' and var_3='a' then this counts as 'a' appears one time?

data temp;
input _ID var_1 $ var_2 $ var_3 $;
jointly=cats(var_1,var_2,var_3);
count_a=find(jointly,'a')>0;
count_b=find(jointly,'b')>0;
count_c=find(jointly,'c')>0;
count_d=find(jointly,'d')>0;
count_e=find(jointly,'e')>0;
count_f=find(jointly,'f')>0;
datalines;
1 a b c
2 d e f
3 b d e
4 c d b
5 e f a
6 a c f
7 b d e
8 c a d
9 d b c
10 e f b
11 f e a
12 a b c
13 d e f
14 b d e
15 c d b
16 e f a
17 a c f
18 b d e
19 c a d
20 d b c
21 e f b
22 f e a
23 a d b
24 b e c
25 c f d
;
run;

proc freq data=temp;
	tables count_a--count_f;
run;

--
Paige Miller

ballardw · Posted 01-18-2019 11:40 AM

Can you clearly define what "jointly" means in this context?

Also if your actual values are more complex than single letters now is the time to provide such data. There are solutions that may work for single letters that may not work for more complex string values.

david27 · Posted 01-18-2019 12:00 PM

"jointly" meaning how many times does value "a" appear in var_1 and Var_2 and Var_3.

I don't mean to concatenate any variables.

Reeza · Posted 01-18-2019 12:10 PM

I would transpose the data and then do a PROC FREQ. It's more dynamic so it doesn't matter how many columns or values you have.

data have;
input _ID var_1 $ var_2 $ var_3 $;
datalines;
1 a b c
2 d e f
3 b d e
4 c d b
5 e f a
6 a c f
7 b d e
8 c a d
9 d b c
10 e f b
11 f e a
12 a b c
13 d e f
14 b d e
15 c d b
16 e f a
17 a c f
18 b d e
19 c a d
20 d b c
21 e f b
22 f e a
23 a d b
24 b e c
25 c f d
;
run;
proc transpose data=have out=long;
by _ID;
var var_1-var_3;
run;

proc freq data=long;
table col1*_name_ / out=want;
run;

@david27 wrote:

Hello,

I have the following dataset.

I need to find out how many times does "a"(or b,c,d,e,f) appear jointly in var_1, var_2 and var_3.

Please advise.
data temp;
input _ID var_1 $ var_2 $ var_3 $;
datalines;
1 a b c
2 d e f
3 b d e
4 c d b
5 e f a
6 a c f
7 b d e
8 c a d
9 d b c
10 e f b
11 f e a
12 a b c
13 d e f
14 b d e
15 c d b
16 e f a
17 a c f
18 b d e
19 c a d
20 d b c
21 e f b
22 f e a
23 a d b
24 b e c
25 c f d
;
run;

novinosrin · Posted 01-18-2019 12:19 PM

Hi @david27


proc transpose data=temp out=w;
by _id;
var var_:;
run;

proc sql;
create table w1 as
select col1, count(col1) as c
from w
group by _name_,col1
order by col1;
quit;
data want;
do _n_=1 by 1 until(last.col1);
set w1;
by col1;
array var_count(3);
var_count(_n_)=c;
end;
total_count=sum(of var_count(*));
drop c;
run;

Ksharp · Posted 01-19-2019 06:32 AM

It is easy for SAS/IML .

data temp;
input _ID var_1 $ var_2 $ var_3 $;
datalines;
1 a b c
2 d e f
3 b d e
4 c d b
5 e f a
6 a c f
7 b d e
8 c a d
9 d b c
10 e f b
11 f e a
12 a b c
13 d e f
14 b d e
15 c d b
16 e f a
17 a c f
18 b d e
19 c a d
20 d b c
21 e f b
22 f e a
23 a d b
24 b e c
25 c f d
;
run;

proc iml;
use temp(keep=var_:);
read all var _char_ into x;
call tabulate(level,freq,x);
create want var {level freq};
append;
close;
quit;

proc print;run;

Registration is open

SAS Training: Just a Click Away