## Creating a frequency matrix/table

Solved
Occasional Contributor
Posts: 10

# Creating a frequency matrix/table

Hello everyone,

I have a dataset that looks pretty much like this:

``````data have;
infile datalines dsd;
input word \$ article ;
datalines;
word1, 123456
word2, 123456
word3, 123456
word4, 123456
word5, 123456
word1, 234567
word2, 234567
word3, 234567
word5, 234567
word1, 345678
word3, 345678
word6, 345678
;``````

Now, I want to know which word appears with another word in how many articles. For example word1 and word2 appear 2 times together, word1 and word3 3 times.

``````	word1	word2	word3	word4	word5	word6
word1   3	2	3	1	2	1
word2   2	2	2	1	2	0
word3   3	3	3	1	2	1	``````

The output I want to generate should look like this or similar, you get the idea

I have tried this for a long time now, but everything I managed just resulted in error messages.

Is there some easy way to do it? Or do you have an idea how I could approach the problem?

Accepted Solutions
Solution
‎06-21-2016 04:20 AM
Super User
Posts: 10,778

## Re: Creating a frequency matrix/table

[ Edited ]

If you want symmetry Matrix :

``````data have;
infile datalines dsd;
input word \$ article ;
datalines;
word1, 123456
word2, 123456
word3, 123456
word4, 123456
word5, 123456
word1, 234567
word2, 234567
word3, 234567
word5, 234567
word1, 345678
word3, 345678
word6, 345678
;
run;
data temp;
array x{9999} \$ 50 _temporary_;
length a b \$ 50;

do i=1 by 1 until(last.article);
set have;
by article;
x{i}=word;
end;

do m=1 to i-1;
do n=m+1 to i;
a=x{m};b=x{n};output;
a=x{n};b=x{m};output;
end;
end;
keep a b;
run;
proc sql;
create table levels as
select a.word as a length=50,b.word as b length=50
from
(select distinct word from have) as a,
(select distinct word from have) as b
order by 1,2;
quit;
proc tabulate data=temp classdata=levels;
class a b;
table a='',b=''*n=''/misstext='0';
run;``````

All Replies
Solution
‎06-21-2016 04:20 AM
Super User
Posts: 10,778

## Re: Creating a frequency matrix/table

[ Edited ]

If you want symmetry Matrix :

``````data have;
infile datalines dsd;
input word \$ article ;
datalines;
word1, 123456
word2, 123456
word3, 123456
word4, 123456
word5, 123456
word1, 234567
word2, 234567
word3, 234567
word5, 234567
word1, 345678
word3, 345678
word6, 345678
;
run;
data temp;
array x{9999} \$ 50 _temporary_;
length a b \$ 50;

do i=1 by 1 until(last.article);
set have;
by article;
x{i}=word;
end;

do m=1 to i-1;
do n=m+1 to i;
a=x{m};b=x{n};output;
a=x{n};b=x{m};output;
end;
end;
keep a b;
run;
proc sql;
create table levels as
select a.word as a length=50,b.word as b length=50
from
(select distinct word from have) as a,
(select distinct word from have) as b
order by 1,2;
quit;
proc tabulate data=temp classdata=levels;
class a b;
table a='',b=''*n=''/misstext='0';
run;``````
Occasional Contributor
Posts: 10

## Re: Creating a frequency matrix/table

works perfectly, thank you!
Super User
Posts: 9,599

## Re: Creating a frequency matrix/table

Hi,

Odd request.  Not sure what you want to do when words are missing.  This bit of code gets you the combinations of words from the list, maybe elaborate a bit more on the problem if this doesn't get you there.

```data have;
infile datalines dsd;
input word \$ article ;
datalines;
word1, 123456
word2, 123456
word3, 123456
word4, 123456
word5, 123456
word1, 234567
word2, 234567
word3, 234567
word5, 234567
word1, 345678
word3, 345678
word6, 345678
;
run;

proc transpose data=have out=t_have;
by article;
var word;
id word;
run;

data want (keep=combination);
set t_have (drop=article _name_);
array word{*} word:;
do i=1 to dim(word)-1;
combination=catx('/',word{i},word{i+1});
output;
end;
run;```
🔒 This topic is solved and locked.