DATA Step, Macro, Functions and more

Creating a frequency matrix/table

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 10
Accepted Solution

Creating a frequency matrix/table

Hello everyone,

 

I have a dataset that looks pretty much like this:

data have;
infile datalines dsd;
input word $ article ;
datalines;
word1, 123456
word2, 123456
word3, 123456
word4, 123456
word5, 123456
word1, 234567
word2, 234567
word3, 234567
word5, 234567
word1, 345678
word3, 345678
word6, 345678
;

Now, I want to know which word appears with another word in how many articles. For example word1 and word2 appear 2 times together, word1 and word3 3 times.

 

	word1	word2	word3	word4	word5	word6
  word1   3	2	3	1	2	1
  word2   2	2	2	1	2	0
  word3   3	3	3	1	2	1	

 

The output I want to generate should look like this or similar, you get the idea Smiley Happy

I have tried this for a long time now, but everything I managed just resulted in error messages.

 

Is there some easy way to do it? Or do you have an idea how I could approach the problem?

 

Thanks a lot in advance.


Accepted Solutions
Solution
‎06-21-2016 04:20 AM
Super User
Posts: 9,687

Re: Creating a frequency matrix/table

[ Edited ]

If you want symmetry Matrix :

 

 

data have;
infile datalines dsd;
input word $ article ;
datalines;
word1, 123456
word2, 123456
word3, 123456
word4, 123456
word5, 123456
word1, 234567
word2, 234567
word3, 234567
word5, 234567
word1, 345678
word3, 345678
word6, 345678
;
run;
data temp;
array x{9999} $ 50 _temporary_;
length a b $ 50;

do i=1 by 1 until(last.article);
 set have;
 by article;
 x{i}=word;
end;

do m=1 to i-1;
 do n=m+1 to i;
  a=x{m};b=x{n};output;
  a=x{n};b=x{m};output;
 end;
end;
keep a b;
run;
proc sql;
create table levels as
 select a.word as a length=50,b.word as b length=50
 from 
 (select distinct word from have) as a,
 (select distinct word from have) as b
  order by 1,2;
quit;
proc tabulate data=temp classdata=levels;
class a b;
table a='',b=''*n=''/misstext='0';
run;

View solution in original post


All Replies
Solution
‎06-21-2016 04:20 AM
Super User
Posts: 9,687

Re: Creating a frequency matrix/table

[ Edited ]

If you want symmetry Matrix :

 

 

data have;
infile datalines dsd;
input word $ article ;
datalines;
word1, 123456
word2, 123456
word3, 123456
word4, 123456
word5, 123456
word1, 234567
word2, 234567
word3, 234567
word5, 234567
word1, 345678
word3, 345678
word6, 345678
;
run;
data temp;
array x{9999} $ 50 _temporary_;
length a b $ 50;

do i=1 by 1 until(last.article);
 set have;
 by article;
 x{i}=word;
end;

do m=1 to i-1;
 do n=m+1 to i;
  a=x{m};b=x{n};output;
  a=x{n};b=x{m};output;
 end;
end;
keep a b;
run;
proc sql;
create table levels as
 select a.word as a length=50,b.word as b length=50
 from 
 (select distinct word from have) as a,
 (select distinct word from have) as b
  order by 1,2;
quit;
proc tabulate data=temp classdata=levels;
class a b;
table a='',b=''*n=''/misstext='0';
run;
Occasional Contributor
Posts: 10

Re: Creating a frequency matrix/table

works perfectly, thank you!
Super User
Super User
Posts: 7,413

Re: Creating a frequency matrix/table

Hi,

 

Odd request.  Not sure what you want to do when words are missing.  This bit of code gets you the combinations of words from the list, maybe elaborate a bit more on the problem if this doesn't get you there.

data have;
  infile datalines dsd;
  input word $ article ;
datalines;
word1, 123456
word2, 123456
word3, 123456
word4, 123456
word5, 123456
word1, 234567
word2, 234567
word3, 234567
word5, 234567
word1, 345678
word3, 345678
word6, 345678
;
run;

proc transpose data=have out=t_have;
  by article;
  var word;
  id word;
run;

data want (keep=combination);
  set t_have (drop=article _name_);
  array word{*} word:;
  do i=1 to dim(word)-1;
    combination=catx('/',word{i},word{i+1});
    output;
  end;
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 225 views
  • 0 likes
  • 3 in conversation