BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
TimW
Calcite | Level 5

I have a dataset of dummy variables that look like this -

(Both tables the same - some problems viewing as image)

ContactVar1Var2Var3Var4Var5Var6
A111111
B


1
1
C111

1
D


11
E
11


F1



1

Capture1.PNG

What I need is to do a frequency count of all the variable pairings so that I get a table that looks like this -

Capture2.PNG

VariableVar1Var2Var3Var4Var5Var6
Var1
22112
Var22
3112
Var323
112
Var4111
21
Var51112
1
Var622211

Does anyone have any ideas how to achieve this?

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

It is really more difficult than i imaged.

data have;
input contact $ var1-var6;
cards;
a 1 1 1 1 1 1
b . . . 1 . 1
c 1 1 1 . . 1
d . . . 1 1 .
e . 1 1 . . .
f 1 . . . . 1
;
run;
data temp(keep= name1 name2);
 set have;
 length name1 name2 $20 ;
 array v{*} var1-var6;
 do i=1 to dim(v)-1;
  do j=i+1 to dim(v);
   if v{i}=1 and v{j}=1 then do;
                               name1=vname(v{i});
                               name2=vname(v{j});
                               output;
                              end;
  end;
 end;
run;
proc freq data=temp noprint;
 tables name1*name2/out=x(drop=percent) nopercent nocum;
run;
proc sql noprint;
 select quote(trim(name)) into : list separated by ','
  from dictionary.columns
   where libname='WORK' and memname='HAVE' and name like 'var%';
quit;

data xx(keep=name1 name2);
 length name1 name2 $20 ;
 do name1=&list ;
  do name2=&list;
    output;
  end;
 end;
run;
proc sort data=xx;by name1 name2;run;
data t;
 merge xx x;
 by name1 name2;
 _name1=name1;_name2=name2;
run;
data t;
 set t;
 call sortc(_name1,_name2);
run;
data tt(drop=_:);
 if _n_ eq 1 then do;
  if 0 then set t;
  declare hash ha(dataset:'t');
   ha.definekey('_name1','_name2');
   ha.definedata('count');
   ha.definedone();
 end;
 set t;
 count=.;
 rc=ha.find();
run;

proc transpose data=tt out=want(drop=_:);
 by name1;
 id name2;
 var count;
run;

Ksharp

View solution in original post

8 REPLIES 8
manojinpec
Obsidian | Level 7

No able to view your images.

Ksharp
Super User

It is really more difficult than i imaged.

data have;
input contact $ var1-var6;
cards;
a 1 1 1 1 1 1
b . . . 1 . 1
c 1 1 1 . . 1
d . . . 1 1 .
e . 1 1 . . .
f 1 . . . . 1
;
run;
data temp(keep= name1 name2);
 set have;
 length name1 name2 $20 ;
 array v{*} var1-var6;
 do i=1 to dim(v)-1;
  do j=i+1 to dim(v);
   if v{i}=1 and v{j}=1 then do;
                               name1=vname(v{i});
                               name2=vname(v{j});
                               output;
                              end;
  end;
 end;
run;
proc freq data=temp noprint;
 tables name1*name2/out=x(drop=percent) nopercent nocum;
run;
proc sql noprint;
 select quote(trim(name)) into : list separated by ','
  from dictionary.columns
   where libname='WORK' and memname='HAVE' and name like 'var%';
quit;

data xx(keep=name1 name2);
 length name1 name2 $20 ;
 do name1=&list ;
  do name2=&list;
    output;
  end;
 end;
run;
proc sort data=xx;by name1 name2;run;
data t;
 merge xx x;
 by name1 name2;
 _name1=name1;_name2=name2;
run;
data t;
 set t;
 call sortc(_name1,_name2);
run;
data tt(drop=_:);
 if _n_ eq 1 then do;
  if 0 then set t;
  declare hash ha(dataset:'t');
   ha.definekey('_name1','_name2');
   ha.definedata('count');
   ha.definedone();
 end;
 set t;
 count=.;
 rc=ha.find();
run;

proc transpose data=tt out=want(drop=_:);
 by name1;
 id name2;
 var count;
run;

Ksharp

TimW
Calcite | Level 5

Awesome!! And actually for what I need to do I can simply take the table X created at the top and project solved. Cheers!

kuridisanjeev
Quartz | Level 8

Great ksharp....

amazing...

TimW
Calcite | Level 5

He is good!

data_null__
Jade | Level 19

I borrowed your data step to create HAVE.

data have;

   input contact $ var1-var6;

   cards;

a 1 1 1 1 1 1

b . . . 1 . 1

c 1 1 1 . . 1

d . . . 1 1 .

e . 1 1 . . .

f 1 . . . . 1

;;;;

   run;

ods listing close;

ods output list=list(keep=table frequency);

proc freq;

   tables (var1-var6)*(var1-var6) / list;

   run;

ods listing;    

data list;

   set list;

   row=scan(table,2);

   col=scan(table,-1);

   if row eq col then frequency=.;

   run;

proc sort data=list;

   by row col;

   run;

proc transpose out=square(drop=_name_);

   by row;

   id col;

   var frequency;

   run;

proc print;

   run;

proc compare base=want compare=square;

   run;

Obs    row         var1        var2        var3        var4        var5        var6

                                                                                  

1     var1           .           2           2           1           1           3

2     var2           2           .           3           1           1           2

3     var3           2           3           .           1           1           2

4     var4           1           1           1           .           2           2

5     var5           1           1           1           2           .           1

6     var6           3           2           2           2           1           .

Message was edited by: data _null_

Ksharp
Super User

Great NULL.

PGStats
Opal | Level 21

Yet another way:

data have;
input contact $ var1-var6;
cards;
a 1 1 1 1 1 1
b . . . 1 . 1
c 1 1 1 . . 1
d . . . 1 1 .
e . 1 1 . . .
f 1 . . . . 1
;


proc transpose data=have out=havelist(where=(col1 is not missing));
by contact notsorted;
var var:;
run;

proc sql;
create table haveCounts as
select h1._name_ as variable, h2._name_ as col, count(*) as n
from haveList as h1 inner join haveList as h2
on h1.contact=h2.contact and h1._name_ ne h2._name_
group by h1._name_, h2._name_
order by variable;
drop table haveList;
quit;

proc transpose data=haveCounts out=want(drop=_NAME_);
by variable;
var n;
id col;
run;

/* At this point you have the counts table. Now for presentation : */

option missing="";
proc sql;
drop table haveCounts;
select variable label="Variable", var1, var2, var3, var4, var5, var6
from want;
quit;

PG

PG

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1291 views
  • 2 likes
  • 6 in conversation