SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
Ronein
Meteorite | Level 14

Hello,

I would like to ask please  a question.

In my source data set there are 10 binary variables (each varaible get value 1 or 0 ).

Each variable has a meaning of another overide businss rule.

I want to create a matrix report(10x10) with information of the correlation (assosiation) between the binary variables.

The target is to see which variables (Overide rules) are working together and which are not working together.

May anyone send an example how to do it .

Let's say that the user define the varaibles list in a macro varaiable.

%let List= Z1+Z2+Z3+Z4+Z5+Z6+Z7+Z8+Z9+Z10;

 

6 REPLIES 6
PaigeMiller
Diamond | Level 26

You probably want the Phi coefficient, which is the equivalent of the Pearson correlation for binary variables. This can be computed in PROC FREQ for any pair of variables, and so if you really want a 10x10 table, you'd have to write a macro and loop over all possible pairs of variables.

--
Paige Miller
Ksharp
Super User
Paige,
That is chisquare test .
And Phi Coefficient / Contingency Coefficient measure the association between two variables NOT correlation .
PaigeMiller
Diamond | Level 26

In the case of two binary variables, I am not aware of any difference between "association" and "correlation".

--
Paige Miller
Tom
Super User Tom
Super User

Show some example data.

Or we could use the example data in Rick Wiklen's blog.  https://blogs.sas.com/content/iml/2019/09/05/cosine-similarity-recommendations.html

 

data recipes;
   input Recipe $ 1-20
      (Tomato Garlic Salt Onion TomatoPaste OliveOil Celery Broth 
       GreenPepper Cumin Flour BrownSugar BayLeaf GroundBeef 
       BlackPepper ChiliPowder Cilantro Carrot CayennePepper Oregano 
       Oil Parsley PorkSausage RedPepper Paprika Thyme Tomatillo 
       JalapenoPepper WorcestershireSauce Lime
       Eggplant GreenOlives Capers Sugar) (1.);
datalines;
Spag Sauce          1111110000000000000101000000000000
Spag Meat Sauce     1111111010001100010000110000000000
Eggplant Relish     0111110000000000000000000000001111
Creole Sauce        1011111110000010000000001100100000
Salsa               1111000000000000100000000011010000
Enchilada Sauce     1000000101110001001010000000000000
;

To compare the COLUMNS instead of the ROWS just transpose the data first.

proc transpose data=recipes out=ingrediants name=ingrediant ;
  id recipe ;
  var _numeric_;
run;
proc distance data=ingrediants out=cos_ingrediants method=COSINE shape=square;
   var ratio(_NUMERIC_);
   id ingrediant;
run;

So in your case if you make your macro variable contain a space delimited list of variable names (instead of + delimited).  You might even be able to use a variable list syntax.

Then the code is something like:

%let List= Z1-Z10;
proc transpose data=HAVE out=columns name=COLUMN;
  var &list;
run;
proc distance data=columns out=cos method=COSINE shape=square;
   var ratio(_NUMERIC_);
   id COLUMN;
run;
proc print data=cos;
run;

So if I use RECIPES as the HAVE and set LIST= Tomato Garlic Salt Onion then the result is:

Obs    COLUMN     Tomato     Garlic      Salt      Onion

 1     Tomato    1.00000    0.67082    0.80000    0.80000
 2     Garlic    0.67082    1.00000    0.89443    0.89443
 3     Salt      0.80000    0.89443    1.00000    1.00000
 4     Onion     0.80000    0.89443    1.00000    1.00000


 

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 3858 views
  • 5 likes
  • 5 in conversation