<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Correlation Analysis of Multiple Binary Variables in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Correlation-Analysis-of-Multiple-Binary-Variables/m-p/47146#M12628</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;PLEASE Help!&lt;/P&gt;&lt;P&gt;I have a dataset of customers, each flagged with whether they have viewed a film or not. Each row represents a customer. Each column represents a film, where a 0 or 1 states whether the customer has seen that film. I need to work out the correlation between these films, i.e. if people have watched X, they are also most likely to have watched Y. A simple way would be to cross-tabulate all the variables to create a dataset with every film as a row and every film as a column, holding the count of customers who have seen each pair of films. I don't know how to do this. I thought that if my variables were continuous, a standard PROC CORR would give me all the information I need in matrix form, i.e. the correlation coefficient and the frequency of customers. But how do I create this when my variables are binary? Or can anyone help me with creating the count matrix?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 28 Mar 2012 16:04:19 GMT</pubDate>
    <dc:creator>AmandaEHS</dc:creator>
    <dc:date>2012-03-28T16:04:19Z</dc:date>
    <item>
      <title>Correlation Analysis of Multiple Binary Variables</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Correlation-Analysis-of-Multiple-Binary-Variables/m-p/47146#M12628</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;PLEASE Help!&lt;/P&gt;&lt;P&gt;I have a dataset of customers, each flagged with whether they have viewed a film or not. Each row represents a customer. Each column represents a film, where a 0 or 1 states whether the customer has seen that film. I need to work out the correlation between these films, i.e. if people have watched X, they are also most likely to have watched Y. A simple way would be to cross-tabulate all the variables to create a dataset with every film as a row and every film as a column, holding the count of customers who have seen each pair of films. I don't know how to do this. I thought that if my variables were continuous, a standard PROC CORR would give me all the information I need in matrix form, i.e. the correlation coefficient and the frequency of customers. But how do I create this when my variables are binary? Or can anyone help me with creating the count matrix?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 28 Mar 2012 16:04:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Correlation-Analysis-of-Multiple-Binary-Variables/m-p/47146#M12628</guid>
      <dc:creator>AmandaEHS</dc:creator>
      <dc:date>2012-03-28T16:04:19Z</dc:date>
    </item>
    <item>
      <title>Correlation Analysis of Multiple Binary Variables</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Correlation-Analysis-of-Multiple-Binary-Variables/m-p/47147#M12629</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;You want to calculate a phi coefficient as a measure of association for binary data. If you run the usual Pearson correlation in PROC CORR on binary (0/1) data, the measure you get is the phi coefficient, because the two are equivalent for a pair of binary variables.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 28 Mar 2012 17:11:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Correlation-Analysis-of-Multiple-Binary-Variables/m-p/47147#M12629</guid>
      <dc:creator>mfisher</dc:creator>
      <dc:date>2012-03-28T17:11:10Z</dc:date>
    </item>
    <item>
      <title>Correlation Analysis of Multiple Binary Variables</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Correlation-Analysis-of-Multiple-Binary-Variables/m-p/47148#M12630</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I don't think this is a good idea, and you also don't need to create a new dataset.&lt;/P&gt;&lt;P&gt;PROC CORR only considers the correlation between two variables at a time, which are assumed to follow a bivariate normal distribution. By default, it doesn't consider the influence of the other variables (i.e. the other films) on a given pair, which is what partial correlation measures.&lt;/P&gt;&lt;P&gt;I think you should use cluster analysis to see which films belong to which cluster.&lt;/P&gt;&lt;P&gt;I highly recommend posting this in the SAS Statistical Procedures forum, where experts such as Steve, lvm, and Rick are all seasoned statisticians. They can give you a full explanation of your question.&lt;/P&gt;&lt;P&gt;Ksharp&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 29 Mar 2012 03:14:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Correlation-Analysis-of-Multiple-Binary-Variables/m-p/47148#M12630</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2012-03-29T03:14:39Z</dc:date>
    </item>
  </channel>
</rss>

