<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Joint probabilities in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Joint-probabilities/m-p/15658#M2081</link>
    <description>I was wondering if anyone knows a solution to the following problem.&lt;BR /&gt;
&lt;BR /&gt;
I've got an example dataset which can be laid out in two ways; which layout is better really depends on how the problem is best solved.&lt;BR /&gt;
&lt;BR /&gt;
The first version is:&lt;BR /&gt;
&lt;BR /&gt;
PatientID  Condition  Present&lt;BR /&gt;
1          Cough      1&lt;BR /&gt;
1          Headache   1&lt;BR /&gt;
1          Vomiting   0&lt;BR /&gt;
2          Cough      0&lt;BR /&gt;
2          Headache   1&lt;BR /&gt;
2          Vomiting   1&lt;BR /&gt;
3          Cough      0&lt;BR /&gt;
3          Headache   0&lt;BR /&gt;
3          Vomiting   0&lt;BR /&gt;
4          Cough      1&lt;BR /&gt;
4          Headache   1&lt;BR /&gt;
4          Vomiting   0&lt;BR /&gt;
&lt;BR /&gt;
The second version:&lt;BR /&gt;
&lt;BR /&gt;
PatientID  Cough  Headache  Vomiting&lt;BR /&gt;
1          1      1         0&lt;BR /&gt;
2          0      1         0&lt;BR /&gt;
3          0      0         0&lt;BR /&gt;
4          1      1         0&lt;BR /&gt;
&lt;BR /&gt;
I would like to find a) the probability of having each of the conditions and b) the joint probability of having, say, cough and headache.  I can find a) fairly simply, but I'm having problems figuring out how best to find b) and which version of the example dataset I should use.  If I have a much larger version of my example dataset, I would like to be able to write a program which can move through each of the conditions to find the joint probabilities.  Any suggestions would be most appreciated.</description>
    <pubDate>Tue, 12 Oct 2010 11:09:26 GMT</pubDate>
    <dc:creator>den</dc:creator>
    <dc:date>2010-10-12T11:09:26Z</dc:date>
    <item>
      <title>Joint probabilities</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Joint-probabilities/m-p/15658#M2081</link>
      <description>I was wondering if anyone knows a solution to the following problem.&lt;BR /&gt;
&lt;BR /&gt;
I've got an example dataset which can be laid out in two ways; which layout is better really depends on how the problem is best solved.&lt;BR /&gt;
&lt;BR /&gt;
The first version is:&lt;BR /&gt;
&lt;BR /&gt;
PatientID  Condition  Present&lt;BR /&gt;
1          Cough      1&lt;BR /&gt;
1          Headache   1&lt;BR /&gt;
1          Vomiting   0&lt;BR /&gt;
2          Cough      0&lt;BR /&gt;
2          Headache   1&lt;BR /&gt;
2          Vomiting   1&lt;BR /&gt;
3          Cough      0&lt;BR /&gt;
3          Headache   0&lt;BR /&gt;
3          Vomiting   0&lt;BR /&gt;
4          Cough      1&lt;BR /&gt;
4          Headache   1&lt;BR /&gt;
4          Vomiting   0&lt;BR /&gt;
&lt;BR /&gt;
The second version:&lt;BR /&gt;
&lt;BR /&gt;
PatientID  Cough  Headache  Vomiting&lt;BR /&gt;
1          1      1         0&lt;BR /&gt;
2          0      1         0&lt;BR /&gt;
3          0      0         0&lt;BR /&gt;
4          1      1         0&lt;BR /&gt;
&lt;BR /&gt;
I would like to find a) the probability of having each of the conditions and b) the joint probability of having, say, cough and headache.  I can find a) fairly simply, but I'm having problems figuring out how best to find b) and which version of the example dataset I should use.  If I have a much larger version of my example dataset, I would like to be able to write a program which can move through each of the conditions to find the joint probabilities.  Any suggestions would be most appreciated.</description>
      <pubDate>Tue, 12 Oct 2010 11:09:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Joint-probabilities/m-p/15658#M2081</guid>
      <dc:creator>den</dc:creator>
      <dc:date>2010-10-12T11:09:26Z</dc:date>
    </item>
    <item>
      <title>Re: Joint probabilities</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Joint-probabilities/m-p/15659#M2082</link>
      <description>Just to make sure that I understand, do you want to estimate the joint probability from the given data?  In the example you gave, for example, the estimated probability of having a cough and headache would be 2/4=0.5, right?&lt;BR /&gt;
If I correctly interpreted your problem, I think that the easiest method would use version 2 of your dataset.  Assuming that the variables cough and headache are both 0/1 indicator variables and there is exactly one observation for each patient, the following code should work:&lt;BR /&gt;
[pre]&lt;BR /&gt;
data test;&lt;BR /&gt;
input id $ cough headache vomiting;&lt;BR /&gt;
datalines;&lt;BR /&gt;
1 1 1 0&lt;BR /&gt;
2 0 1 0&lt;BR /&gt;
3 0 0 0&lt;BR /&gt;
4 1 1 0&lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
proc sql;&lt;BR /&gt;
	select sum(cough and headache)/count(*) as prob_cough_and_headache&lt;BR /&gt;
	from test;&lt;BR /&gt;
quit;&lt;BR /&gt;
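&lt;BR /&gt;
/* Sketch, not part of the original reply: if the data arrive in the long layout&lt;BR /&gt;
   (version 1 in the question), PROC TRANSPOSE could build the wide layout that&lt;BR /&gt;
   the query above expects. The dataset names LONG and WIDE are assumptions, and&lt;BR /&gt;
   the values are copied from the version 1 listing as posted. */&lt;BR /&gt;
data long;&lt;BR /&gt;
input id $ condition :$20. present;&lt;BR /&gt;
datalines;&lt;BR /&gt;
1 Cough 1&lt;BR /&gt;
1 Headache 1&lt;BR /&gt;
1 Vomiting 0&lt;BR /&gt;
2 Cough 0&lt;BR /&gt;
2 Headache 1&lt;BR /&gt;
2 Vomiting 1&lt;BR /&gt;
3 Cough 0&lt;BR /&gt;
3 Headache 0&lt;BR /&gt;
3 Vomiting 0&lt;BR /&gt;
4 Cough 1&lt;BR /&gt;
4 Headache 1&lt;BR /&gt;
4 Vomiting 0&lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* BY-group processing requires the data to be sorted by id */&lt;BR /&gt;
proc sort data=long;&lt;BR /&gt;
	by id;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* One row per patient; each value of condition becomes a 0/1 column */&lt;BR /&gt;
proc transpose data=long out=wide(drop=_name_);&lt;BR /&gt;
	by id;&lt;BR /&gt;
	id condition;&lt;BR /&gt;
	var present;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* Same estimate as above, now read from the transposed data */&lt;BR /&gt;
proc sql;&lt;BR /&gt;
	select sum(cough and headache)/count(*) as prob_cough_and_headache&lt;BR /&gt;
	from wide;&lt;BR /&gt;
quit;&lt;BR /&gt;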
[/pre]</description>
      <pubDate>Tue, 12 Oct 2010 12:24:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Joint-probabilities/m-p/15659#M2082</guid>
      <dc:creator>polingjw</dc:creator>
      <dc:date>2010-10-12T12:24:12Z</dc:date>
    </item>
    <item>
      <title>Re: Joint probabilities</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Joint-probabilities/m-p/15660#M2083</link>
      <description>There might be a better, more efficient way of doing this, but here is my initial shot at a program which moves through all the combinations to find joint probabilities: &lt;BR /&gt;
&lt;BR /&gt;
[pre]&lt;BR /&gt;
data test;&lt;BR /&gt;
input id $ cough headache vomiting;&lt;BR /&gt;
datalines;&lt;BR /&gt;
1 1 1 0&lt;BR /&gt;
2 0 1 0&lt;BR /&gt;
3 0 0 0&lt;BR /&gt;
4 1 1 0&lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
proc sql noprint;&lt;BR /&gt;
	select distinct name into: conditions separated by '" "'&lt;BR /&gt;
	from sashelp.vcolumn&lt;BR /&gt;
	where libname = 'WORK' and memname = 'TEST' and upcase(name) ne 'ID';&lt;BR /&gt;
&lt;BR /&gt;
	select count(*) into: num_conditions&lt;BR /&gt;
	from sashelp.vcolumn&lt;BR /&gt;
	where libname = 'WORK' and memname = 'TEST' and upcase(name) ne 'ID';&lt;BR /&gt;
quit;&lt;BR /&gt;
&lt;BR /&gt;
data temp(drop=n ncomb j rc);&lt;BR /&gt;
   array condition[&amp;amp;num_conditions] $20 ("&amp;amp;conditions");&lt;BR /&gt;
   n=dim(condition);&lt;BR /&gt;
   do k=1 to dim(condition);&lt;BR /&gt;
   	ncomb=comb(n,k);&lt;BR /&gt;
    do j=1 to ncomb+1;&lt;BR /&gt;
      	rc=lexcomb(j, k, of condition[*]);&lt;BR /&gt;
      	if rc&amp;lt;0 then leave;&lt;BR /&gt;
		output;&lt;BR /&gt;
   	end;&lt;BR /&gt;
  end;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
data _null_;&lt;BR /&gt;
	length combinations $ 200;&lt;BR /&gt;
	set temp end=last;&lt;BR /&gt;
	array condition{*} condition:;&lt;BR /&gt;
	do j=1 to dim(condition);&lt;BR /&gt;
		if j&amp;gt;k then condition{j} = ' ';&lt;BR /&gt;
	end;&lt;BR /&gt;
	combinations=catx(' and ', of condition:);&lt;BR /&gt;
	i+1; &lt;BR /&gt;
	call symput(cats("condition",i), combinations);&lt;BR /&gt;
	if last then call symput('nobs', i);&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
data probabilities;&lt;BR /&gt;
	length conditions $ 200 prob 8;&lt;BR /&gt;
	if 0 then output;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
%macro probabilities;&lt;BR /&gt;
proc sql;&lt;BR /&gt;
	%do i=1 %to &amp;amp;nobs;&lt;BR /&gt;
	insert into probabilities(conditions, prob)&lt;BR /&gt;
	select "&amp;amp;&amp;amp;condition&amp;amp;i", sum(&amp;amp;&amp;amp;condition&amp;amp;i)/count(*)&lt;BR /&gt;
	from test; &lt;BR /&gt;
&lt;BR /&gt;
	%end;&lt;BR /&gt;
quit;&lt;BR /&gt;
%mend;&lt;BR /&gt;
%probabilities&lt;BR /&gt;
&lt;BR /&gt;
proc print data=probabilities;&lt;BR /&gt;
run;&lt;BR /&gt;
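&lt;BR /&gt;
/* Sketch, not part of the original program: one possible way to tell&lt;BR /&gt;
   single-condition rows from joint ones is to count the ' and ' separators&lt;BR /&gt;
   in each label. The names PROBABILITIES_SIZED and N_CONDITIONS are&lt;BR /&gt;
   assumptions made for illustration. */&lt;BR /&gt;
data probabilities_sized;&lt;BR /&gt;
	set probabilities;&lt;BR /&gt;
	/* 1 = a single condition, 2 or more = a joint probability */&lt;BR /&gt;
	n_conditions = count(conditions, ' and ') + 1;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* List only the single-condition probabilities */&lt;BR /&gt;
proc print data=probabilities_sized;&lt;BR /&gt;
	where n_conditions = 1;&lt;BR /&gt;
run;&lt;BR /&gt;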
[/pre]&lt;BR /&gt;
&lt;BR /&gt;
Message was edited by: polingjw&lt;BR /&gt;
&lt;BR /&gt;
Modified array declaration in the data _null_ step.  Previously used condition1-condition3.</description>
      <pubDate>Tue, 12 Oct 2010 13:41:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Joint-probabilities/m-p/15660#M2083</guid>
      <dc:creator>polingjw</dc:creator>
      <dc:date>2010-10-12T13:41:41Z</dc:date>
    </item>
    <item>
      <title>Re: Joint probabilities</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Joint-probabilities/m-p/15661#M2084</link>
      <description>To polingjw: in response to your first reply, you are correct that the joint probability for cough and headache is 1/2.  Many thanks for both your suggestions; I'm going to try both.  The latter might be more appropriate, as I want to find joint probabilities for much larger datasets and need an efficient way of doing this.&lt;BR /&gt;
&lt;BR /&gt;
This worked very well.  However, I have some further questions.  Ideally I would like to separate the joint probabilities from the probabilities for each individual condition, as I would then like to calculate the conditional probabilities.  I have been trying to adapt the code which polingjw wrote to do this, but to no avail.  Any advice would be much appreciated.&lt;BR /&gt;
&lt;BR /&gt;
    &lt;BR /&gt;
Message was edited by: den</description>
      <pubDate>Tue, 12 Oct 2010 13:54:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Joint-probabilities/m-p/15661#M2084</guid>
      <dc:creator>den</dc:creator>
      <dc:date>2010-10-12T13:54:14Z</dc:date>
    </item>
    <item>
      <title>Re: Joint probabilities</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Joint-probabilities/m-p/15662#M2085</link>
      <description>I don’t think that I completely understand what you’re trying to do.  Specifically, what conditional probabilities are you trying to estimate?  In general, the conditional events could be placed into the proc sql statement in a where clause.  Here is an example of a macro that might be used to estimate some conditional probabilities:&lt;BR /&gt;
&lt;BR /&gt;
[pre]&lt;BR /&gt;
data test;&lt;BR /&gt;
input id $ cough headache vomiting fever;&lt;BR /&gt;
datalines;&lt;BR /&gt;
1 1 0 1 1&lt;BR /&gt;
2 0 1 0 0&lt;BR /&gt;
3 0 0 0 1&lt;BR /&gt;
4 1 1 0 1&lt;BR /&gt;
;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
%macro findprob(event, condition);&lt;BR /&gt;
proc sql noprint;&lt;BR /&gt;
	select avg(&amp;amp;event) into:prob&lt;BR /&gt;
	from test&lt;BR /&gt;
	%if %bquote(&amp;amp;condition) ne %then where &amp;amp;condition;&lt;BR /&gt;
	;&lt;BR /&gt;
quit;&lt;BR /&gt;
&lt;BR /&gt;
%if %bquote(&amp;amp;condition) ne %then %let condition = %str( )given &amp;amp;condition;&lt;BR /&gt;
%let prob=&amp;amp;prob;&lt;BR /&gt;
%put The probability of &amp;amp;event&amp;amp;condition is &amp;amp;prob..;&lt;BR /&gt;
%mend;&lt;BR /&gt;
&lt;BR /&gt;
options nonotes;&lt;BR /&gt;
&lt;BR /&gt;
%findprob(cough)&lt;BR /&gt;
%findprob(cough, vomiting)&lt;BR /&gt;
%findprob(cough and headache)&lt;BR /&gt;
%findprob(cough, headache and fever)&lt;BR /&gt;
%findprob(cough and fever, not vomiting)&lt;BR /&gt;
%findprob(fever, cough or headache)&lt;BR /&gt;
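&lt;BR /&gt;
/* Sketch, not part of the original program: a conditional probability is the&lt;BR /&gt;
   joint probability divided by the probability of the conditioning event, so&lt;BR /&gt;
   the estimate for cough given headache can be cross-checked without a where&lt;BR /&gt;
   clause. The column alias is an assumption. The result should agree with&lt;BR /&gt;
   %findprob(cough, headache); this extra step is not reflected in the log&lt;BR /&gt;
   excerpt that follows. */&lt;BR /&gt;
proc sql;&lt;BR /&gt;
	select avg(cough and headache) / avg(headache) as prob_cough_given_headache&lt;BR /&gt;
	from test;&lt;BR /&gt;
quit;&lt;BR /&gt;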
[/pre]&lt;BR /&gt;
&lt;BR /&gt;
The following appears on the log when the program is run:&lt;BR /&gt;
&lt;BR /&gt;
[pre]&lt;BR /&gt;
396  %findprob(cough)&lt;BR /&gt;
The probability of cough is 0.5.&lt;BR /&gt;
397  %findprob(cough, vomiting)&lt;BR /&gt;
The probability of cough given vomiting is 1.&lt;BR /&gt;
398  %findprob(cough and headache)&lt;BR /&gt;
The probability of cough and headache is 0.25.&lt;BR /&gt;
399  %findprob(cough, headache and fever)&lt;BR /&gt;
The probability of cough given headache and fever is 1.&lt;BR /&gt;
400  %findprob(cough and fever, not vomiting)&lt;BR /&gt;
The probability of cough and fever given not vomiting is 0.333333.&lt;BR /&gt;
401  %findprob(fever, cough or headache)&lt;BR /&gt;
The probability of fever given cough or headache is 0.666667.&lt;BR /&gt;
[/pre]</description>
      <pubDate>Mon, 18 Oct 2010 17:26:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Joint-probabilities/m-p/15662#M2085</guid>
      <dc:creator>polingjw</dc:creator>
      <dc:date>2010-10-18T17:26:02Z</dc:date>
    </item>
  </channel>
</rss>

