<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: SAS code for two-part model for healthcare costs in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/SAS-code-for-two-part-model-for-healthcare-costs/m-p/878489#M43435</link>
    <description>&lt;P&gt;See &lt;A href="http://support.sas.com/kb/68202" target="_self"&gt;this note&lt;/A&gt; on modeling continuous response data with many zeros. The Tweedie distribution is commonly used since it can accommodate positive data with many zeros. Also, as mentioned in the note, PROC HPCDM in SAS/ETS fits a compound model to loss data consisting of both probability and amount of loss.&lt;/P&gt;</description>
    <pubDate>Wed, 31 May 2023 16:41:19 GMT</pubDate>
    <dc:creator>StatDave</dc:creator>
    <dc:date>2023-05-31T16:41:19Z</dc:date>
    <item>
      <title>SAS code for two-part model for healthcare costs</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/SAS-code-for-two-part-model-for-healthcare-costs/m-p/878417#M43434</link>
      <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am conducting a cost analysis using claims data that contain a large number of zeros. I would like to use the two-part model, but I am not familiar with the code. I have performed the Modified Park test and found that a gamma distribution with a log link fits best.&lt;/P&gt;&lt;P&gt;As far as I know, the first step is to fit a logistic regression to a binary indicator (0 if the cost is zero, 1 if the cost is non-zero) and get predicted values from this model. I am not sure what to do next once I have the predicted values.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I referred to this article:&lt;/P&gt;&lt;P&gt;&lt;A href="https://support.sas.com/resources/papers/proceedings15/3600-2015.pdf" target="_blank"&gt;https://support.sas.com/resources/papers/proceedings15/3600-2015.pdf&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help on this would be appreciated. Thanks!&lt;/P&gt;</description>
      <pubDate>Wed, 31 May 2023 13:43:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/SAS-code-for-two-part-model-for-healthcare-costs/m-p/878417#M43434</guid>
      <dc:creator>SSK_011523</dc:creator>
      <dc:date>2023-05-31T13:43:29Z</dc:date>
    </item>
    <item>
      <title>Re: SAS code for two-part model for healthcare costs</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/SAS-code-for-two-part-model-for-healthcare-costs/m-p/878489#M43435</link>
      <description>&lt;P&gt;See &lt;A href="http://support.sas.com/kb/68202" target="_self"&gt;this note&lt;/A&gt; on modeling continuous response data with many zeros. The Tweedie distribution is commonly used since it can accommodate positive data with many zeros. Also, as mentioned in the note, PROC HPCDM in SAS/ETS fits a compound model to loss data consisting of both probability and amount of loss.&lt;/P&gt;</description>
      <pubDate>Wed, 31 May 2023 16:41:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/SAS-code-for-two-part-model-for-healthcare-costs/m-p/878489#M43435</guid>
      <dc:creator>StatDave</dc:creator>
      <dc:date>2023-05-31T16:41:19Z</dc:date>
    </item>
    <item>
      <title>Re: SAS code for two-part model for healthcare costs</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/SAS-code-for-two-part-model-for-healthcare-costs/m-p/878974#M43462</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you so much for your response.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I don't think I can use the Tweedie distribution when my data follow a gamma distribution (variance is proportional to the square of the mean), as confirmed by the Modified Park test. Most studies analyzing health care costs have used a GLM (gamma distribution with log link) because it accommodates heteroskedasticity.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This article explains the two-part model on page 490 but doesn't provide SAS code:&amp;nbsp;&lt;A href="https://www.annualreviews.org/doi/pdf/10.1146/annurev-publhealth-040617-013517" target="_blank"&gt;https://www.annualreviews.org/doi/pdf/10.1146/annurev-publhealth-040617-013517&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 02 Jun 2023 12:58:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/SAS-code-for-two-part-model-for-healthcare-costs/m-p/878974#M43462</guid>
      <dc:creator>SSK_011523</dc:creator>
      <dc:date>2023-06-02T12:58:09Z</dc:date>
    </item>
    <item>
      <title>Re: SAS code for two-part model for healthcare costs</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/SAS-code-for-two-part-model-for-healthcare-costs/m-p/879072#M43463</link>
      <description>&lt;P&gt;The Tweedie model also allows for heteroscedasticity: as with the gamma, the variance is a function of the mean. It is a compound model combining Poisson and gamma distributions; see the "Details: Tweedie Distribution for Generalized Linear Models" section of the GENMOD documentation. Also, the compound model fit by PROC HPCDM *is* a two-part model for frequency of loss and severity of loss. The frequency model can be a Poisson, negative binomial, or zero-inflated model, and the loss model is automatically selected as the best fitting among many continuous distributions, including gamma. See the Getting Started example in the PROC HPCDM documentation.&lt;/P&gt;
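&lt;P&gt;As a minimal sketch of the Tweedie approach, assuming a hypothetical data set CLAIMS with a cost variable COST and illustrative covariates AGE and SEX (none of these names come from the thread), the model could be specified in PROC GENMOD roughly as follows. The zero costs stay in the data, so no separate first-part model is needed.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical names: claims, cost, age, sex */
proc genmod data=claims;
   class sex;
   /* Tweedie response with a log link; zero costs are kept in the data */
   model cost = age sex / dist=tweedie link=log;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The Tweedie variance is proportional to a power of the mean, and that power lies between 1 and 2 in the compound Poisson-gamma case, whereas the gamma distribution corresponds to a power of 2.&lt;/P&gt;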
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The paper you refer to doesn't seem to suggest a specific model. It suggests first fitting a logit or probit model to the binary response of zero or positive cost, and then fitting a GLM, based on a chosen distribution, to the positive cost responses. The first-part model, a logit or probit model for the binary response, can be fit in PROC LOGISTIC or PROC GENMOD. The second-part model, a GLM for the continuous response, can be fit using PROC GENMOD. PROC SEVERITY in SAS/ETS can automatically select the best distribution (as is done with HPCDM). The ASSESS statement in GENMOD allows for testing the adequacy of the link function and the specified form of the model.&lt;/P&gt;
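&lt;P&gt;The two parts described above can be sketched as follows. This is only an illustration under stated assumptions, not code taken from the paper: the data set CLAIMS, the response COST, the covariates AGE and SEX, and all output names are hypothetical. The two parts are combined as E[cost] = Pr(cost &amp;gt; 0) x E[cost | cost &amp;gt; 0]. A gamma distribution with a log link is used for the second part because that is what the original question settled on.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical names throughout: claims, cost, age, sex */

/* Binary indicator: 1 if any cost was incurred, 0 otherwise */
data claims2;
   set claims;
   anycost = (cost &amp;gt; 0);
run;

/* Part 1: logistic model for the probability of a positive cost */
proc logistic data=claims2;
   class sex;
   model anycost(event='1') = age sex;
   output out=part1 p=p_pos;      /* p_pos = Pr(cost &amp;gt; 0) */
run;

/* Part 2: gamma GLM with log link, fit to the positive costs only */
proc genmod data=claims2;
   where cost &amp;gt; 0;
   class sex;
   model cost = age sex / dist=gamma link=log;
   store gammafit;                /* save the fitted model for scoring */
run;

/* Score every observation, including the zeros, with the part 2 model. */
/* ILINK returns predictions on the cost scale rather than the log scale. */
proc plm restore=gammafit;
   score data=part1 out=scored predicted=mu_pos / ilink;
run;

/* Combine the two parts into the overall expected cost */
data twopart;
   set scored;
   exp_cost = p_pos * mu_pos;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The same covariates are used in both parts here only for simplicity; the two parts are fit separately, so each could use its own set of predictors.&lt;/P&gt;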
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For count responses, it mentions the two-part hurdle model, which can be fit with PROC FMM (see &lt;A href="https://support.sas.com/kb/48/506.html" target="_self"&gt;this note&lt;/A&gt;). A similar two-part count model is the zero-inflated model, which can also be fit in FMM (see "Getting Started: Modeling Zero-Inflation" in the FMM documentation) and in PROC GENMOD and PROC GAMPL. See the "Details: Zero-inflated models" section in the GENMOD documentation. FMM could be used in the same way as shown for the hurdle and zero-inflated models to fit a zero-inflated gamma model.&lt;/P&gt;</description>
      <pubDate>Fri, 02 Jun 2023 16:24:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/SAS-code-for-two-part-model-for-healthcare-costs/m-p/879072#M43463</guid>
      <dc:creator>StatDave</dc:creator>
      <dc:date>2023-06-02T16:24:58Z</dc:date>
    </item>
  </channel>
</rss>

