<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: proc genmod: check of model fit in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/proc-genmod-check-of-model-fit/m-p/512532#M26197</link>
    <description>&lt;P&gt;DONE&lt;/P&gt;</description>
    <pubDate>Tue, 13 Nov 2018 13:40:18 GMT</pubDate>
    <dc:creator>Rick_SAS</dc:creator>
    <dc:date>2018-11-13T13:40:18Z</dc:date>
    <item>
      <title>proc genmod: check of model fit</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/proc-genmod-check-of-model-fit/m-p/512181#M26191</link>
      <description>&lt;P&gt;Hello everyone,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We are trying to fit a GLM with proc genmod.&lt;/P&gt;&lt;P&gt;The dependent variable is health cost data, and the independent variables are treatment group, age, sex, observation time, various comorbidities, and various medications.&lt;/P&gt;&lt;P&gt;To model the costs, we assumed a gamma distribution and a log link (in the meantime, we have also tried other links and distributions).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Now we would like to check the goodness of fit of the model.&lt;/P&gt;&lt;P&gt;To do this, we examined a plot of the estimated versus the observed costs and a plot of the errors versus the observed costs.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But in our opinion, both plots contradict a good model fit (see attached file). The estimated costs show no clear relationship to the observed costs, whereas the errors show a strong relationship to the observed costs.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Our question is: Are these plots a correct, plausible way to check the model fit for a GLM?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If yes, is there any way to improve the model fit?&lt;/P&gt;&lt;P&gt;We have already tried all the different link and distribution functions, as well as transformations of the cost data itself.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The cost data are heavily skewed and include zero costs as well as very high costs. But those low and high costs are of interest as well.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Our program:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc genmod data=input_data PLOTS=(PREDICTED RESCHI);
     class group sex 
	   comorbidity1 comorbidity2 ... /* all 1/0 - Variables */
	   medication1 medication2 ...  /* all 1/0 - Variables */
		;
  model cost =  group 
		age observation_time 
		sex CCI_Score
		comorbidity1 comorbidity2 ... /* all 1/0 - Variables */
		medication1 medication2 ...  /* all 1/0 - Variables */
	/ dist = gamma link = log ; 

  output out       = Residuals
         pred      = Pred
         resraw    = Resraw
         reschi    = Reschi
         ;
run;

title 'Proc genmod: Plot of estimated and residuals';
proc gplot data=residuals;
plot pred*cost Reschi*cost;
label cost='cost';
run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Thanks in advance for any answers.&lt;/P&gt;&lt;P&gt;sasstats&lt;/P&gt;</description>
      <pubDate>Mon, 12 Nov 2018 13:34:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/proc-genmod-check-of-model-fit/m-p/512181#M26191</guid>
      <dc:creator>sasstats</dc:creator>
      <dc:date>2018-11-12T13:34:57Z</dc:date>
    </item>
    <item>
      <title>Re: proc genmod: check of model fit</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/proc-genmod-check-of-model-fit/m-p/512520#M26195</link>
      <description>&lt;P&gt;I agree that these two plots do not indicate a good fit. However, when you have multiple variables, you need to be a little careful when you create plots like this. You are projecting the predicted responses onto one dimension (cost), whereas a better approach is to slice the predicted response surface. &lt;A href="https://blogs.sas.com/content/iml/2016/06/22/sas-effectplot-statement.html" target="_self"&gt;You can use the EFFECTPLOT statement (with the FIT or SLICEFIT options) to create a more effective visualization of the response surface.&lt;/A&gt; Personally, I don't think it will matter in terms of assessing fit, but the EFFECTPLOT statement is a powerful diagnostic tool that is worth learning about. It should be helpful as you refine your model.&lt;/P&gt;
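&lt;P&gt;As a rough sketch of that idea (not a definitive recipe), the following program requests two sliced fit plots. It assumes that ODS Graphics is available and, for brevity, uses a reduced model that contains only some of the variables from the original post. The choice of x-axis and slicing variables is purely illustrative.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;ods graphics on;

proc genmod data=input_data;
   class group sex;
   /* reduced model for illustration only; add the remaining effects as needed */
   model cost = group sex age CCI_Score / dist=gamma link=log;
   /* predicted cost versus age, one curve per treatment group */
   effectplot slicefit(x=age sliceby=group);
   /* predicted cost versus CCI_Score, one curve per sex */
   effectplot slicefit(x=CCI_Score sliceby=sex);
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Each plot shows a slice of the fitted response surface. The covariates that do not appear in the plot are held at fixed values (by default, the mean for continuous variables and the reference level for classification variables).&lt;/P&gt;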
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;gt;&amp;nbsp;&lt;SPAN&gt;is there any way to improve the model fit?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;We don't really have enough information to answer that question. Two possible approaches are:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;1. You can adopt a model-building approach in which you incrementally build up the model based on domain-specific knowledge and looking at the fit statistics. You might be missing interaction terms or nonlinear terms in the model. &lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;2. You can adopt a "shotgun" approach and use PROC GLMSELECT or PROC HPGENSELECT to select the model effects that best fit the data. If you choose to use variable selection, you should consider using cross-validation to avoid overfitting the data. If you aren't familiar with the model selection procedures, here are two references:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="http://support.sas.com/resources/papers/proceedings16/SAS4900-2016.pdf" target="_self"&gt;"Statistical Model Building for Large, Complex Data: Five New Directions in SAS/STAT® Software"&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://support.sas.com/resources/papers/proceedings15/SAS1742-2015.pdf" target="_self"&gt;"Introducing the HPGENSELECT Procedure: Model Selection for Generalized Linear Models and More"&lt;/A&gt;&lt;/P&gt;
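&lt;P&gt;To make the second approach concrete, here is a minimal sketch that uses PROC HPGENSELECT. The reduced effect list, the group*age interaction, and the use of SBC as the selection criterion are illustrative assumptions, not a recommended model for these data.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc hpgenselect data=input_data;
   class group sex;
   /* candidate effects, including one hypothetical interaction term */
   model cost = group sex age CCI_Score group*age / dist=gamma link=log;
   /* stepwise selection; the final model is chosen by SBC */
   selection method=stepwise(choose=sbc);
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The selected effects can then be refit in PROC GENMOD so that the usual residual diagnostics and effect plots are available.&lt;/P&gt;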
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Nov 2018 13:39:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/proc-genmod-check-of-model-fit/m-p/512520#M26195</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2018-11-13T13:39:52Z</dc:date>
    </item>
    <item>
      <title>Re: proc genmod: check of model fit</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/proc-genmod-check-of-model-fit/m-p/512529#M26196</link>
      <description>&lt;P&gt;Hello Rick_SAS,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for your answer and the helpful hints.&lt;/P&gt;&lt;P&gt;There may well be interactions between our independent variables. We will have to check this.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We do have one request:&lt;/P&gt;&lt;P&gt;Could you please give the correct link for your first reference, if possible?&lt;/P&gt;&lt;P&gt;At the moment, it leads to the same paper as your second reference.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you very much&lt;/P&gt;&lt;P&gt;sasstats&lt;/P&gt;</description>
      <pubDate>Tue, 13 Nov 2018 13:23:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/proc-genmod-check-of-model-fit/m-p/512529#M26196</guid>
      <dc:creator>sasstats</dc:creator>
      <dc:date>2018-11-13T13:23:29Z</dc:date>
    </item>
    <item>
      <title>Re: proc genmod: check of model fit</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/proc-genmod-check-of-model-fit/m-p/512532#M26197</link>
      <description>&lt;P&gt;DONE&lt;/P&gt;</description>
      <pubDate>Tue, 13 Nov 2018 13:40:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/proc-genmod-check-of-model-fit/m-p/512532#M26197</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2018-11-13T13:40:18Z</dc:date>
    </item>
  </channel>
</rss>

