<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: multiple linear regression in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356904#M18728</link>
    <description>&lt;P&gt;How many observations do you have?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Are all your variables continuous or did you create any indicator variables?&lt;/P&gt;</description>
    <pubDate>Mon, 08 May 2017 15:30:19 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2017-05-08T15:30:19Z</dc:date>
    <item>
      <title>multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356899#M18727</link>
      <description>&lt;P&gt;I use this code to do multiple linear regression:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC REG DATA=WORK.For_Reg
	PLOTS(maxpoints=10000)=ALL
;
Linear_Regression_Model:
	MODEL Ln_Amount = ABDOM_HERNIA ADD_PROC ADV_DIABETES BLEED_DISORDERS BR_PR_COL_GI_CANCER
		CHF_CARDIOMYO_VALVDIS CHRON_RENAL_FAIL CONVULS CP_MS_OTHER DEP_BIPOLAR_PARA DIAG_PROC DIAG_WO_ADD DIAG_W_ADD
		DIG_CONG_ANOM DIVERTICULITIS DRUG_REACT DVT_PE GASTRDUO_ULCER GAST_DUODENITIS GI_BLEED GOUT HEP_CIRR_OTR_LIVER IBS
		INTEST_INF INTEST_OBSTR MILD_SLEEP_APNEA MORBID_OBESITY NUTR_OTR_ANEMIA OBESITY RESP_FAILURE RHEU_ARTH SCREEN_WO_ADD
		SCREEN_W_ADD SHOCK_SYNC STROKE_PARAL TOBACCO
		/ SELECTION=BACKWARD
		SLS=0.15
		INCLUDE=0
		STB CORRB CLB
		PCORR1 PCORR2
		ALPHA=0.1
		COLLIN
	;
RUN;

QUIT;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size="3"&gt;I put in 36 independent variables, but SAS dropped&amp;nbsp;3 variables at the beginning, at step 0. Why did SAS do that? Please see the output.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size="3"&gt;The output shows:&lt;/FONT&gt;&lt;/P&gt;
&lt;P class="proctitle"&gt;&lt;FONT color="#FF0000"&gt;&lt;STRONG&gt;&lt;SPAN class="proctitle"&gt;The model is not of full rank. A subset of the model which is of full rank is chosen.&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P class="proctitle"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 08 May 2017 15:32:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356899#M18727</guid>
      <dc:creator>zhuxiaoyan1</dc:creator>
      <dc:date>2017-05-08T15:32:48Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356904#M18728</link>
      <description>&lt;P&gt;How many observations do you have?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Are all your variables continuous or did you create any indicator variables?&lt;/P&gt;</description>
      <pubDate>Mon, 08 May 2017 15:30:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356904#M18728</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2017-05-08T15:30:19Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356906#M18729</link>
      <description>&lt;P&gt;This is based upon the mathematics of linear regression.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;When the message says that the model is not full rank, this means that three of your variables are identical to linear combinations of the other variables, and so cannot be estimated.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You need to determine why your variables are identical to linear combinations of the other variables, and then delete variables to eliminate this redundancy.&lt;/P&gt;</description>
      <pubDate>Mon, 08 May 2017 15:31:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356906#M18729</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2017-05-08T15:31:36Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356913#M18731</link>
      <description>&lt;P&gt;I also add my usual advice against Stepwise regression, and that when you have 36 independent variables, you'd be much better off performing Partial Least Squares regression (PROC PLS), rather than stepwise or ordinary least squares regression.&lt;/P&gt;
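&lt;P&gt;A minimal sketch of what that could look like, assuming the same WORK.For_Reg data set and predictor list as in the original post. The CV=SPLIT cross-validation setting is only one reasonable way to choose the number of factors, not something specific to this data:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: PLS fit with split-sample cross-validation */
PROC PLS DATA=WORK.For_Reg METHOD=PLS CV=SPLIT;
	MODEL Ln_Amount = ABDOM_HERNIA ADD_PROC ADV_DIABETES BLEED_DISORDERS BR_PR_COL_GI_CANCER
		CHF_CARDIOMYO_VALVDIS CHRON_RENAL_FAIL CONVULS CP_MS_OTHER DEP_BIPOLAR_PARA DIAG_PROC DIAG_WO_ADD DIAG_W_ADD
		DIG_CONG_ANOM DIVERTICULITIS DRUG_REACT DVT_PE GASTRDUO_ULCER GAST_DUODENITIS GI_BLEED GOUT HEP_CIRR_OTR_LIVER IBS
		INTEST_INF INTEST_OBSTR MILD_SLEEP_APNEA MORBID_OBESITY NUTR_OTR_ANEMIA OBESITY RESP_FAILURE RHEU_ARTH SCREEN_WO_ADD
		SCREEN_W_ADD SHOCK_SYNC STROKE_PARAL TOBACCO;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The cross-validation table in the output is one place to look when deciding how many factors to keep.&lt;/P&gt;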
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you search for "problems with stepwise regression" you will find lots of authors writing on this topic. PLS has been shown to provide better estimates of slopes and predicted values than OLS/stepwise (better meaning lower mean squared error), see &lt;A href="http://amstat.tandfonline.com/doi/abs/10.1080/00401706.1993.10485033" target="_blank"&gt;http://amstat.tandfonline.com/doi/abs/10.1080/00401706.1993.10485033&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 08 May 2017 15:42:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356913#M18731</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2017-05-08T15:42:41Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356916#M18732</link>
      <description>I have 8048 observations. My independent variables are indicator variables, not continuous, but my dependent variable is continuous. I took the logarithm of the Amount, as you can see from the name Ln_Amount. This is actually a Poisson regression.</description>
      <pubDate>Mon, 08 May 2017 15:55:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356916#M18732</guid>
      <dc:creator>zhuxiaoyan1</dc:creator>
      <dc:date>2017-05-08T15:55:50Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356917#M18733</link>
      <description>&lt;P&gt;ln_amount could easily be loan amount; we can't make inferences based on variable names.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If your variables are indicator variables, did you remember to exclude one level for each categorical variable?&lt;/P&gt;
&lt;P&gt;That is, for a variable with 6 categories you should have only 5 indicator variables.&lt;/P&gt;
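&lt;P&gt;As a rough illustration only (the data set WORK.Claims and the variable REGION with levels North, East and West are made up for this example; they are not from your data):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical example data: REGION has three levels */
DATA WORK.Claims;
	INPUT region $ amount;
	DATALINES;
North 120
East 95
West 143
East 101
;
RUN;

/* Create indicators for every level except one (North is the reference level) */
DATA WORK.Claims_Ind;
	SET WORK.Claims;
	region_east = (region = 'East');
	region_west = (region = 'West');
RUN;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With an intercept in the model, adding a third indicator for North would make the three flags sum to 1 on every row, which is exactly the kind of linear dependency that triggers the message.&lt;/P&gt;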
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To determine which variable is the issue, you can run PROC FREQ on pairs of them to do the comparisons.&lt;/P&gt;</description>
      <pubDate>Mon, 08 May 2017 16:00:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356917#M18733</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2017-05-08T16:00:44Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356924#M18734</link>
      <description>All my independent variables are flags coded as 0 or 1. The Amount is the Allowed Amount. My data is health insurance claim data. The insurance company has contracts with providers that set how much they can charge for a particular procedure.</description>
      <pubDate>Mon, 08 May 2017 16:19:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356924#M18734</guid>
      <dc:creator>zhuxiaoyan1</dc:creator>
      <dc:date>2017-05-08T16:19:28Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356927#M18735</link>
      <description>I checked. Only one variable is identical to another one in the model. But I don't know what happened to the other two.</description>
      <pubDate>Mon, 08 May 2017 16:31:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356927#M18735</guid>
      <dc:creator>zhuxiaoyan1</dc:creator>
      <dc:date>2017-05-08T16:31:13Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356932#M18736</link>
      <description>&lt;P&gt;You have two questions.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. You get the error because you have categorical variables that end up being identical.&lt;/P&gt;
&lt;P&gt;2. Regarding why it dropped three right off the bat: missing data in those variables, or the p-value is greater than 0.15, since you set SLS to 0.15 and backward elimination removes variables whose p-value exceeds that level. It should produce a table that shows the reason at each step.&lt;/P&gt;</description>
      <pubDate>Mon, 08 May 2017 16:43:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356932#M18736</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2017-05-08T16:43:33Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356935#M18738</link>
      <description>SAS dropped a total of 14 variables, but it only shows 11 of them. The other 3 variables were also dropped, but the output does not show them. That's why I asked this question.</description>
      <pubDate>Mon, 08 May 2017 16:48:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356935#M18738</guid>
      <dc:creator>zhuxiaoyan1</dc:creator>
      <dc:date>2017-05-08T16:48:46Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356937#M18739</link>
      <description>&lt;P&gt;Post your output and log then, otherwise we're trying to hit a piñata blindfolded.&lt;/P&gt;</description>
      <pubDate>Mon, 08 May 2017 16:51:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356937#M18739</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2017-05-08T16:51:49Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356940#M18740</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/104137"&gt;@zhuxiaoyan1&lt;/a&gt; wrote:&lt;BR /&gt;I checked. Only one variable is identical to another one in the model. But I don't know what happened to the other two.&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The variables that are causing the problem are identical to &lt;EM&gt;linear combinations&lt;/EM&gt; of other variables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So if a*X5+b*X6+c*X14 = X1 for some real values of a, b and c, then you will get the same error, even though no two variables are identical.&lt;/P&gt;
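&lt;P&gt;A small made-up example shows the same thing (the data set WORK.Toy and the variables x1, x2, x3 and y are invented here). x3 is built as x1 + x2, so it is not identical to either flag, yet it carries no information of its own:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Toy data: x3 is constructed as x1 + x2, so the model is rank deficient */
DATA WORK.Toy;
	CALL STREAMINIT(2017);
	DO i = 1 TO 200;
		x1 = RAND('BERNOULLI', 0.3);
		x2 = (x1 = 0) * RAND('BERNOULLI', 0.4);  /* x1 and x2 are never both 1 */
		x3 = x1 + x2;                            /* exact linear combination */
		y  = 1 + 2*x1 - x2 + RAND('NORMAL');
		OUTPUT;
	END;
	DROP i;
RUN;

PROC REG DATA=WORK.Toy;
	MODEL y = x1 x2 x3;
RUN;
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Run without a SELECTION= option, PROC REG should note that the model is not full rank and report which parameter was set to 0, along with the linear combination involved.&lt;/P&gt;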
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As I said, this modeling would be much better handled by Partial Least Squares regression (PROC PLS) where these (and other)&amp;nbsp;problems go away.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;Editor’s Note: This video showing how to build multiple linear regression models with and without interaction may be helpful as well.&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://www.youtube.com/watch?v=gA1UEjxH-Ic" target="_blank"&gt;https://www.youtube.com/watch?v=gA1UEjxH-Ic&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 10 Nov 2020 16:47:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356940#M18740</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-11-10T16:47:45Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356965#M18744</link>
      <description>Thank you very much for your help! Is there a way to tell whether a variable is identical to a linear combination of other variables? Thanks again!</description>
      <pubDate>Mon, 08 May 2017 18:20:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356965#M18744</guid>
      <dc:creator>zhuxiaoyan1</dc:creator>
      <dc:date>2017-05-08T18:20:14Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356969#M18745</link>
      <description>&lt;P&gt;Use PROC FREQ: if two variables are identical, the cross-tabulation will have entries only along the diagonal.&lt;/P&gt;
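&lt;P&gt;For example, a sketch for one pair of flags (OBESITY and MORBID_OBESITY are picked from your MODEL statement only as an illustration; I don't know which pair is actually the duplicate):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Cross-tabulate two 0/1 flags; identical flags fill only the (0,0) and (1,1) cells */
PROC FREQ DATA=WORK.For_Reg;
	TABLES OBESITY*MORBID_OBESITY / NOROW NOCOL NOPERCENT;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;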
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It also helps to know your data: sometimes the same variable is defined in multiple ways and both versions are included, or two procedures always go together, so including both doesn't make sense.&lt;/P&gt;</description>
      <pubDate>Mon, 08 May 2017 18:30:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356969#M18745</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2017-05-08T18:30:35Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356971#M18746</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;Use PROC FREQ: if two variables are identical, the cross-tabulation will have entries only along the diagonal.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It also helps to know your data: sometimes the same variable is defined in multiple ways and both versions are included, or two procedures always go together, so including both doesn't make sense.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;This might find cases where X1=X2, but it won't find cases where a*X1+b*X2+c*X3 = d*X4+e*X5.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;PROC PRINCOMP should work; in the above example it will find cases where a*X1+b*X2+c*X3-d*X4-e*X5 has zero variability (or, in PROC PRINCOMP language, the component has a zero eigenvalue). And as I have said repeatedly, the use of Partial Least Squares (PROC PLS) avoids this complication entirely.&lt;/P&gt;
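&lt;P&gt;A sketch of that check, assuming the same WORK.For_Reg data set and predictors as in the original post. The COV option is my choice here, so that the analysis is done on the covariance matrix of the 0/1 flags:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: look for components whose eigenvalue is (numerically) zero */
PROC PRINCOMP DATA=WORK.For_Reg COV;
	VAR ABDOM_HERNIA ADD_PROC ADV_DIABETES BLEED_DISORDERS BR_PR_COL_GI_CANCER
		CHF_CARDIOMYO_VALVDIS CHRON_RENAL_FAIL CONVULS CP_MS_OTHER DEP_BIPOLAR_PARA DIAG_PROC DIAG_WO_ADD DIAG_W_ADD
		DIG_CONG_ANOM DIVERTICULITIS DRUG_REACT DVT_PE GASTRDUO_ULCER GAST_DUODENITIS GI_BLEED GOUT HEP_CIRR_OTR_LIVER IBS
		INTEST_INF INTEST_OBSTR MILD_SLEEP_APNEA MORBID_OBESITY NUTR_OTR_ANEMIA OBESITY RESP_FAILURE RHEU_ARTH SCREEN_WO_ADD
		SCREEN_W_ADD SHOCK_SYNC STROKE_PARAL TOBACCO;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Components with an eigenvalue of (numerically) zero point to a dependency; the variables with non-zero coefficients in the corresponding eigenvector are the ones involved.&lt;/P&gt;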
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 08 May 2017 18:54:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/356971#M18746</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2017-05-08T18:54:45Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/357153#M18758</link>
      <description>&lt;P&gt;Adding to my comment above&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;
&lt;P&gt;&amp;nbsp;And as I have said repeatedly, the use of Partial Least Squares (PROC PLS) avoids this complication entirely.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;And as I have said repeatedly, the use of Partial Least Squares (PROC PLS) avoids this complication entirely, &lt;FONT color="#ff0000"&gt;and has many other benefits when you are trying to model with 36 independent variables&lt;/FONT&gt;.&lt;/P&gt;</description>
      <pubDate>Tue, 09 May 2017 12:23:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/357153#M18758</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2017-05-09T12:23:22Z</dc:date>
    </item>
    <item>
      <title>Re: multiple linear regression</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/357183#M18761</link>
      <description>I'll try proc pls. Thank you very much!</description>
      <pubDate>Tue, 09 May 2017 13:57:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/multiple-linear-regression/m-p/357183#M18761</guid>
      <dc:creator>zhuxiaoyan1</dc:creator>
      <dc:date>2017-05-09T13:57:44Z</dc:date>
    </item>
  </channel>
</rss>

