Solved: Likelihood Ratio Test for ARIMA Models

sasalex2024 · Posted 12-28-2024 10:39 AM

Dear SAS Community,

I would appreciate your assistance with a question regarding the discrimination between two ARIMA models using a likelihood ratio test.

Suppose, the full model is as follows:

proc arima data=a;
   identify var=zt;
   estimate p=3 q=0 method=ml;
run;

The restricted model is:

proc arima data=a; 
identify var=zt; 
estimate p=(1 3) q=0 method=ml; 
run;

I understand that one approach is to manually compute the test statistic using the formula:

$LR=2*(LL_Full−LL_Restricted)$ and then compare it with the chi-squared distribution with the appropriate degrees of freedom.

However, I was wondering if there is a more efficient or automated way to perform this test within SAS (for any arbitrary full vs restricted ARIMA models), rather than calculating it manually.

Any guidance or suggestions would be greatly appreciated!

Thank you in advance for your help.

Rick_SAS · Posted 12-30-2024 05:58 AM

Apologies. Try this one:

/* LR test: If 'Pr > ChiSq' value exceeds 0.05, the model with more parameters is preferred */
data LRTEST;    
   label LLFull="LL(Full)" LLRed="LL(Reduced)" DF="DF" 
         LR_Test='LR_Stat' pValue = 'Pr > ChiSq';
   format pValue PVALUE6.4;

   retain LLFull LLRed LR_Test DF pValue; 
   merge LLFULL(keep=_STAT_ _VALUE_ rename=(_VALUE_=valueFull))
       LLRED(keep=_STAT_ _VALUE_ rename=(_VALUE_=valueRed));

   if _STAT_ = "LOGLIK" then do;
     LLFull = valueFull;
     LLRed = valueRed; 
     LR_Test = 2 * abs(LLFull - LLRed);
   end;
   else if _STAT_ = "NPARMS" then do
     DF = valueFull - valueRed;
     pValue = sdf("ChiSq", LR_Test, DF); 
     output;
   end;
   drop _STAT_ valueFull valueRed;
run;

View solution in original post

Ksharp · Posted 12-28-2024 08:14 PM

You could save these Likelihood and perform this test by code(or SAS/IML).
@Rick_SAS blog could give you a hand:
https://blogs.sas.com/content/iml/2024/03/27/likelihood-ratio-test.html

sasalex2024 · Posted 12-29-2024 12:14 AM

Dear Ksharp,

Thank you very much for the prompt and insightful reply. I've used the modified version of the code you referred to, and though my version perhaps is not the most efficient one, it seems to work. Do you think it can be accepted as a solution? Here is the code below.

/* ARIMA Model 1 (with more parameters) */
proc arima data=a plots=none;
identify var=zt;
estimate p=3 q=0 method=ml outstat=LLFull;
run;

/* ARIMA Model 2 (with fewer parameters) */
proc arima data=a plots=none;
identify var=zt;
estimate p=(1  3) q=0 method=ml outstat=LLRed;
run;

/* NOTE: If 'Pr > ChiSq' value exceeds 0.05, the model with more parameters is preferred */
data LRTEST;    
    retain LR_Test; 
    merge LLFULL(keep=_VALUE_ rename=(_VALUE_=valueFull))
          LLRED(keep=_VALUE_ rename=(_VALUE_=valueRed));
    
    if _N_ = 3 then
        LR_Test = 2 * abs(valueFull - valueRed);
    if _N_ = 6 then do
        DF = valueFull - valueRed;
        pValue = sdf("ChiSq", LR_Test, DF); 
    end;
    
    label LR_Test='LR_Stat' pValue = 'Pr > ChiSq';
    drop valueFull valueRed DF;
run;

proc print data=LRTEST (firstobs=6 obs=6) label noobs;
run;

Ksharp · Posted 12-29-2024 01:14 AM

Your code looks good. Maybe @Rick_SAS could answer your question.

Rick_SAS · Posted 12-29-2024 06:20 AM

Your code looks correct. I suggest a few minor changes for readability and efficiency:

You don't need to call PROC ARIMA twice. Call it once and use two ESTIMATE statements.
Use readable code such as IF _STAT_ = "LOGLIK" instead of using IF _N_ = 3.
Include an OUTPUT statement so the final data set has one row.

data A;
call streaminit(123);
do i = 1 to 100;
   zt = rand("Normal");
   output;
end;
run;


/* ARIMA Models */
proc arima data=a plots=none;
identify var=zt;
Full: estimate p=3 q=0 method=ml outstat=LLFull;
Reduced: estimate p=(1  3) q=0 method=ml outstat=LLRed;
run;
quit;

/* NOTE: If 'Pr > ChiSq' value exceeds 0.05, the model with more parameters is preferred */
data LRTEST;    
   label valueFull="LL(Full)" valueRed="LL(Reduced)" DF="DF" 
         LR_Test='LR_Stat' pValue = 'Pr > ChiSq';
   format pValue PVALUE6.4;

   retain LR_Test DF pValue; 
   merge LLFULL(keep=_STAT_ _VALUE_ rename=(_VALUE_=valueFull))
       LLRED(keep=_STAT_ _VALUE_ rename=(_VALUE_=valueRed));

   if _STAT_ = "LOGLIK" then
     LR_Test = 2 * abs(valueFull - valueRed);
   else if _STAT_ = "NPARMS" then do
     DF = valueFull - valueRed;
     pValue = sdf("ChiSq", LR_Test, DF); 
     output;
   end;
   drop _STAT_;
run;

proc print data=LRTEST label noobs;
run;

sasalex2024 · Posted 12-29-2024 09:16 AM

Thank you very much Rick_SAS, this is way better than mine. The only thing is, the printed table at the end doesn't display LL values (LOGLIK) in the 1st two columns, but apparently the number of paramaters in those model.

Rick_SAS · Posted 12-30-2024 05:58 AM

Apologies. Try this one:

/* LR test: If 'Pr > ChiSq' value exceeds 0.05, the model with more parameters is preferred */
data LRTEST;    
   label LLFull="LL(Full)" LLRed="LL(Reduced)" DF="DF" 
         LR_Test='LR_Stat' pValue = 'Pr > ChiSq';
   format pValue PVALUE6.4;

   retain LLFull LLRed LR_Test DF pValue; 
   merge LLFULL(keep=_STAT_ _VALUE_ rename=(_VALUE_=valueFull))
       LLRED(keep=_STAT_ _VALUE_ rename=(_VALUE_=valueRed));

   if _STAT_ = "LOGLIK" then do;
     LLFull = valueFull;
     LLRed = valueRed; 
     LR_Test = 2 * abs(LLFull - LLRed);
   end;
   else if _STAT_ = "NPARMS" then do
     DF = valueFull - valueRed;
     pValue = sdf("ChiSq", LR_Test, DF); 
     output;
   end;
   drop _STAT_ valueFull valueRed;
run;

Likelihood Ratio Test for ARIMA Models

Re: Likelihood Ratio Test for ARIMA Models

Re: Likelihood Ratio Test for ARIMA Models

Re: Likelihood Ratio Test for ARIMA Models

Re: Likelihood Ratio Test for ARIMA Models

Re: Likelihood Ratio Test for ARIMA Models

Re: Likelihood Ratio Test for ARIMA Models

Re: Likelihood Ratio Test for ARIMA Models

Registration is open