proc ttest question

Frequent Contributor
Posts: 105

proc ttest question

Hi guys,

I have a strange issue with proc ttest; I must be missing something. I'm using the paired t-test because I have two variables whose means I want to show are statistically equal across by groups. The issue is that it reports very low p-values, indicating rejection of the null hypothesis that the means are equal, for by groups where the variables are equal down to the 6th-7th decimal place, but higher p-values, indicating failure to reject the null hypothesis, for by groups where the variables are actually nontrivially different. So it seems like my null hypothesis is reversed. I've tried both the default options (not setting the h0 option or anything else) and explicitly specifying the test and null hypothesis. Here is my code:

v1:

proc ttest data=forttest test=diff h0=0;

  by loan_price_cat y9c;

  paired aggr_roll_on_rate*roll_on_rate;

run;

v2:

proc ttest data=forttest test=ratio h0=1;

  by loan_price_cat y9c;

  paired aggr_roll_on_rate*roll_on_rate;

run;

v3:

proc ttest data=forttest;

  by loan_price_cat y9c;

  paired aggr_roll_on_rate*roll_on_rate;

run;

For example, these are the manually calculated ratios 'aggr_roll_on_rate / roll_on_rate' for the first by group, for which all 3 versions of the code above give a p-value < .0001, rejecting h0:

1.000003427

1.0000059207

1.0000023787

1.0000021402

1.0000017226

1.0000029146

1.0000034323

1.0000064012

1.0000019977

1.0000026861

1.000001868

1.0000023016

1.0000015753

1.0000018216

1.0000020762

1.0000013303

1.0000018023

1.0000018581

1.00000197

So surely this null hypothesis should NOT be rejected here, right? Am I missing something??

-Thank you so much for your time.

Frequent Contributor
Posts: 83

Re: proc ttest question

Posted in reply to AllSoEasy

What does the data actually look like? The ratios don't really indicate anything other than that the paired groups are similar, but there may be something you are missing. If you could post even 10 lines of data that show up incorrectly when you run this, that would help. Additionally, what does the output look like when you run

proc means data = forttest;

class loan_price_cat y9c;

var aggr_roll_on_rate roll_on_rate;

run;

This will at least give us an idea of what the means of the two groups are, and if the means in this case are drastically different, that may explain why you are being told to reject H0.

Frequent Contributor
Posts: 105

Re: proc ttest question

loan_price_cat | date | month_end | Aggr_Roll_on_Rate | Y9C | Roll_On_Rate | roll_on_ratio
Commercial & Industrial Loans, SHAW Business IL Fixed | 12/31/2009 | 200912 | 4.749967444 | C&I Loans | 4.749983722 | 1.000003427
Commercial & Industrial Loans, SHAW Business IL Fixed | 01/31/2010 | 201001 | 4.749943754 | C&I Loans | 4.749971877 | 1.000005921
Commercial & Industrial Loans, SHAW Business IL Fixed | 02/28/2010 | 201002 | 4.749977403 | C&I Loans | 4.749988701 | 1.000002379
Commercial & Industrial Loans, SHAW Business IL Fixed | 03/31/2010 | 201003 | 4.749979668 | C&I Loans | 4.749989834 | 1.00000214
Commercial & Industrial Loans, SHAW Business IL Fixed | 04/30/2010 | 201004 | 4.760706983 | C&I Loans | 4.760715184 | 1.000001723
Commercial & Industrial Loans, SHAW Business IL Fixed | 05/31/2010 | 201005 | 4.749972311 | C&I Loans | 4.749986156 | 1.000002915
Commercial & Industrial Loans, SHAW Business IL Fixed | 06/30/2010 | 201006 | 4.701572491 | C&I Loans | 4.701588628 | 1.000003432
Commercial & Industrial Loans, SHAW Business IL Fixed | 07/31/2010 | 201007 | 4.749939189 | C&I Loans | 4.749969594 | 1.000006401
Commercial & Industrial Loans, SHAW Business IL Fixed | 08/31/2010 | 201008 | 4.749981022 | C&I Loans | 4.749990511 | 1.000001998
Commercial & Industrial Loans, SHAW Business IL Fixed | 09/30/2010 | 201009 | 4.749974482 | C&I Loans | 4.749987241 | 1.000002686
Commercial & Industrial Loans, SHAW Business IL Fixed | 10/31/2010 | 201010 | 4.749982254 | C&I Loans | 4.749991127 | 1.000001868
Commercial & Industrial Loans, SHAW Business IL Fixed | 11/30/2010 | 201011 | 4.749978135 | C&I Loans | 4.749989068 | 1.000002302
Commercial & Industrial Loans, SHAW Business IL Fixed | 12/31/2010 | 201012 | 4.749985035 | C&I Loans | 4.749992517 | 1.000001575
Commercial & Industrial Loans, SHAW Business IL Fixed | 01/31/2011 | 201101 | 4.749982694 | C&I Loans | 4.749991347 | 1.000001822
Commercial & Industrial Loans, SHAW Business IL Fixed | 02/28/2011 | 201102 | 4.749980277 | C&I Loans | 4.749990138 | 1.000002076
Commercial & Industrial Loans, SHAW Business IL Fixed | 03/31/2011 | 201103 | 4.749987362 | C&I Loans | 4.749993681 | 1.00000133
Commercial & Industrial Loans, SHAW Business IL Fixed | 04/30/2011 | 201104 | 4.717541853 | C&I Loans | 4.717550356 | 1.000001802
Commercial & Industrial Loans, SHAW Business IL Fixed | 05/31/2011 | 201105 | 4.749982348 | C&I Loans | 4.749991174 | 1.000001858
Commercial & Industrial Loans, SHAW Business IL Fixed | 06/30/2011 | 201106 | 4.749981285 | C&I Loans | 4.749990643 | 1.00000197
Commercial & Industrial Loans, SHAW Business IL Fixed | 07/31/2011 | 201107 | 4.749985529 | C&I Loans | 4.749992765 | 1.000001523
Commercial & Industrial Loans, SHAW Business IL Fixed | 08/31/2011 | 201108 | 4.742211513 | C&I Loans | 4.742218886 | 1.000001555
Commercial & Industrial Loans, SHAW Business IL Fixed | 09/30/2011 | 201109 | 4.745776323 | C&I Loans | 4.745782374 | 1.000001275
Commercial & Industrial Loans, SHAW Business IL Fixed | 10/31/2011 | 201110 | 4.749986761 | C&I Loans | 4.749993381 | 1.000001394

overmar, thank you for your reply. This is the first by group of data. Please let me know if this helps indicate the problem.

New Contributor
Posts: 4

Re: proc ttest question

Posted in reply to AllSoEasy

The result is correct. Do not look at the complete values but at the differences beyond the third decimal place. Remember a property of the variance of a variable: it remains invariant when a constant is added or subtracted. Analyze the original values minus 4.74 and the results must be the same. (Excuse my poor English.)
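One way to see this in SAS is to compute the per-row difference yourself and run a one-sample test on it, which is what the paired test does internally. This is only a sketch: the dataset name diffs and the variables diff, aggr_shift and roll_shift are made up for illustration, and forttest is assumed to be sorted by the by variables already, as in the original code.

```sas
/* Sketch: work on the differences and on shifted values. */
data diffs;                       /* hypothetical dataset name */
  set forttest;
  diff       = aggr_roll_on_rate - roll_on_rate;  /* per-row difference */
  aggr_shift = aggr_roll_on_rate - 4.74;          /* same constant removed */
  roll_shift = roll_on_rate - 4.74;               /* from both variables  */
run;

/* One-sample t-test on the differences: same test as the paired one. */
proc ttest data=diffs h0=0;
  by loan_price_cat y9c;
  var diff;
run;

/* Paired test on the shifted values: the constant cancels in the difference. */
proc ttest data=diffs;
  by loan_price_cat y9c;
  paired aggr_shift*roll_shift;
run;
```

Both steps should report the same t statistic and p-value as the original paired statement, since only the differences enter the test.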

This is the analysis with Excel

t-Test: Paired Two Sample for Means
 | Aggr_Roll_on_Rate | Roll_On_Rate
Mean | 4.746407657 | 4.746419083
Variance | 0.000149777 | 0.000149767
Observations | 23 | 23
Pearson correlation coefficient | 0.999999868
Hypothesized mean difference | 0
Degrees of freedom | 22
t statistic | -8.702547638
P(T<=t) one-tail | 7.0983E-09
t critical value (one-tail) | 1.717144335
P(T<=t) two-tail | 1.41966E-08
t critical value (two-tail) | 2.073873058
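
The reason the test rejects can be read off the paired t statistic, which depends only on the per-row differences and not on how close the two columns are in relative terms. A short worked version using the Excel figures above follows; the symbols d_i, \bar{d} and s_d are introduced here for illustration, and the value of s_d is backed out from the reported t statistic and the rounded means, so it is approximate.

```latex
d_i = \text{Aggr\_Roll\_on\_Rate}_i - \text{Roll\_On\_Rate}_i, \qquad n = 23

\bar{d} = 4.746407657 - 4.746419083 \approx -1.14 \times 10^{-5}

t = \frac{\bar{d}}{s_d / \sqrt{n}} = -8.7025
\quad\Longrightarrow\quad
s_d = \frac{\bar{d}\,\sqrt{n}}{t} \approx 6.3 \times 10^{-6}
```

Every difference in this by group has the same sign and they vary very little, so the mean difference sits many standard errors away from zero even though it is tiny in absolute terms.
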
Frequent Contributor
Posts: 105

Re: proc ttest question

fbabinec,

Thank you for your reply. I am interested in seeing the Excel work that produced these results. Would you mind sharing the Excel sheet with me?

Thanks!!

Super User
Posts: 19,861

Re: proc ttest question

Posted in reply to AllSoEasy

You appear to have time series data; I'm not sure a t-test is the appropriate statistical test for your data.

Frequent Contributor
Posts: 105

Re: proc ttest question

Hmmm... Is there a SAS procedure you would recommend for taking time series data and showing that the two series are essentially equal?

New Contributor
Posts: 4

Re: proc ttest question

Posted in reply to AllSoEasy

My answer focused on the apparent lack of precision of PROC TTEST, and I overlooked the serial nature of the data. I used the Excel data analysis toolpack (Tools -> Data Analysis Toolpack in older versions, Data -> Data Analysis Toolpack in Excel 2007/2010) for simplicity. I am not sure whether time series analysis would solve your problem, given the paired nature of your data. I think repeated measures models with PROC GLM or PROC MIXED would serve you better. Or use the advanced features of PROC TTEST (http://www2.sas.com/proceedings/sugi31/208-31.pdf). But your first analysis is correct.

Below is a simple example of repeated measures analysis with PROC MIXED; the statements are approximate:

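*** create a new dataset with two rows per original row, one per rate;
data newtest;
  set forttest;
  length state $4;
  state="aggr"; rate=aggr_roll_on_rate; output;
  state="end";  rate=roll_on_rate;      output;
  drop aggr_roll_on_rate roll_on_rate;
run;

*** perform the analysis (the model statements remain approximate);
proc mixed data=newtest;
  class loan_price_cat y9c date state;
  model rate = loan_price_cat*y9c state loan_price_cat*y9c*state;
  lsmeans loan_price_cat*y9c*state / slice=loan_price_cat*y9c;
  *** each date occurs once per state within a subject, so the subject;
  *** or repeated effect may need adjusting if PROC MIXED objects;
  repeated date / type=cs subject=loan_price_cat*y9c;
  *** and other covariance structures (AR(1), UN, etc.);
run;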
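
Separately from the mixed model, if the goal is to show that the two rates are practically equal rather than exactly equal, one option inside PROC TTEST is an equivalence test via the TOST option (available in SAS/STAT 9.2 and later, as far as I know). This is a minimal sketch: the bounds -0.001 and 0.001 are hypothetical and have to come from what counts as a negligible difference for these rates. It does not address the time series concern raised earlier.

```sas
/* Sketch of an equivalence (TOST) test on the paired difference. */
/* The equivalence bounds below are hypothetical placeholders.    */
proc ttest data=forttest tost(-0.001, 0.001);
  by loan_price_cat y9c;
  paired aggr_roll_on_rate*roll_on_rate;
run;
```

Here the logic is reversed relative to the ordinary test: a small p-value supports the claim that the mean difference lies inside the stated bounds.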
