Testing Difference in Mean Between Two Specific Periods - 'MULTIPLE TIMES'.

Contributor
Posts: 33


Hi, 

 

I need to test the difference in mean between two time periods in my data set. However, I need to carry out this test multiple times.

 

Specifically, for each 'date' in my data, I need to test the difference between the mean of 'X1' over the 30 days before that date and the mean over the 30 days after it.

 

The code below works perfectly, but I need to manually specify each date (which is obviously too cumbersome). I would sincerely appreciate some suggestions!

 

 

Present Code:

/* date of interest: 16May2001 */

data xx;
    set x;

    Startdate1 = '16May2001'd - 30;
    Enddate1   = '16May2001'd;

    Startdate2 = '17May2001'd;
    Enddate2   = '17May2001'd + 30;

    if Startdate1 < date < Enddate1 then flag = 0;
    if Startdate2 < date < Enddate2 then flag = 1;
run;

proc ttest data=xx;
    class flag;
    var X1;
run;

 

 

 

Sample Data:

 

(Date format: month/day/year)

 

Date          X1
1/2/2001      22
5/16/2001     34
5/26/2001     65
5/27/2001     11
6/8/2001      56
10/11/2001    31
12/12/2001    91
1/13/2004     49
7/18/2005     56
9/19/2006     21
12/2/2007     67
12/21/2007    34
12/29/2007    67
1/23/2008     90
7/14/2008     78
9/25/2008     45
6/28/2009     45
1/29/2010     54

 

 

Thank you in anticipation!

Esteemed Advisor
Posts: 5,521

Re: Testing Difference in Mean Between Two Specific Periods - 'MULTIPLE TIMES'.

Here is one approach using a self join:

 

data have;
input Date :mmddyy. X1; 
format date yymmdd10.;
datalines;
1/2/2001 22 
5/16/2001 34 
5/26/2001 65 
5/27/2001 11 
6/8/2001 56 
10/11/2001 31 
12/12/2001 91 
1/13/2004 49 
7/18/2005 56 
9/19/2006 21 
12/2/2007 67 
12/21/2007 34 
12/29/2007 67 
1/23/2008 90 
7/14/2008 78 
9/25/2008 45 
6/28/2009 45 
1/29/2010 54 
;

proc sql;
create table test as
select
    a.date as middate,
    b.date > a.date as period,
    b.date,
    b.x1
from 
    have as a inner join
    have as b 
      on b.date between intnx('day',a.date,-30) and intnx('day',a.date,30) and
      a.date ne b.date
group by middate
having count(distinct period) = 2
order by middate, date;
quit;

ods select none;

proc ttest data=test plots=none;
ods output ttests=tt;
by middate;
class period;
var x1;
run;

ods select all;

proc print data=tt; run;
PG
Contributor
Posts: 33

Re: Testing Difference in Mean Between Two Specific Periods - 'MULTIPLE TIMES'.

Thank you for your response, PGStats.

 

Is there a way to do this without the 'datalines' in the code? My actual data has close to 500 observations and the 'datalines approach' would be kinda clumsy for me.

 

 

Esteemed Advisor
Posts: 5,521

Re: Testing Difference in Mean Between Two Specific Periods - 'MULTIPLE TIMES'.

Of course, there are many ways. The first step in the code above is there only for testing. Create your have dataset with a data step using infile, with a SQL query, or by any other means, as long as it contains the date and x1 variables.
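
For instance, here is a minimal sketch of two such ways. The first assumes the data already sit in a SAS dataset named xx (the name used earlier in this thread) with a numeric SAS date variable called date. The second assumes a space-delimited text file with one header line; its path and layout are purely hypothetical. Use one or the other, not both.

/* Option 1: copy the two variables from an existing SAS dataset (assumed to be work.xx) */
data have;
    set xx (keep=date x1);
    format date yymmdd10.;
run;

/* Option 2: read a raw text file; the path and layout are hypothetical */
data have;
    infile "C:\mydata\x1.txt" firstobs=2;   /* firstobs=2 skips the header line */
    input date :mmddyy10. x1;
    format date yymmdd10.;
run;

Either step takes the place of the datalines step; the proc sql and proc ttest steps then run unchanged against have. An input statement is needed only when reading raw text (infile or datalines), not when reading an existing dataset with set.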

PG
Contributor
Posts: 33

Re: Testing Difference in Mean Between Two Specific Periods - 'MULTIPLE TIMES'.

Hi PGStats,

 

Unfortunately, I am still struggling with the code you supplied (please pardon my basic SAS knowledge).

 

I am using the dataset approach because I have already created a dataset from my main data file. I have over a thousand observations, so I don't think the 'datalines approach' would work well for me.

 

However, I still have lots of errors in the log file. I am not too sure what is wrong.

 

 

See the modified code I used below:

 

data xxx;
    set xx;

    input Date :mmddyy. X1;
    format date yymmdd10.;

proc sql;
create table test as
select
    a.date as middate,
    b.date > a.date as period,
    b.date,
    b.x1
from
    have as a inner join
    have as b
      on b.date between intnx('day',a.date,-30) and intnx('day',a.date,30) and
      a.date ne b.date
group by middate
having count(distinct period) = 2
order by middate, date;
quit;

ods select none;

proc ttest data=test plots=none;
ods output ttests=tt;
by middate;
class period;
var x1;
run;

ods select all;

proc print data=tt; run;

 

 

See some errors from log file:

 

ERROR: No DATALINES or INFILE statement.
NOTE: The SAS System stopped processing this step because of errors.

 

 

 

ERROR: Variable MIDDATE not found.

 

 

 

ERROR: File WORK.TT.DATA does not exist

 

Esteemed Advisor
Posts: 5,521

Re: Testing Difference in Mean Between Two Specific Periods - 'MULTIPLE TIMES'.

The first thing you must do is to bring your data into a SAS dataset. That is often the first difficulty that new users encounter. Where is your main data file?
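
A quick way to check whether the data are already in a usable SAS dataset, sketched here on the assumption that the dataset is work.xx as in your code:

/* List the variables with their types and formats; xx is the assumed dataset name */
proc contents data=xx; run;

/* Look at the first few rows */
proc print data=xx (obs=5); run;

If date is listed as a numeric variable (ideally with a date format) and x1 as numeric, the data are already imported and no input statement is needed. The message "No DATALINES or INFILE statement" is raised because the data step contains an input statement, which only reads raw text; a step that reads an existing dataset with set does not need one. Note also that your data step creates xxx, whereas the proc sql step reads from have, so the two names need to match. The later errors in the log are probably consequences of these earlier ones.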

PG
Contributor
Posts: 33

Re: Testing Difference in Mean Between Two Specific Periods - 'MULTIPLE TIMES'.

I have attached the main data file herein.

 

 

Contributor
Posts: 33

Re: Testing Difference in Mean Between Two Specific Periods - 'MULTIPLE TIMES'.

I doubt I had issues with the data import. My main challenge was with conducting the 'proc t test' multiple times - using the dates in my data file as a reference.

 

I have attached my full code herein.

 

Thank you for your help, PGStats.

Attachment