BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kels123
Quartz | Level 8

Hello,

 

I have a database where subjects have repeated measurements. These measurements are taken during specific time periods delineated by StartDate and EndDate1. Sample code below:

 

data tempfile;

infile datalines truncover;

INPUT

StudyID$2.@+1 count StartDate mmddyy10.@+1 EndDate1 mmddyy10.@+1;

format Startdate EndDate1 mmddyy10.;

datalines;

01 1 02/01/2007 05/01/2008 

02 1 06/02/2012 05/24/2013 

03 1 05/07/2010 08/19/2010 

04 1 06/16/2007 06/21/2008 

05 1 04/28/2007 11/07/2008 

06 1 03/16/2007 06/20/2007 

07 1 04/18/2007 04/19/2007 

08 1 05/11/2007 08/03/2007 

09 1 08/01/2007 08/28/2007 

09 3 02/21/2008 03/05/2008 

10 1 02/06/2008 03/12/2010 

11 1 01/31/2007 12/26/2008 

11 3 02/11/2009 05/08/2009 

11 5 05/19/2009 08/14/2009 

11 7 09/22/2009 12/18/2009 

11 9 01/09/2010 02/03/2010 

11 11 02/23/2010 11/19/2010 

11 13 12/01/2010 03/18/2011 

11 15 05/10/2011 04/23/2013 

11 17 08/03/2013 09/27/2013 

;

run;

proc print data=tempfile; run;

 

Please note that the count variable is left over from a previous data manipulation, which is why there are only odd values instead of sequential numbers (1,3,5,... instead of 1,2,3,...).

 

I want to further group these measurements into 12-month time periods that start on EndDate1; basically I am interested in following individuals for 12 months after EndDate1. The issue is that I am not sure how to write a code that will allow me to reorganize the data this way. I was thinking about trying to create something like this:

 

StudyID

count

StartDate

EndDate1

Date_12mopost

Period

01

1

02/01/2007

05/01/2008

2009-05-01

1

02

1

06/02/2012

05/24/2013

2014-05-24

1

03

1

05/07/2010

08/19/2010

2011-08-19

1

04

1

06/16/2007

06/21/2008

2009-06-21

1

05

1

04/28/2007

11/07/2008

2009-11-07

1

06

1

03/16/2007

06/20/2007

2008-06-19

1

07

1

04/18/2007

04/19/2007

2008-04-18

1

08

1

05/11/2007

08/03/2007

2008-08-02

1

09

1

08/01/2007

08/28/2007

2008-08-27

1

09

3

02/21/2008

03/05/2008

2008-08-27

1

10

1

02/06/2008

03/12/2010

2011-03-12

1

11

1

01/31/2007

12/26/2008

2009-12-26

1

11

3

02/11/2009

05/08/2009

2009-12-26

1

11

5

05/19/2009

08/14/2009

2009-12-26

1

11

7

09/22/2009

12/18/2009

2009-12-26

1

11

9

01/09/2010

02/03/2010

2011-02-03

2 (starting over)

11

11

02/23/2010

11/19/2010

2011-02-03

2

11

13

12/01/2010

03/18/2011

2012-03-18

3 (starting over)

11

15

05/10/2011

04/23/2013

2014-04-23

4 (starting over)

11

17

08/03/2013

09/27/2013

2014-04-23

4

 

This will allow me to do what I need to do with the data to eventually be able to analyze it (for example, eventually transposing the data into the appropriate one-year time period, or selecting the first row of the one-year time period). So I guess my questions might be:

  • How do I assign a date 12 months after the EndDate1 IN THE FIRST ROW FOR EACH SUBJECT (e.g. Subject 09)?
  • How about later on, for repeat measurements, if the EndDate1 takes place AFTER the 12-month mark, and I want to start the clock over again with a new 12 month period (e.g. Subject 11)?
  • Can a create a new “Period” variable to reflect these dates, to make it easier to keep the 12-month time periods organized?

 

So far, I have only managed to create a semi-accurate date variable representing 12 months after EndDate1, using the following code:

 

data tempfile;

  set tempfile;

  by studyID count;

  retain Date12mopost;

  if first.studyID then do;

     Date_12mopost=(EndDate1+365);

  end;

  else do;

     Date_12mopost=Date_12mopost;

  end;

  format Date_12mopost yymmdd10. ;

run;

proc print data = tempfile; run;

 

StudyID

count

StartDate

EndDate1

Date_12mopost

01

1

02/01/2007

05/01/2008

2009-05-01

02

1

06/02/2012

05/24/2013

2014-05-24

03

1

05/07/2010

08/19/2010

2011-08-19

04

1

06/16/2007

06/21/2008

2009-06-21

05

1

04/28/2007

11/07/2008

2009-11-07

06

1

03/16/2007

06/20/2007

2008-06-19

07

1

04/18/2007

04/19/2007

2008-04-18

08

1

05/11/2007

08/03/2007

2008-08-02

09

1

08/01/2007

08/28/2007

2008-08-27

09

3

02/21/2008

03/05/2008

2009-03-05

10

1

02/06/2008

03/12/2010

2011-03-12

11

1

01/31/2007

12/26/2008

2009-12-26

11

3

02/11/2009

05/08/2009

2010-05-08

11

5

05/19/2009

08/14/2009

2010-08-14

11

7

09/22/2009

12/18/2009

2010-12-18

11

9

01/09/2010

02/03/2010

2011-02-03

11

11

02/23/2010

11/19/2010

2011-11-19

11

13

12/01/2010

03/18/2011

2012-03-17

11

15

05/10/2011

04/23/2013

2014-04-23

11

17

08/03/2013

09/27/2013

2014-09-27

 

The Date_12mopost is accurate for each row, but doesn’t help me try to group the periods into the a 12-month time period based on the dates from the preceding row.

 

I recognize this is a multi-step question that is likely extremely complicated. Any guidance, particularly regarding the specifics of the code, would be greatly appreciated.

 

Thank you very much in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

data want;

set tempfile;

by   StudyID;

retain date_12mopost period;

if first.StudyID and last.studyid then do;

date_12mopost= intnx('year', enddate1, 1, 's');

period=1;

end;

else if first.StudyID and not last.studyid then do;

date_12mopost=intnx('year', enddate1, 1, 's');

period=1;

end;

else if not first.studyid then do;

if   enddate1>date_12mopost then do;

date_12mopost=intnx('year', enddate1, 1, 's');

period+1;

end;

end;

format     date_12mopost mmddyy10.;

run;

 

Regards,

Naveen Srinivasan

View solution in original post

3 REPLIES 3
Astounding
PROC Star

Believe it or not, you're not that far away from working code.  First, review this section:

 

  if first.studyID then do;

     Date_12mopost=(EndDate1+365);

  end;

 

You could consider that formula OK, or you could experiment with:

 

date_12mopost = intnx('month', EndDate1, 12, 'same');

 

Second, add to that section:

 

  if first.studyID then do;

     Date_12mopost=(EndDate1+365);

     period=1;

  end;

 

Both variables are going to be retained, so there is no need for this code (remove it):

 

  else do;

     Date_12mopost=Date_12mopost;

  end;

 

Finally, to increment period (and retain it at the same time):

 

if StartDate > Date12mo_post then do;

   period + 1;  /* which will also retain PERIOD */

   Date12mo_post = EndDate1 + 365;  /* or switch formulas */

end;

novinosrin
Tourmaline | Level 20

data want;

set tempfile;

by   StudyID;

retain date_12mopost period;

if first.StudyID and last.studyid then do;

date_12mopost= intnx('year', enddate1, 1, 's');

period=1;

end;

else if first.StudyID and not last.studyid then do;

date_12mopost=intnx('year', enddate1, 1, 's');

period=1;

end;

else if not first.studyid then do;

if   enddate1>date_12mopost then do;

date_12mopost=intnx('year', enddate1, 1, 's');

period+1;

end;

end;

format     date_12mopost mmddyy10.;

run;

 

Regards,

Naveen Srinivasan

thomp7050
Pyrite | Level 9

I believe that this is what you are looking for, if I understand it correctly:

 

DATA test_data1; 
informat startdate mmddyy10. enddate mmddyy10. ;
input studyid count startdate $ enddate $;  
format startdate enddate mmddyy10.; 
datalines;
1	1	2/1/2007	5/1/2008
2	1	6/2/2012	5/24/2013
3	1	5/7/2010	8/19/2010
4	1	6/16/2007	6/21/2008
5	1	4/28/2007	11/7/2008
6	1	3/16/2007	6/20/2007
7	1	4/18/2007	4/19/2007
8	1	5/11/2007	8/3/2007
9	1	8/1/2007	8/28/2007
9	3	2/21/2008	3/5/2008
10	1	2/6/2008	3/12/2010
11	1	1/31/2007	12/26/2008
11	3	2/11/2009	5/8/2009
11	5	5/19/2009	8/14/2009
11	7	9/22/2009	12/18/2009
11	9	1/9/2010	2/3/2010
11	11	2/23/2010	11/19/2010
11	13	12/1/2010	3/18/2011
11	15	5/10/2011	4/23/2013
11	17	8/3/2013	9/27/2013
;
run;

DATA TEST_DATA1;
SET TEST_DATA1;
BY STUDYID;
FORMAT NEWMONTH MMDDYY10.;
RETAIN NEWMONTH;
IF FIRST.STUDYID THEN NEWMONTH = ENDDATE+365;
IF STARTDATE > NEWMONTH THEN DO; 
NEWMONTH = ENDDATE + 365;
END;
RUN;

 

A gentle warning: not all years have 365 days.

 

Patrick

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 914 views
  • 0 likes
  • 4 in conversation