BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kels123
Quartz | Level 8

I have a database where subjects have repeated measurements. These measurements are taken during specific time periods delineated by StartDate and EndDate1. DAYSBWN1 is the length of time of each period (EndDate1 – StartDate in each row). Sample code below:

 

data tempfile;

infile datalines truncover;

INPUT

StudyID$2.@+1 StartDate mmddyy10.@+1 EndDate1 mmddyy10.@+1 daysbwn1 ; 

format Startdate EndDate1 mmddyy10.;

datalines;

01 02/01/2007 05/01/2008 455 

02 06/02/2012 05/24/2013 356 

03 05/07/2010 08/19/2010 104 

04 06/16/2007 06/21/2008 371 

05 04/28/2007 11/07/2008 559 

06 03/16/2007 06/20/2007 96 

07 04/18/2007 04/19/2007 1 

08 05/11/2007 08/03/2007 84 

09 08/01/2007 08/28/2007 27 

09 02/21/2008 03/05/2008 13 

10 02/06/2008 03/12/2010 765 

11 01/31/2007 12/26/2008 695 

11 02/11/2009 05/08/2009 86 

11 05/19/2009 08/14/2009 87 

11 09/22/2009 12/18/2009 87 

11 01/09/2010 02/03/2010 25 

11 02/23/2010 11/19/2010 269 

11 12/01/2010 03/18/2011 107 

11 05/10/2011 04/23/2013 714 

11 08/03/2013 09/27/2013 55 

;

run;

proc print data=tempfile; run;

 

 

I would like to write a code that creates two new variables:

DAYSBWN_CUM: The total number of time for all follow-up periods aka repeat rows (DAYSBWN1 added up for any rows that follow).

REPEAT_NUM: The total number of times the subject has a repeat row (REPEAT_NUM is > 1 only if the subject has more than one row).

 

The goal would be to get something that looks like this:

 

StudyID

StartDate

EndDate1

daysbwn1

DAYSBWN_CUM

REPEAT_NUM

01

02/01/2007

05/01/2008

455

0

0

02

06/02/2012

05/24/2013

356

0

0

03

05/07/2010

08/19/2010

104

0

0

04

06/16/2007

06/21/2008

371

0

0

05

04/28/2007

11/07/2008

559

0

0

06

03/16/2007

06/20/2007

96

0

0

07

04/18/2007

04/19/2007

1

0

0

08

05/11/2007

08/03/2007

84

0

0

09

08/01/2007

08/28/2007

27

13

1

09

02/21/2008

03/05/2008

13

0

0

10

02/06/2008

03/12/2010

765

0

0

11

01/31/2007

12/26/2008

695

1430

8

11

02/11/2009

05/08/2009

86

1344

7

11

05/19/2009

08/14/2009

87

1257

6

11

09/22/2009

12/18/2009

87

1170

5

11

01/09/2010

02/03/2010

25

1145

4

11

02/23/2010

11/19/2010

269

876

3

11

12/01/2010

03/18/2011

107

769

2

11

05/10/2011

04/23/2013

714

55

1

11

08/03/2013

09/27/2013

55

0

0

 

If the coding would be significantly easier, I could work with something like this instead, because I could just subtract certain values from the rows to get my desired final product:

 

StudyID

StartDate

EndDate1

daysbwn1

DAYSBWN_CUM

REPEAT_NUM

01

02/01/2007

05/01/2008

455

455

1

02

06/02/2012

05/24/2013

356

356

1

03

05/07/2010

08/19/2010

104

104

1

04

06/16/2007

06/21/2008

371

371

1

05

04/28/2007

11/07/2008

559

559

1

06

03/16/2007

06/20/2007

96

96

1

07

04/18/2007

04/19/2007

1

1

1

08

05/11/2007

08/03/2007

84

84

1

09

08/01/2007

08/28/2007

27

40

2

09

02/21/2008

03/05/2008

13

13

1

10

02/06/2008

03/12/2010

765

765

1

11

01/31/2007

12/26/2008

695

2125

9

11

02/11/2009

05/08/2009

86

1430

8

11

05/19/2009

08/14/2009

87

1344

7

11

09/22/2009

12/18/2009

87

1257

6

11

01/09/2010

02/03/2010

25

1170

5

11

02/23/2010

11/19/2010

269

1145

4

11

12/01/2010

03/18/2011

107

876

3

11

05/10/2011

04/23/2013

714

769

2

11

08/03/2013

09/27/2013

55

55

1

 

Any assistance would be greatly appreciated. Thank you in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

One approach:

 

data want;

daysbwn_cum=0;

repeat_num=0;

do until (last.id);

   set have;

   by id;

   daysbwn_cum + daysbwn1;

   repeat_num + 1;

end;

do until (last.id);

   set have;

   by id;

   daysbwn_cum = daysbwn_cum - daysbwn1;

   repeat_num = repeat_num - 1;

   output;

end;

run;

 

The top loop gets the totals, then the bottom loop removes the current values and outputs the results.

View solution in original post

2 REPLIES 2
Astounding
PROC Star

One approach:

 

data want;

daysbwn_cum=0;

repeat_num=0;

do until (last.id);

   set have;

   by id;

   daysbwn_cum + daysbwn1;

   repeat_num + 1;

end;

do until (last.id);

   set have;

   by id;

   daysbwn_cum = daysbwn_cum - daysbwn1;

   repeat_num = repeat_num - 1;

   output;

end;

run;

 

The top loop gets the totals, then the bottom loop removes the current values and outputs the results.

art297
Opal | Level 21
proc sort data=tempfile;
  by StudyID descending StartDate;
run;

data tempfile;
  set tempfile;
  by StudyID;
  if first.StudyID then do;
    DAYSBWN_CUM=0;
    REPEAT_NUM=0;
    output;
    DAYSBWN_CUM=daysbwn1;
  end;
  else do;
    REPEAT_NUM+1;
    output;
    DAYSBWN_CUM+daysbwn1;
  end;
run;
proc sort data=tempfile;
  by StudyID StartDate;
run;

Art, CEO, AnalystFinder.com

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 2455 views
  • 1 like
  • 3 in conversation