DATA Step, Macro, Functions and more

Counting observations within a by group based on specific criteria

Accepted Solution Solved
Reply
New Contributor Gib
New Contributor
Posts: 4
Accepted Solution

Counting observations within a by group based on specific criteria

Hi,

 

I have the following dataset, where the main interests are ID (refers to patient ID number) and date_start (refers to start day of treatment):

 

Row

id

atc

date_start

date_stop

1

1

A

2010-07-02

2011-08-17

2

1

B

2011-08-17

2011-09-09

3

1

C

2011-09-09

2011-11-10

4

1

D

2011-11-14

2012-02-08

5

2

E

2007-03-17

2009-07-01

6

2

F

2007-06-29

2009-11-12

7

2

A

2007-07-17

2010-06-22

8

2

B

2007-08-14

2010-11-04

9

2

A

2007-09-04

2011-02-17

10

2

D

2008-02-18

2011-06-20

11

2

F

2008-03-21

2011-08-19

12

2

A

2008-04-15

2011-09-28

13

2

B

2008-04-29

2011-10-18

14

2

K

2008-05-15

2011-12-01

 

I would like to create a new numeric variable called LINE (corresponds to the line of treatment). The LINE should be counted within the group of ID, but the number of lines are only counted if the date_start is higher than the first date_start + 60 within the LINE variable.

 

This is the desired output:

 

To exemplify what I would like to achieve please note: in row number 10-12 the line number is 4 because both the row 11 and 12 is less then date_start+60 of the row 10. But in row 13 the line is 5 because its date_start is more than date_start+60 of the row 10.

 

Row

id

atc

date_start

date_stop

Line

1

1

A

2010-07-02

2011-08-17

1

2

1

B

2011-08-17

2011-09-09

2

3

1

C

2011-09-09

2011-11-10

2

4

1

D

2011-11-14

2012-02-08

3

5

2

E

2007-03-17

2009-07-01

1

6

2

F

2007-06-29

2009-11-12

2

7

2

A

2007-07-17

2010-06-22

2

8

2

B

2007-08-14

2010-11-04

2

9

2

A

2007-09-04

2011-02-17

3

10

2

D

2008-02-18

2011-06-20

4

11

2

F

2008-03-21

2011-08-19

4

12

2

A

2008-04-15

2011-09-28

4

13

2

B

2008-04-29

2011-10-18

5

14

2

K

2008-05-15

2011-12-01

5

 

Do you have any suggestion how this can be done?

 

SAS 9.3

 

Thanks,
Gib


Accepted Solutions
Solution
‎02-08-2018 11:38 AM
Super User
Posts: 6,628

Re: Counting observations within a by group based on specific criteria

A similar approach, but with a slightly different interpretation of the requirements:

 

data want;

set have;

by id;

if first.id then do;

   line=1;

   baseline_date = date_start;

end;

else if date_start - baseline_date > 60 then do;

   line + 1;

   baseline_date = date_start;

end;

retain baseline_date;

drop baseline_date;

run;

View solution in original post


All Replies
Super User
Posts: 13,295

Re: Counting observations within a by group based on specific criteria

I have to guess by +60 you mean 60 days. since 60 could be weeks, months, years, ...

/* assumes your data set is sorted by Id and date_start
   AND that date_start is a SAS date value and not character
*/
data want;
   set have;
   by id;
   retain line .;
   lds = lag(date_start);
   if first.id then line=1;
   else if start_date > (lds+60) then line+1;
drop lds; run;

If your dates aren't numeric SAS date values then you need to create some.

 

New Contributor Gib
New Contributor
Posts: 4

Re: Counting observations within a by group based on specific criteria

[ Edited ]

Hi,

Thanks for your reply. Yes, you assumed correctly, I meant 60 days by 60.

 

The suggested code does not work unfortunately, because it only takes into account the previous observation with the lag function. The code gives a value of 4 of LINE variable in the row 13, when it should be 5 because when comparing the date_start + 60 the reference should be the first date_start of the LINE variable.

 

Do you have an other suggestion maybe?

 

Row

id

atc

date_start

date_stop

Line

1

1

A

2010-07-02

2011-08-17

1

2

1

B

2011-08-17

2011-09-09

2

3

1

C

2011-09-09

2011-11-10

2

4

1

D

2011-11-14

2012-02-08

3

5

2

E

2007-03-17

2009-07-01

1

6

2

F

2007-06-29

2009-11-12

2

7

2

A

2007-07-17

2010-06-22

2

8

2

B

2007-08-14

2010-11-04

2

9

2

A

2007-09-04

2011-02-17

3

10

2

D

2008-02-18

2011-06-20

4

11

2

F

2008-03-21

2011-08-19

4

12

2

A

2008-04-15

2011-09-28

4

13

2

B

2008-04-29

2011-10-18

5

14

2

K

2008-05-15

2011-12-01

5

 

Thanks,
Gib

Solution
‎02-08-2018 11:38 AM
Super User
Posts: 6,628

Re: Counting observations within a by group based on specific criteria

A similar approach, but with a slightly different interpretation of the requirements:

 

data want;

set have;

by id;

if first.id then do;

   line=1;

   baseline_date = date_start;

end;

else if date_start - baseline_date > 60 then do;

   line + 1;

   baseline_date = date_start;

end;

retain baseline_date;

drop baseline_date;

run;

New Contributor Gib
New Contributor
Posts: 4

Re: Counting observations within a by group based on specific criteria

Posted in reply to Astounding

To be honest I do not understand the logic behind the code but it works as desired, thank you!


Super User
Posts: 6,628

Re: Counting observations within a by group based on specific criteria

[ Edited ]

All of the techniques and tools used here are frequently used, and well worth learning if you don't know them:

 

The effect of a BY statement within a DATA step

RETAIN statement

DROP statement

How SAS stores dates

Sum statement (such as line + 1; )

Contributor VDD
Contributor
Posts: 22

Re: Counting observations within a by group based on specific criteria

Posted in reply to Astounding
these are the first steps in any programming language.
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 198 views
  • 3 likes
  • 4 in conversation