DATA Step, Macro, Functions and more

Summing variables across continuous rows of data

Accepted Solution Solved
Reply
Contributor
Posts: 28
Accepted Solution

Summing variables across continuous rows of data

I have a dataset like this:

PatientID

DayDrugTaken

1

1

1

2

1

3

2

1

2

3

2

4


I would like to know how many continuous days in a row each patient took a drug. I used the following code

data new;

set old;

retain firstday;

by PatientId;

if first.PatientId then firstday= DayDrugTaken;

if last.PatientId then lastday= DayDrugTaken;

duration=(lastday-firstday)+1;

run;

which created a table like this:

PatientID

DayDrugTaken

FirstDay

LastDay

Duration

1

1

1

.

.

1

2

1

.

.

1

3

1

3

3

2

1

1

2

3

1

2

4

1

4

4

For patient 1, I have what I want in the “Duration” variable, 3 days. But for patient 2, I would like to have the Duration=2 because he only took the drug on days 3 and 4 continuously. Does anyone have any advice about how to calculate this in SAS?

Thank you so much in advance.


Accepted Solutions
Solution
‎10-02-2014 06:15 PM
Trusted Advisor
Posts: 1,228

Re: Summing variables across continuous rows of data

data want;
set have;
by PatientID DayDrugTaken notsorted;
dif=DayDrugTaken-lag(DayDrugTaken);
if first.PatientID or dif>1 then grp+1;
run;

proc sql;
create table final as
select PatientID, max(day) as continuous_days from (
select PatientID,grp,count(DayDrugTaken) as day
from want
group by PatientID,grp)
group by PatientID;
quit;

View solution in original post


All Replies
Valued Guide
Posts: 2,177

Re: Summing variables across continuous rows of data

The DIF() function is useful here

Super User
Posts: 11,343

Re: Summing variables across continuous rows of data

What do you want for output if daydrugtake has values like: 1, 3,4, 6,7,8, 10,11

Contributor
Posts: 28

Re: Summing variables across continuous rows of data

In that case, I would like the duration=3 because the patient took drugs continuously for 3 days (6, 7, and 8)

Thanks.

Solution
‎10-02-2014 06:15 PM
Trusted Advisor
Posts: 1,228

Re: Summing variables across continuous rows of data

data want;
set have;
by PatientID DayDrugTaken notsorted;
dif=DayDrugTaken-lag(DayDrugTaken);
if first.PatientID or dif>1 then grp+1;
run;

proc sql;
create table final as
select PatientID, max(day) as continuous_days from (
select PatientID,grp,count(DayDrugTaken) as day
from want
group by PatientID,grp)
group by PatientID;
quit;

Contributor
Posts: 28

Re: Summing variables across continuous rows of data

Worked perfectly, I can't thank you enough!

Respected Advisor
Posts: 3,156

Re: Summing variables across continuous rows of data

Seems to me there is an one-step solution:

data want;

set have;

by PatientID DayDrugTaken ;

retain max_ct;

ct+(dif(DayDrugTaken)=1);

if first.PatientID or

dif(DayDrugTaken)>1 then ct=1;

max_ct=max(max_ct,ct);

if last.patientid then do; output;max_ct=1;end;

drop ct;

run;

Valued Guide
Posts: 2,177

Re: Summing variables across continuous rows of data

The approach can be reduced because the output is at ID level- then the DOW loop is effective

DATA persistency ;

   KEEP  patientID persistence ;

DO pd= 1 BY 1 UNTIL( last.patientID ) ;

      SET old ;

      BY patientID ;

      difd = DIF( dayDrugTaken) ;

      IF  pd EQ 1 OR difd NE 1 THEN ct= 1 ;

      ELSE ct=ct+1 ;

      persistence = MAX( persistence, ct );

END;

RUN ;

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 330 views
  • 0 likes
  • 5 in conversation