Solved
Contributor
Posts: 31

Summing variables across continuous rows of data

I have a dataset like this:

 PatientID DayDrugTaken 1 1 1 2 1 3 2 1 2 3 2 4

I would like to know how many continuous days in a row each patient took a drug. I used the following code

data new;

set old;

retain firstday;

by PatientId;

if first.PatientId then firstday= DayDrugTaken;

if last.PatientId then lastday= DayDrugTaken;

duration=(lastday-firstday)+1;

run;

which created a table like this:

 PatientID DayDrugTaken FirstDay LastDay Duration 1 1 1 . . 1 2 1 . . 1 3 1 3 3 2 1 1 2 3 1 2 4 1 4 4

For patient 1, I have what I want in the “Duration” variable, 3 days. But for patient 2, I would like to have the Duration=2 because he only took the drug on days 3 and 4 continuously. Does anyone have any advice about how to calculate this in SAS?

Thank you so much in advance.

Accepted Solutions
Solution
‎10-02-2014 06:15 PM
Posts: 1,270

Re: Summing variables across continuous rows of data

data want;
set have;
by PatientID DayDrugTaken notsorted;
dif=DayDrugTaken-lag(DayDrugTaken);
if first.PatientID or dif>1 then grp+1;
run;

proc sql;
create table final as
select PatientID, max(day) as continuous_days from (
select PatientID,grp,count(DayDrugTaken) as day
from want
group by PatientID,grp)
group by PatientID;
quit;

All Replies
Valued Guide
Posts: 2,191

Re: Summing variables across continuous rows of data

The DIF() function is useful here

Super User
Posts: 13,523

Re: Summing variables across continuous rows of data

What do you want for output if daydrugtake has values like: 1, 3,4, 6,7,8, 10,11

Contributor
Posts: 31

Re: Summing variables across continuous rows of data

In that case, I would like the duration=3 because the patient took drugs continuously for 3 days (6, 7, and 8)

Thanks.

Solution
‎10-02-2014 06:15 PM
Posts: 1,270

Re: Summing variables across continuous rows of data

data want;
set have;
by PatientID DayDrugTaken notsorted;
dif=DayDrugTaken-lag(DayDrugTaken);
if first.PatientID or dif>1 then grp+1;
run;

proc sql;
create table final as
select PatientID, max(day) as continuous_days from (
select PatientID,grp,count(DayDrugTaken) as day
from want
group by PatientID,grp)
group by PatientID;
quit;

Contributor
Posts: 31

Re: Summing variables across continuous rows of data

Worked perfectly, I can't thank you enough!

Posts: 3,167

Re: Summing variables across continuous rows of data

Seems to me there is an one-step solution:

data want;

set have;

by PatientID DayDrugTaken ;

 retain max_ct; ct+(dif(DayDrugTaken)=1);

if first.PatientID or

dif(DayDrugTaken)>1 then ct=1;

max_ct=max(max_ct,ct);

if last.patientid then do; output;max_ct=1;end;

drop ct;

run;

Valued Guide
Posts: 2,191

Re: Summing variables across continuous rows of data

The approach can be reduced because the output is at ID level- then the DOW loop is effective

DATA persistency ;

KEEP  patientID persistence ;

DO pd= 1 BY 1 UNTIL( last.patientID ) ;

SET old ;

BY patientID ;

difd = DIF( dayDrugTaken) ;

IF  pd EQ 1 OR difd NE 1 THEN ct= 1 ;

ELSE ct=ct+1 ;

persistence = MAX( persistence, ct );

END;

RUN ;

🔒 This topic is solved and locked.

Discussion stats
• 7 replies
• 376 views
• 0 likes
• 5 in conversation