Lefty
Obsidian | Level 7

Hi, I am trying to create a time-varying covariate that indicates whether someone was taking drug 1 alone (drugcat=0), drug 2 alone (drugcat=1), or drug 1 + drug 2 together (drugcat=2). I want to run a Cox proportional hazards regression (proc phreg) that accounts for the fact that the drugs people take change over time. Here is a simplified version of the data I have:

 

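| ID | Drug1StartDate1 | Drug1EndDate1 | Drug1StartDate2 | Drug1EndDate2 | Drug2StartDate1 | Drug2EndDate1 | Drug2StartDate2 | Drug2EndDate2 | age | sex | censor |
|----|-----------------|---------------|-----------------|---------------|-----------------|---------------|-----------------|---------------|-----|-----|--------|
| 1 | June 1 2011 | July 1 2011 | | | | | | | 50 | 0 | 0 |
| 2 | Jan 1 2012 | Dec 31 2012 | | | June 1 2012 | Dec 31 2012 | | | 40 | 1 | 1 |
| 3 | Jan 1 2010 | June 15 2010 | July 1 2010 | Dec 1 2010 | Feb 1 2010 | June 30 2010 | Sept 15 2010 | Dec 15 2010 | 60 | 1 | 0 |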

 

I want the start and stop times, in days, for each period a person spends in a drug category. It’s not so hard when someone falls into only one category (ID=1) or two categories (ID=2), but it gets really tricky when people move between categories. For example, ID=3 is on only drug 1 from Jan 1 to Jan 31, then on both drugs from Feb 1 to June 15, then on only drug 2 from June 16 to June 30, then back on only drug 1 from July 1 to Sept 14, then on both drugs from Sept 15 to Dec 1, then on only drug 2 from Dec 2 to Dec 15. Some people have up to 20 start/stop dates, so I can't really muscle through each possible variation.
Here is what I'd like my data to look like:

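| ID | Start | Stop | Drugcat | age | sex | censor |
|----|-------|------|---------|-----|-----|--------|
| 1 | 0 | 30 | 0 | 50 | 0 | 0 |
| 2 | 0 | 152 | 0 | 40 | 1 | 0 |
| 2 | 152 | 365 | 2 | 40 | 1 | 1 |
| 3 | 0 | 31 | 0 | 60 | 1 | 0 |
| 3 | 31 | 167 | 2 | 60 | 1 | 0 |
| 3 | 167 | 182 | 1 | 60 | 1 | 0 |
| 3 | 182 | 257 | 0 | 60 | 1 | 0 |
| 3 | 257 | 334 | 2 | 60 | 1 | 0 |
| 3 | 334 | 365 | 1 | 60 | 1 | 0 |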

Ultimately I’ll use this code to run my regression (where drugcat=0 means drug 1 alone, drugcat=1 means drug 2 alone, and drugcat=2 means both drugs at the same time):

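proc phreg data=want;
  class drugcat;  /* drug category as a categorical covariate */
  /* counting-process syntax: at risk on (start, stop]; censor=0 marks censored rows */
  model (start, stop)*censor(0) = drugcat age sex / rl;
run;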

I’ve tried using arrays to get the data I want but it quickly falls apart when someone switches between the drug categories more than once.

Can anyone tell me a better way to get to my goal? Thanks in advance!

1 REPLY
quickbluefish
Obsidian | Level 7

Hi - sounds like you need a "counting process" dataset as input for PHREG.  Are you still looking for a way to create this?  I have a macro for it - let me know.
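
The macro mentioned above is not shown in this thread. As a rough sketch of one way to build a counting-process dataset from the wide layout in the question (this is not that macro), a single DATA step can walk each person's follow-up one day at a time and write a row whenever the drug category changes. Several things here are assumptions rather than anything stated in the thread: the input dataset is called have, the date columns are numeric SAS dates, each exposure is treated as the half-open interval [start, end) so that ID=1 comes out as 0 to 30, time zero is the person's first start date, and days on neither drug produce no row. The names final_censor, index_date, last_date, prev_cat and cat are made up for illustration.

```sas
/* Sketch only: HAVE is a hypothetical one-row-per-ID dataset with numeric SAS dates. */
data want (keep=ID start stop drugcat age sex censor);
  set have (rename=(censor=final_censor));

  /* Widen these ranges to match the real number of start/end pairs (e.g. 1-20). */
  array s1{*} Drug1StartDate1-Drug1StartDate2;
  array e1{*} Drug1EndDate1-Drug1EndDate2;
  array s2{*} Drug2StartDate1-Drug2StartDate2;
  array e2{*} Drug2EndDate1-Drug2EndDate2;

  index_date = min(of s1{*}, of s2{*});  /* assumed time zero: first start of either drug */
  last_date  = max(of e1{*}, of e2{*});  /* assumed end of follow-up: last end date */

  prev_cat = .;
  do day = index_date to last_date - 1;
    /* Is each drug active on this day? Exposure is assumed to cover [start, end). */
    on1 = 0;
    do i = 1 to dim(s1);
      if n(s1{i}, e1{i}) = 2 and s1{i} <= day < e1{i} then on1 = 1;
    end;
    on2 = 0;
    do i = 1 to dim(s2);
      if n(s2{i}, e2{i}) = 2 and s2{i} <= day < e2{i} then on2 = 1;
    end;

    if on1 and on2 then cat = 2;  /* both drugs */
    else if on2 then cat = 1;     /* drug 2 alone */
    else if on1 then cat = 0;     /* drug 1 alone */
    else cat = .;                 /* on neither drug: no row is written */

    if cat ne prev_cat then do;
      /* Category changed: close the interval that just ended, if there was one. */
      if prev_cat ne . then do;
        stop    = day - index_date;
        drugcat = prev_cat;
        censor  = 0;              /* no event on intermediate rows */
        output;
      end;
      start    = day - index_date;
      prev_cat = cat;
    end;
  end;

  /* Close the last interval; only this row carries the person's own event flag. */
  if prev_cat ne . then do;
    stop    = last_date - index_date;
    drugcat = prev_cat;
    censor  = final_censor;
    output;
  end;
run;
```

Under those assumptions, ID=1 should come out as one row (0 to 30) and ID=2 as two rows (0 to 152 with censor=0, then 152 to 365 with censor=1), which matches the requested layout, and the resulting want dataset can go straight into the proc phreg step in the question. If end dates should count as days on drug, change the comparisons to day <= end date and run the loop through last_date; whether untreated gaps should be dropped or given their own category is an analysis decision this sketch does not settle.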
