Hi, I am trying to create a time-varying covariate that indicates whether someone was taking drug 1 alone (drugcat=0), drug 2 alone (drugcat=1), or drug 1 + drug 2 together (drugcat=2). I want to run a Cox proportional hazards regression (proc phreg) that accounts for the fact that the drugs people are taking change over time. Here is a simplified version of the data I have:
| ID | Drug1StartDate1 | Drug1EndDate1 | Drug1StartDate2 | Drug1EndDate2 | Drug2StartDate1 | Drug2EndDate1 | Drug2StartDate2 | Drug2EndDate2 | age | sex | censor |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | June 1 2011 | July 1 2011 | | | | | | | 50 | 0 | 0 |
| 2 | Jan 1 2012 | Dec 31 2012 | | | June 1 2012 | Dec 31 2012 | | | 40 | 1 | 1 |
| 3 | Jan 1 2010 | June 15 2010 | July 1 2010 | Dec 1 2010 | Feb 1 2010 | June 30 2010 | Sept 15 2010 | Dec 15 2010 | 60 | 1 | 0 |
I want the start and stop times, in days, for each period a person spends in each drug category. It's not so hard when someone falls into only one category (ID=1) or two categories (ID=2), but it gets really tricky when people move between the categories. For example, ID=3 is on only drug 1 from Jan 1 to Jan 31, then on both drugs from Feb 1 to June 15, then on only drug 2 from June 16 to June 30, then back on only drug 1 from July 1 to Sept 14, then on both drugs from Sept 15 to Dec 1, then on only drug 2 from Dec 2 to Dec 15. Some people have up to 20 start/stop dates, so I can't really muscle through each possible variation.
Here is what I'd like my data to look like:
| ID | Start | Stop | Drugcat | age | sex | censor |
|---|---|---|---|---|---|---|
| 1 | 0 | 30 | 0 | 50 | 0 | 0 |
| 2 | 0 | 152 | 0 | 40 | 1 | 0 |
| 2 | 152 | 365 | 2 | 40 | 1 | 1 |
| 3 | 0 | 31 | 0 | 60 | 1 | 0 |
| 3 | 31 | 167 | 2 | 60 | 1 | 0 |
| 3 | 167 | 182 | 1 | 60 | 1 | 0 |
| 3 | 182 | 257 | 0 | 60 | 1 | 0 |
| 3 | 257 | 334 | 2 | 60 | 1 | 0 |
| 3 | 334 | 365 | 1 | 60 | 1 | 0 |
Ultimately I'll use this code to run my regression (where drugcat=0 means drug 1 alone, drugcat=1 means drug 2 alone, and drugcat=2 means both drugs at the same time):
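/* Counting-process Cox model: each row is one (start, stop] interval */
proc phreg data=want;
  class drugcat;                                        /* drug category is categorical */
  model (start, stop)*censor(0) = drugcat age sex / rl; /* censor=0 is censored; rl adds hazard ratio confidence limits */
run;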
I’ve tried using arrays to get the data I want but it quickly falls apart when someone switches between the drug categories more than once.
Can anyone tell me a better way to get to my goal? Thanks in advance!
Hi - sounds like you need a "counting process" dataset as input for PHREG. Are you still looking for a way to create this? I have a macro for it - let me know.
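In case it is useful in the meantime, here is a minimal sketch (not the macro mentioned above) of one way to build such a counting-process dataset in a single DATA step. It makes several assumptions that are not stated in the question: the dates are stored as SAS date values in a dataset called `have`, time 0 is each person's first start date, a change of category on a given date puts the interval boundary at that date minus the origin, follow-up ends on the last end date, and any days on neither drug are simply left out of the output. The names `origin`, `last_day`, `dt`, `state`, `cur_state`, `seg_start` and `final_censor` are made up for illustration. It scans each person's follow-up one day at a time, which is simple rather than fast, and the prefix-based arrays pick up however many start/stop pairs exist.

```sas
/* Hypothetical input mirroring the three example rows in the question */
data have;
  infile datalines dsd truncover;
  input ID
        (Drug1StartDate1 Drug1EndDate1 Drug1StartDate2 Drug1EndDate2
         Drug2StartDate1 Drug2EndDate1 Drug2StartDate2 Drug2EndDate2) (:date9.)
        age sex censor;
  format Drug: date9.;
  datalines;
1,01JUN2011,01JUL2011,,,,,,,50,0,0
2,01JAN2012,31DEC2012,,,01JUN2012,31DEC2012,,,40,1,1
3,01JAN2010,15JUN2010,01JUL2010,01DEC2010,01FEB2010,30JUN2010,15SEP2010,15DEC2010,60,1,0
;
run;

data want (keep=ID start stop drugcat age sex censor);
  set have (rename=(censor=final_censor));

  /* Name-prefix lists pick up every start/end pair, however many exist */
  array s1[*] Drug1StartDate:;
  array e1[*] Drug1EndDate:;
  array s2[*] Drug2StartDate:;
  array e2[*] Drug2EndDate:;

  origin   = min(of s1[*], of s2[*]);  /* assumed time 0: first day on either drug */
  last_day = max(of e1[*], of e2[*]);  /* assumed end of follow-up: last end date  */
  if missing(origin) or missing(last_day) then delete;

  /* Scan one day past the end so the final interval gets closed */
  do dt = origin to last_day + 1;
    on1 = 0;
    on2 = 0;
    do i = 1 to dim(s1);
      if n(s1[i], e1[i]) = 2 and s1[i] <= dt <= e1[i] then on1 = 1;
    end;
    do i = 1 to dim(s2);
      if n(s2[i], e2[i]) = 2 and s2[i] <= dt <= e2[i] then on2 = 1;
    end;

    /* 0 = drug 1 only, 1 = drug 2 only, 2 = both, missing = neither */
    if on1 and on2 then state = 2;
    else if on2 then state = 1;
    else if on1 then state = 0;
    else state = .;

    if dt = origin then do;
      seg_start = dt;
      cur_state = state;
    end;
    else if state ne cur_state then do;
      /* Category changed: write out the interval that just ended */
      if not missing(cur_state) then do;
        start   = seg_start - origin;
        stop    = min(dt, last_day) - origin;
        drugcat = cur_state;
        /* Only the last interval carries the person's event indicator */
        censor  = ifn(dt > last_day, final_censor, 0);
        /* Zero-length intervals are skipped; check these if they hold an event */
        if stop > start then output;
      end;
      seg_start = dt;
      cur_state = state;
    end;
  end;
run;
```

The resulting `want` dataset has one row per person per uninterrupted spell in a drug category, which is the shape the `model (start, stop)*censor(0)` syntax expects. Because the boundary convention here is an assumption, some start/stop values may differ by a day or so from a hand-worked table, so it is worth checking a few IDs with `proc print` and adjusting the convention (for example, whether end dates count as inclusive) before fitting the model.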