Hi all. Sorry about the length of the post, but I wanted to be comprehensive in terms of context.

I have data on exits from an employment program over the course of a year. In survival analysis terms, the 'failure' variable is leaving the program (which is a positive thing, as it means the individual has found employment).

There are two groups I wish to compare in terms of 'failures'. Some people are in Group A, which is a 'standard' program over the course of a year. Some people are in Group B, which is the 'standard' program for the first six months, and then at the six month mark Group B is subject to additional participation requirements.

It has been found before that when people are subject to additional participation requirements (Group B), they leave the program more quickly around the time the additional requirements are imposed, compared to people without the additional requirements (Group A). Individuals know when the additional requirements are coming up, so they might leave shortly before the additional requirements are imposed or shortly after.

Therefore, when comparing the groups, I would expect similar numbers of exits for the first 5 months or so, but then I would expect increased exits for a few weeks for Group B. In other words, I expect an interaction of time (measured in fortnights) and Group, and I expect that the hazard ratio will not be constant over time.

The two groups are also not random but quasi-experimental, so there are additional covariates (such as participant gender, age, local unemployment rate, etc.) that are included. I'm not directly interested in these variables; I just want to hold them constant between the groups.

The particular problems I'm having are the following:

1. Unlike PROC LIFETEST, PROC PHREG does not produce a hazard rate, and I can't find a way to make SAS do it. I'd like to know the percentage that leave each week, with the population at the beginning of the week as the base. I can calculate this 'manually', but the fact that SAS doesn't produce it makes me wonder whether it's an invalid or incoherent thing to ask for.

2. When I look at the survival graph and the other output, the expected 'spike' in exits (of Group B relative to Group A) does not seem to be reflected, whereas in the raw data the spike in exits for Group B around the six month mark is obvious. I thought that my specification of the interaction variable would be sufficient to allow the hazard ratio to vary at each fortnight, but the output seems to 'smooth' the hazard ratio over the graph, and both groups now appear to have a 'spike' instead of Group B specifically. I can get SAS to produce hazard ratios at each fortnight of the interaction, but again the hazard rate I calculate for each group does not seem to relate to the survival function produced by PROC PHREG.

3. I want to compare the survival/failure lines of the two groups, so I am using DIRADJ to adjust the covariates to equalize the groups. I do not want to use reference groups, as there are many class variables (more than shown here) and none of them are particularly 'representative' of the caseload. Is DIRADJ doing what I want by equalizing the GROUPS as if each contained the 'average' population of both groups?

This is the SAS code I am using for the PHREG (I have dropped some of the covariates for clarity of reading, and I've perturbed the data in any output shown). The data are in long format, one row per fortnight per participant until they leave the program if they leave within a year, and right-censored at a year (26 fortnights) if they do not leave the program. I have included indicative output.

ods graphics on;
proc phreg data=exits plots(overlay=stratum)=(survival);
class GROUP GENDER REMOTENESS fortnight /param=ref ref=first order=internal;
model (start,stop)*exit(0) = GROUP AGE GENDER REMOTENESS LAST_SCORE fortnight * GROUP / ties=efron alpha=0.01 rl;
baseline out=exit_out survival=_all_ /diradj group=GROUP;
hazardratio GROUP / at (fortnight=ALL) alpha=0.01 ;
run;
ods graphics off;

Hazard Ratios for GROUP

Description                                   Point Estimate
GROUP GROUP_B vs GROUP_A At fortnight=1       1.090
GROUP GROUP_B vs GROUP_A At fortnight=2       1.258
GROUP GROUP_B vs GROUP_A At fortnight=3       1.240
GROUP GROUP_B vs GROUP_A At fortnight=4       1.155
GROUP GROUP_B vs GROUP_A At fortnight=5       1.075
GROUP GROUP_B vs GROUP_A At fortnight=6       1.117
GROUP GROUP_B vs GROUP_A At fortnight=7       0.942
GROUP GROUP_B vs GROUP_A At fortnight=8       0.941
GROUP GROUP_B vs GROUP_A At fortnight=9       1.006
GROUP GROUP_B vs GROUP_A At fortnight=10      0.962
GROUP GROUP_B vs GROUP_A At fortnight=11      0.977
GROUP GROUP_B vs GROUP_A At fortnight=12      1.173
GROUP GROUP_B vs GROUP_A At fortnight=13      1.310
GROUP GROUP_B vs GROUP_A At fortnight=14      1.347
GROUP GROUP_B vs GROUP_A At fortnight=15      1.322
GROUP GROUP_B vs GROUP_A At fortnight=16      1.381
GROUP GROUP_B vs GROUP_A At fortnight=17      1.389
GROUP GROUP_B vs GROUP_A At fortnight=18      1.144
GROUP GROUP_B vs GROUP_A At fortnight=19      1.206
GROUP GROUP_B vs GROUP_A At fortnight=20      1.072
GROUP GROUP_B vs GROUP_A At fortnight=21      1.196
GROUP GROUP_B vs GROUP_A At fortnight=22      1.117
GROUP GROUP_B vs GROUP_A At fortnight=23      0.906
GROUP GROUP_B vs GROUP_A At fortnight=24      1.083
GROUP GROUP_B vs GROUP_A At fortnight=25      0.806
GROUP GROUP_B vs GROUP_A At fortnight=26      0.761
GROUP GROUP_B vs GROUP_A At fortnight=27      0.729

The graphs below are not the original data, but they illustrate the kind of thing happening to the original.

Below on the right is a graph of 'raw' exits showing the percentage leaving each fortnight, with the denominator being the number of people left at the beginning of the fortnight. The two lines are Group A and Group B.

Below on the left is a graph plotted by taking the fortnightly 'survival' rate from the PHREG output, turning it into a failure rate (1-survival), and then calculating a 'hazard' rate each fortnight for each group. You can see that the hazard ratio between the groups in this graph is essentially constant.

Is my interaction variable wrongly specified?
I want the hazard ratio between the groups to be free to vary at each fortnight.
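
For reference, the two 'manual' hazard calculations described above could be sketched roughly as follows. This is only a minimal sketch under assumptions, not the exact code behind the graphs: it assumes exit is coded 1 in the fortnight a participant leaves and 0 otherwise, and that the BASELINE output data set exit_out contains GROUP, the time variable stop, and a variable named Survival. The data set names raw_hazard, exit_sorted and model_hazard are made up for illustration.

```sas
/* Raw hazard: share of those still in the program at the start of  */
/* each fortnight who exit during it. With one row per participant  */
/* per fortnight at risk, this is the mean of exit (assumed 0/1).   */
proc means data=exits noprint nway;
  class GROUP fortnight;
  var exit;
  output out=raw_hazard mean=hazard n=at_risk sum=n_exits;
run;

/* Model-based hazard from the adjusted survival curves:            */
/* h(t) = 1 - S(t) / S(t-1), computed within each GROUP.            */
/* Variable names stop and Survival are assumptions.                */
proc sort data=exit_out out=exit_sorted;
  by GROUP stop;
run;

data model_hazard;
  set exit_sorted;
  by GROUP;
  prev_surv = lag(Survival);          /* survival at the previous time point */
  if first.GROUP then prev_surv = 1;  /* everyone is still present at the start */
  if prev_surv > 0 then hazard = 1 - Survival / prev_surv;
run;
```

Comparing hazard by GROUP and fortnight across raw_hazard and model_hazard is what the two graphs are meant to show.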