A longitudinal clinical trial was conducted to examine the effectiveness of an experimental treatment in preventing disease progression. Subjects identified at an early stage of the disease were entered into the study and randomized to receive either standard treatment (the control group) or the experimental treatment. Subjects are scheduled for regular follow-up visits at roughly 6-month intervals. The rate of disease progression is of primary interest.

Variables in the file are:
1. Patient id (1-50)
2. Treatment group (0 = control group, and 1 = experimental group)
3. Visit number, numbered consecutively for each subject
4. Time since last visit (months, with 0 at the first visit)
5. Stage of disease (0 = early stage, and 1 = after disease progression)

All subjects are in the early stage of disease at the first visit. Subjects remain in the study after disease progression, so there may be several records from a study patient after disease progression. The data are sorted by study visit, and then by patient id within study visit. There are no missing data.

1. Keeping the data in a "one row for each visit" format (with multiple observations per patient), create a variable giving the number of months between that subject's entry into the study and that study visit. (This will give the total number of months on study at each visit.) Use PROC PRINT to produce a partial listing of the data (e.g., the first 20 observations) to make sure that this new variable was created correctly.

2. For a study like this, we want to report on the number of subjects in each group and how many total visits there were per group. How many subjects and how many visits are there for each treatment group? (Hint: use PROC MEANS with two newly created variables. The sum of 1s and 0s at the subject level is equal to the number of subjects. Also, what would the sum of 1s represent if the value 1 is assigned for each observation?)

3.
For analysis, we would now like a new data set with only one row per subject. As part of this data set, we want a variable that gives the length of time that the subject was in the study up to and including the visit when disease progression is first noted (stage=1 if observed). For patients whose disease did not progress during the study (all observations with stage=0), however, this would be their total time from visit 1 until the last visit for that patient. (Hint: to create this variable, you should use the variable created in part 1, and use the first. and last. variables. This is tricky and you may need to try this a few different ways before you find one that works. To be sure that you have it right, check your results on a few subjects who progress and a few who do not.) For the one observation for each patient, only keep the id number, treatment group, follow-up time, and whether disease has progressed or not.

My code to solve all questions:

for question 1

    filename in "E:\longdata.dat";
    data temp;
      infile in;
      input pt_id trt_grp visits time stage;
    run;
    proc sort data=temp; by pt_id; run;
    data temp1;
      set temp;
      by pt_id;
      if first.pt_id then totmnth=0;  /* restart the running total for each patient */
      totmnth+time;                   /* cumulative months on study */
    run;
    proc print data=temp1 (obs=20); run;

for question 2

    data temp1;
      set temp;
      by pt_id;
      if first.pt_id then totmnth=0;
      totmnth+time;
      if first.pt_id then id=1; else id=0;  /* flags one record per subject */
    run;
    /* PROC MEANS with BY trt_grp needs the data sorted by trt_grp; */
    /* sorting into a separate data set keeps temp1 in pt_id order  */
    proc sort data=temp1 out=temp1_grp; by trt_grp; run;
    proc means data=temp1_grp noprint;
      by trt_grp;
      var id visits;
      output out=one sum(id)=totalsubj n=visits;
    run;

for question 3

    data temp2;
      set temp1;
      by pt_id;
      retain lngth;
      if first.pt_id then lngth=0;
      lngth+totmnth;
      if last.pt_id then output;
    run;

I think the code I used for question 3 is wrong, because it does not fulfill what the question asks regarding the stage condition. I have not been able to figure out how to handle the stage condition for question 3.
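One possible way to handle the stage condition is sketched below. It is not a definitive solution. It assumes temp1 already holds the cumulative variable totmnth from question 1. The names temp1_sorted, done, futime, and progressed are hypothetical and introduced only for illustration. The idea is to output a patient's record at the first visit with stage=1, or at the last visit if no progression was ever seen, and to use a retained flag so that each patient is output only once.

```sas
/* Assumption: temp1 contains pt_id, trt_grp, visits, stage, totmnth. */
/* Sort so that visits are in order within each patient.              */
proc sort data=temp1 out=temp1_sorted;
  by pt_id visits;
run;

data temp2 (keep=pt_id trt_grp futime progressed);
  set temp1_sorted;
  by pt_id;
  retain done;                   /* hypothetical flag: 1 once this patient has been output */
  if first.pt_id then done = 0;

  /* Output once per patient: at the first visit with stage=1,    */
  /* or at the last visit when no progression has been observed.  */
  if not done and (stage = 1 or last.pt_id) then do;
    futime     = totmnth;        /* months on study at this visit */
    progressed = stage;          /* 1 = progressed, 0 = never progressed */
    output;
    done = 1;
  end;
run;

/* Spot-check a few patients against their visit-level records */
proc print data=temp2 (obs=10);
run;
```

Note that the running sum lngth+totmnth in the original attempt adds up values that are already cumulative, so the follow-up time in this sketch is taken directly from totmnth at the chosen visit instead.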