Solved: Re: Clinical Trial Data Dataset and Variables

anissak1 · Posted 03-18-2020 12:17 PM

Hello. I'm working on a clinical trial dataset and am looking for help as I try to analyze the data. I've done a fair amount with my (rudimentary) SAS skills, but have hit a few roadblocks. For 2465 patients across several clinical trials, I have a timed walk measure in seconds (~48000 observations). The measure is taken twice per visit (order is not consequential), and different patients have different numbers of visits. The 2 measurements at each visit should be averaged to create a single value. An abbreviated, simple dataset is attached: USUBJID, visit date and walk time. I am seeking:

1) A dataset that has only the average time for each patient for each visit and the date for the visit. Long form. I have created average values for the baseline visit, but can't figure out how to do this across all visits.

2) Variables indicating whether the walk time increases for each patient, the amount of increase, and the date this occurred. Wide form, one record per USUBJID. An increase is a value at any visit after baseline that is more than 20% greater than baseline. If it happens multiple times, I only need the first occurrence. Again, since the visit dates are different for each patient, I am not sure how to approach this.

3) A variable indicating a confirmed increase in walk time for each USUBJID. Wide form. This gets a bit complicated. The variable would be "positive" if there is an increase at any visit after baseline that is more than 20% greater than baseline AND is sustained. Sustained means that a value equal to or greater than the increased value is confirmed >= 3 months after the initial increase and that there are no decreases at the visits in between.

4) Isolate the last visit outcome (average of two measures) for each patient and the associated date of the last visit.

I know this is a lot, so if anyone wants to help me tackle any specific aspect, I'd be so very appreciative! So much learning...:) Thank you.

ballardw · Posted 03-18-2020 12:48 PM

1) A dataset that has only the average time for each patient for each visit and the date for the visit. Long form. I have created average values for the baseline visit, but can't figure out how to do this across all visits.

Summary for EACH visit or "across all visits"? Not typically the same thing.

This would give a mean for each visit.

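/* NWAY keeps only the rows for each USUBJID*studyday combination */
proc summary data=have nway;
   /* MISSING keeps records with a missing studyday as their own group */
   class USUBJID studyday / missing;
   var walktime;
   /* MEAN= with no name stores the mean under the original name, walktime */
   output out=work.daysummary (drop=_type_ _freq_) mean=;
run;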

Also, I looked at your data and your visit information seems to be "study day", but it has missing values. Since one missing value generally cannot be told apart from another, if a subject has more than one visit with a missing study day, those visits will all be averaged together by the code above. This is also going to affect your question about "increases".

It might help to show an example of what you think that data set might look like.

A relatively robust way to determine change in walk time over the study days would be a regression of some sort.

Show us what you attempt with the summarized data using an approach like above.

There are lots of examples on this forum of using a data step to compare sequences of values to the FIRST value for an identifier. (Hint: By group and First.usubjid ).

Show us what you try for your second and third bits. They would be done in the same data step, I think.
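
As a starting point for the second bit, here is a minimal sketch of the baseline comparison. It assumes the work.daysummary output from the PROC SUMMARY step, that the earliest non-missing studyday for a subject is the baseline visit, and that records with a missing studyday can be set aside for now. The names work.bystudyday, work.firstincrease, baseline, incr_flag, incr_amount and incr_day are made up for illustration.

```sas
/* Assumption: drop records with missing studyday until that issue is resolved */
proc sort data=work.daysummary (where=(not missing(studyday)))
     out=work.bystudyday;
   by USUBJID studyday;
run;

data work.firstincrease (keep=USUBJID baseline incr_flag incr_amount incr_day);
   set work.bystudyday;
   by USUBJID;
   /* RETAIN carries these values across the records of one subject */
   retain baseline incr_flag incr_amount incr_day;
   if first.USUBJID then do;
      /* Assumption: first record per subject is the baseline visit */
      baseline    = walktime;
      incr_flag   = 0;
      incr_amount = .;
      incr_day    = .;
   end;
   /* Record only the first visit that is more than 20% above baseline */
   else if incr_flag = 0 and baseline > 0 and walktime > 1.2 * baseline then do;
      incr_flag   = 1;
      incr_amount = walktime - baseline;
      incr_day    = studyday;
   end;
   /* One record per subject */
   if last.USUBJID then output;
run;
```

The third bit (a sustained increase) would need extra retained variables for the visits after incr_day, so it is not attempted here.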

Since you are missing studyday for some you may have issues with your #4. Your example data for the first subject seems to imply that the missing day is the last day. But the same data step may do it once you get the study day issue resolved.
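
For #4, a sketch along the same lines might look like this once the study day issue is settled. It assumes the work.daysummary output from the PROC SUMMARY step and, purely for illustration, sets aside records with a missing studyday; if a missing day really is the last visit, this would pick the wrong record. The names work.bystudyday, work.lastvisit, last_walktime and last_studyday are hypothetical.

```sas
/* Assumption: records with missing studyday are excluded for now */
proc sort data=work.daysummary (where=(not missing(studyday)))
     out=work.bystudyday;
   by USUBJID studyday;
run;

data work.lastvisit (rename=(walktime=last_walktime studyday=last_studyday));
   set work.bystudyday;
   by USUBJID;
   /* Subsetting IF: keep only the last record of each subject */
   if last.USUBJID;
run;
```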

Here is a quick example of first and last processing and how to keep a value available from the first occurrence of an identifier.

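/* BY-group processing needs the data sorted by the BY variable */
proc sort data=sashelp.class
     out=work.class;
   by sex age;
run;

data work.example;
   set work.class;
   by sex;
   /* RETAIN keeps firstage on every later record of the same group */
   retain firstage;
   /* first.sex is 1 on the first record of each sex group */
   if first.sex then do;
      firstflag = catx('_','First',sex);
      firstage  = age;
   end;

   /* last.sex is 1 on the last record of each sex group */
   if last.sex then lastflag = catx('_','Last',sex);
run;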

It should be trivial to compare the firstage to the age of any given record in the above example. You should have the SASHELP.CLASS data set available for experimentation, and it is small so easy to examine results.
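
For instance, the comparison could be sketched like this on the same SASHELP.CLASS data. The output data set work.compare and the variables age_diff and above_20pct are names invented for this illustration, and the 20% cutoff is only borrowed from the question.

```sas
proc sort data=sashelp.class
     out=work.class;
   by sex age;
run;

data work.compare;
   set work.class;
   by sex;
   retain firstage;
   if first.sex then firstage = age;
   /* Difference between this record and the first record in the group */
   age_diff = age - firstage;
   /* 1 if this age is more than 20% above the first age in the group, else 0 */
   above_20pct = (age > 1.2 * firstage);
run;
```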
