anissak1
Obsidian | Level 7

Hello.  I'm working on a clinical trial dataset and am soliciting help as I try to analyze the data.   I've done a fair amount with my (rudimentary) SAS skills, but have hit a few roadblocks.  From 2465 patients in several clinical trials, I have a timed walk measure in seconds (~48000 observations).  This measure is taken twice per visit (order is not consequential), and different patients have different numbers of visits.  The 2 outcomes at each visit should be averaged to create a single value.  Abbreviated, and simple, dataset attached: USUBJID, visit date and walk time.  I am seeking:

 

1) A dataset that has only the average time for each patient for each visit and the date for the visit.  Long form.  I have created average values for the baseline visit, but can't figure out how to do this across all visits.

2) Variables indicating whether the walk time increases for each patient, the amount of increase, and the date this occurred.  Broad form - one record by USUBJID.  Would be represented by an increase at any visit after baseline that is greater than 20% compared to baseline.  If it happens multiple times, I only need the first occurrence.  Again, since the dates for visits are all different for each patient, I am not sure how to approach.

3) To indicate a confirmed increase in walk times for each USUBJID.  Broad form.  This gets a bit complicated.  This variable would be "positive" if there is an increase at any visit after baseline that is greater than 20% compared to baseline AND is sustained.  Being sustained means that a walk time greater than or equal to the increased value is confirmed >= 3 months after the initial increase and that there are no decreases during interval visits.  

4) Isolate the last visit outcome (average of two measures) for each patient and the associated date of the last visit.

 

I know this is a lot, so if anyone wants to help me tackle any specific aspect, I'd be so very appreciative!  So much learning...:)  Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

1) A dataset that has only the average time for each patient for each visit and the date for the visit.  Long form.  I have created average values for the baseline visit, but can't figure out how to do this across all visits.

Summary for EACH visit or "across all visits"? Not typically the same thing.

This would give a mean for each visit.

proc summary data=have nway;
   class USUBJID studyday/ missing  ;
   var walktime;
   output out=work.daysummary (drop= _type_ _freq_) mean=;
run;

Also, I looked at your data and your visit information seems to be "study day", but it has missing values. One missing value cannot be told apart from another, so if a subject has more than one visit with a missing study day, the above code will average all of those visits together (the MISSING option treats missing as a single class level). This is also going to affect your question about "increases".
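One possible way to handle this, sketched under the assumption that records with a missing studyday are simply dropped (whether that is acceptable depends on the study). The data set and variable names (have, USUBJID, studyday, walktime) are the ones used above; work.daysummary_nomiss and n_missing_day are names made up for this illustration.

```sas
/* Count how many records have no study day (n_missing_day is an illustrative name) */
proc sql;
   select count(*) as n_missing_day
   from have
   where missing(studyday);
quit;

/* Same summary as above, restricted to records with a known study day.
   Assumption: dropping records with a missing day is acceptable here.  */
proc summary data=have nway;
   where not missing(studyday);
   class USUBJID studyday;
   var walktime;
   output out=work.daysummary_nomiss (drop= _type_ _freq_) mean=;
run;
```

With the WHERE statement in place the MISSING option is no longer needed on the CLASS statement, because no missing class levels remain.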

It might help to show an example of what you think that data set might look like.

 

A relatively robust way to determine change in walk time over the study days would be a regression of some sort.
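As a rough illustration of the regression idea, here is one simple form it could take: a separate straight-line fit of walk time on study day for each subject. This is only a sketch, and the model choice is an assumption, not a recommendation for this trial. It assumes work.daysummary from the PROC SUMMARY step above; work.visits_reg and work.slopes are made-up names.

```sas
/* BY-group processing needs sorted data; drop visits with no study day */
proc sort data=work.daysummary out=work.visits_reg;
   where not missing(studyday);
   by usubjid studyday;
run;

/* One linear fit per subject. In the OUTEST= data set the variable
   STUDYDAY holds the estimated slope (change in seconds per study day). */
proc reg data=work.visits_reg outest=work.slopes noprint;
   by usubjid;
   model walktime = studyday;
run;
quit;
```

Subjects with only one visit cannot give a usable slope, so check the log and work.slopes for those cases.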

 

Show us what you attempt with the summarized data using an approach like above.

There are lots of examples on this forum of using a data step to compare sequences of values to the FIRST value for an identifier. (Hint: By group and First.usubjid ).

Show us what you try for your second and third bits. They would be done in the same data step I think.
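To make the hint concrete, here is a minimal sketch of what such a data step might look like. It is not a definitive implementation and rests on several assumptions: work.daysummary has one row per USUBJID and studyday; the earliest studyday is the baseline visit; studyday is a numeric day count; "3 months" is taken as 90 days; and "no decreases" means no interval visit falls below the walk time at the first increase. Only the first increase is tested for confirmation. All new variable names (basetime, inc_flag, inc_day, inc_time, inc_amount, confirmed, dropped) are made up for illustration.

```sas
proc sort data=work.daysummary out=work.visits;
   where not missing(studyday) and not missing(walktime);
   by usubjid studyday;
run;

data work.increase (keep=usubjid basetime inc_flag inc_day inc_time inc_amount confirmed);
   set work.visits;
   by usubjid;
   retain basetime inc_flag inc_day inc_time inc_amount confirmed dropped;

   if first.usubjid then do;
      basetime   = walktime;   /* assumption: first visit is baseline */
      inc_flag   = 0;
      inc_day    = .;
      inc_time   = .;
      inc_amount = .;
      confirmed  = 0;
      dropped    = 0;
   end;
   else if inc_flag = 0 then do;
      /* first visit that is more than 20% above baseline */
      if walktime > 1.2 * basetime then do;
         inc_flag   = 1;
         inc_day    = studyday;
         inc_time   = walktime;
         inc_amount = walktime - basetime;
      end;
   end;
   else if confirmed = 0 and dropped = 0 then do;
      /* visits after the first increase */
      if walktime < inc_time then dropped = 1;               /* fell back: not sustained */
      else if studyday - inc_day >= 90 then confirmed = 1;   /* held for 90+ days */
   end;

   if last.usubjid then output;   /* one record per subject */
run;
```

If your definition of "sustained" differs (for example, comparing interval visits to the 20% threshold rather than to the increased value), the two conditions in the last block are the ones to change.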

 

Since you are missing studyday for some you may have issues with your #4. Your example data for the first subject seems to imply that the missing day is the last day. But the same data step may do it once you get the study day issue resolved.
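For reference, a sketch of how the last visit could be picked out once the study day issue is settled. It assumes work.daysummary from above and that visits with a missing studyday are excluded; work.visits_last, work.lastvisit, last_day and last_time are names made up for this example.

```sas
proc sort data=work.daysummary out=work.visits_last;
   where not missing(studyday);
   by usubjid studyday;
run;

data work.lastvisit;
   set work.visits_last;
   by usubjid;
   if last.usubjid;   /* subsetting IF: keep only the final visit per subject */
   rename studyday=last_day walktime=last_time;
run;
```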

 

Here is a quick example of first and last processing and how to keep a value available from the first occurrence of an identifier.

proc sort data=sashelp.class
     out=work.class;
   by sex age;
run;

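data work.example;
   set work.class;
   by sex;
   /* RETAIN keeps firstage from one record to the next */
   retain firstage;
   if first.sex then do;
      /* first record of each sex group */
      firstflag= catx('_','First',sex);
      firstage = age;
   end;

   /* last record of each sex group */
   if last.sex then Lastflag = catx('_','Last',sex);
run;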

It should be trivial to compare the firstage to the age of any given record in the above example. You should have the SASHELP.CLASS data set available for experimentation, and it is small so easy to examine results.
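For instance, a comparison along those lines might look like the sketch below. It uses the same SASHELP.CLASS data; work.class_sorted, work.compare, age_diff and older_20pct are illustrative names, and the 20% rule is borrowed from the question only to show the pattern.

```sas
proc sort data=sashelp.class
     out=work.class_sorted;
   by sex age;
run;

data work.compare;
   set work.class_sorted;
   by sex;
   retain firstage;
   if first.sex then firstage = age;
   age_diff    = age - firstage;          /* difference from the first record in the group */
   older_20pct = (age > 1.2 * firstage);  /* 1 if more than 20% above the first age, else 0 */
run;
```

The same pattern carries over to walk times: replace sex with usubjid, age with the visit mean, and sort by usubjid and study day.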

 

 


3 REPLIES 3
anissak1
Obsidian | Level 7
Thank you so much for the guidance! Some comments back.

I will eliminate missing dates. 1700/48000. I already was able to impute 4000 dates but am at my limit for those remaining.

I will play with what guidance you have provided and will get back to you/the community based on your suggestions. Probably will take me a day or so.

As to a regression, I will run one, but I also need an indicator variable for increases as I am doing a survival analysis (on a different variable) that needs two populations of people based on whether or not their walk times increased. And this approach of measuring increases is a conventional way to define it (though it may not be the most statistically efficient way). 🙂

Thanks again! I feel encouraged to proceed. 🙂

Anissa
anissak1
Obsidian | Level 7

Thank you!  One and four are done!  Will try to tackle 2 and 3 today. Will advise on any roadblocks I hit.

Am I posting correctly and should I reply to this message thread or start a new post?  As I'm new to the community, any and all advice is welcome. 🙂
