Solved: Re: Recovery and Time to Recovery

Tpham · Posted 12-10-2018 05:56 PM

I am having trouble to figure out how to compute these two Variables (Want: Recovered and starting recovery day)

HEre is a sample Data. (ID=3 Day 1 should be N and not M). Yes/No is coded as 1/0 with a format.

So basically, a subject is considered recovered if they it's a Y from one point to the end of the study (Day 45). So in the example above, ID=1 recovered started on the 3rd day and ID=2 on the 4th). While ID=3 didn't recover as they didn't recover day 5 and 6. THere are different patterns, basically, where subjects can recover and unrecover.

definition of recovery: Yes at one timepoint and Yes at all future time points from that first time point examined (so up to Day 45).

I have been trying to think of an easy way to code this, however, the only way I can think about it check every time frame for Y (e.g. day 44-45, then day 43-45.. then 42-45 etc.)

I am sure there is an easier way to do this, but I can't figure it out. I am hoping someone here can help me.

The next part, What is the starting time point for recovery?

data have;
input ID Day1 Day2 Day3 Day4 Day5 Day6;
datalines;
1	0	0	1	1	1	1
2	0	1	0	1	1	1
3	0	1	1	1	0	0
;

novinosrin · Posted 12-10-2018 06:17 PM

Thank you @Tpham

data have;
input ID Day1 Day2 Day3 Day4 Day5 Day6;
datalines;
1	0	0	1	1	1	1
2	0	1	0	1	1	1
3	0	1	1	1	0	0
;

data want;
set have;
array d(*) day6-day1;
if coalesce(of d(*)) then do;
Recovered='YES';
startday=dim(d)-whichn(0,of d(*))+2;
end;
else Recovered='NO';
run;

Going forward, Kindly post in the form of datastep/plain text always. Thank you for honoring the request

View solution in original post

novinosrin · Posted 12-10-2018 06:03 PM

Instead of pic, may i request you to post either in the form of a datastep or plain text to make it interesting for people like to me to easily copy/paste and test

Tpham · Posted 12-10-2018 06:06 PM

Sorry about that. It's posted.

data have;
input ID Day1 Day2 Day3 Day4 Day5 Day6;
datalines;
1	0	0	1	1	1	1
2	0	1	0	1	1	1
3	0	1	1	1	0	0
;

novinosrin · Posted 12-10-2018 06:17 PM

Thank you @Tpham

data have;
input ID Day1 Day2 Day3 Day4 Day5 Day6;
datalines;
1	0	0	1	1	1	1
2	0	1	0	1	1	1
3	0	1	1	1	0	0
;

data want;
set have;
array d(*) day6-day1;
if coalesce(of d(*)) then do;
Recovered='YES';
startday=dim(d)-whichn(0,of d(*))+2;
end;
else Recovered='NO';
run;

Going forward, Kindly post in the form of datastep/plain text always. Thank you for honoring the request

Tpham · Posted 12-10-2018 06:25 PM

Thanks for this, I am starting to breakdown to understand the functions you used and looking them up, as I am not familiar with these functions.

I assuming as with other functions, it would ignore missing data. (missing data is ok as long as the pattern stands)

novinosrin · Posted 12-10-2018 06:26 PM

Missing data as in? Can you modify your sample plz

Tpham · Posted 12-10-2018 06:31 PM

data have;
input ID Day1 Day2 Day3 Day4 Day5 Day6;
datalines;
1	0	0	1	1	1	1
2	0	1	0	1	1	1
3	0	1	1	1	0	0
4	0	0	1	.	1	1
;

Yeah it looks like it does. As ID=4 is missing Day 4 and it returned Day 3 as the start day for recover

Also, can you explain or point me in the right direction on what coalesce(of d(*)) does. I am hoping to learn from this? From looking at your code and reading up on the documentation on this function, it looks like it return the first non-missing value. I am wondering how does the of d(*) indicates the 1?

novinosrin · Posted 12-10-2018 06:35 PM

Hi @Tpham It's actually simple, and I will leave some NOTES shortly once i get home. Right now, I am on the run to catch the CTA Train to get back home. Cya soon!

novinosrin · Posted 12-10-2018 09:57 PM

Hi @Tpham Sorry about the delay:

Here is the logic:

1. First off, I reverse the array grouping from right most to left most.This is to make it convenient to idenitify the last non missing value, reversing it makes it easy to identify latest non missing value.

2. coalesce(of d(*)) results in a boolean expression true(1) if found or false(0 or missing if not found)

3. If found, I basically calculate the position by subtracting from dim i.e total number of elements in array group minus the position at which the last 0 was found, and determing the first position of 1 the follows last 0, that's in essence startingday.

4. Thhe reversal of array is the charm of this solution and everything else is plain math.

5. I've been exposed to many of this kind, so it's easy and I'm sure with time you'll get the speed & intuition.

HTH & Have fun!

Tpham · Posted 12-14-2018 05:28 PM

Thank you for explaining all this. This was a great learning opportunity!!!

PeterClemmensen · Posted 12-10-2018 06:20 PM

data have;
input ID Day1 Day2 Day3 Day4 Day5 Day6;
datalines;
1	0	0	1	1	1	1
2	0	1	0	1	1	1
3	0	1	1	1	0	0
;

data want(drop=i);
   set have;
   array days{*} Day1-Day6;
   Recovered="Yes";
   if Day6=0 then do;
      Recovered="No";
      startingday=.;
   end;
   else do i=dim(days) to 1 by -1;
      if days[i]=0 then do;
         startingday=i+1;
         return;
      end;
   end;
run;

The DATA to DATA Step Macro
Blog: SASnrd

SAS Innovate 2025: Call for Content