Solved: Re: How to create a new variable and how to calculate survival time for recurrent disease?

nwang5 · Posted 09-09-2023 02:31 PM

Hi SAS community,

I hope you are doing well. I am investigating risk factors associated with recurrent depression using a longitudinal dataset. The original variables are 'ID' and 'Depression9-Depression15'. My objective is to create a three-category outcome: 'recurrent,' 'depression,' and 'no depression.' Specifically, 'recurrent' should be defined as experiencing depression for the second time.

Note that none of the participants had depression at 'Depression9'. I would greatly appreciate your guidance on creating this outcome and calculating survival time, shown in the last two columns of the table below. Thank you for your assistance!

ID	Depression9	Depression10	Depression11	Depression12	Depression13	Depression14	Depression15	outcome	Survival time
1	x	√	x	√	.	.	.	Recurrent	6
2	x	x	√	x	.	√	.	Recurrent	10
3	x	x	x	x	x	x	x	No Depression	12
4	x	.	√	x	√	.	.	Recurrent	8
5	x	.	.	√	√	√	√	Depression	6
6	x	√	.	.	x	√	.	Recurrent	10
7	x	.	√	.	.	x	√	Recurrent	12
8	x	x	.	.	.	x	.	No Depression	10
9	x	.	√	√	x	.	.	Depression	4
10	x	√	x	√	√	x	√	Recurrent	6

mkeintz · Posted 09-09-2023 11:40 PM

The options in my SAS session convert the "√" character to a "v" in the program below:

data have;
  input ID (depression9-depression15) (:$1.)   expected_outcome :$&13. exp_Survtime ;
datalines;
1    x    √    x    √    .    .    .    Recurrent       6
2    x    x    √    x    .    √    .    Recurrent      10
3    x    x    x    x    x    x    x    No Depression  12
4    x    .    √    x    √    .    .    Recurrent       8
5    x    .    .    √    √    √    √    Depression      6
6    x    √    .    .    x    √    .    Recurrent      10
7    x    .    √    .    .    x    √    Recurrent      12
8    x    x    .    .    .    x    .    No Depression  10
9    x    .    √    √    x    .    .    Depression      4
10   x    √    x    √    √    x    √    Recurrent       6
run;

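data want (drop=_: d);
  set have;

  /* dep{1}=depression9 ... dep{7}=depression15 */
  array dep {*} depression: ;

  /* Follow-up waves 10-15 as one 6-character string; missing waves are blanks */
  _string=cat(of depression10-depression15);
  /* Number of waves with depression: strip 'x' and blanks, count what is left */
  _npositive=lengthn(compress(_string,'x '));

  /* No 'v' at all (the original test, _npositive<=1, would also have
     classified a single depressed wave as 'No Depression') */
  if _npositive=0 then do;
    outcome='No Depression';
    /* Follow-up ends at the last observed wave; LENGTH ignores trailing blanks */
    survtime=2*(length(_string));
  end;
  /* One unbroken run of 'v' is a single "word" when 'x' and blank are delimiters */
  else if countw(_string,'x ')=1 then do;
    outcome='Depression';
    /* Time to the first wave of that single episode */
    survtime=2*findc(_string,'v');
  end;
  else do;
    outcome='Recurrent';
    /* First clause stops at the first 'v'; the second clause resumes at the next
       wave and stops at the next 'v' that does not directly follow a 'v' */
    do d=1 by 1 until (dep{d}='v'), d+1 to dim(dep) until (dep{d}='v' and dep{d-1}^='v');
    end;
    /* d indexes from depression9, and waves are two years apart */
    survtime=2*(d-1);
  end;
run;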
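
For readers who find the two-clause DO loop hard to follow, the same idea can be sketched with an explicit scan over the waves. This is only an illustration, not the tested program above: the dataset names have_demo and want_demo and the underscore-prefixed helper variables are made up for the example, the check mark is typed directly as 'v', and only four of the ten sample rows are included. It also assumes that a participant with no answers after wave 9 should get a survival time of 0, a case the thread does not cover.

```sas
/* Demo rows copied from the question, with 'v' typed in place of the check mark */
data have_demo;
  input ID (depression9-depression15) (:$1.);
datalines;
1 x v x v . . .
2 x x v x . v .
3 x x x x x x x
5 x . . v v v v
;

data want_demo (drop=w _:);
  set have_demo;
  array dep {9:15} $ depression9-depression15;  /* index = wave number */
  length outcome $13;

  _episodes = 0;  /* separate depression episodes found so far */
  _first = .;     /* wave at which the first episode starts */
  _second = .;    /* wave at which the second episode starts */
  _last = 9;      /* last wave with an answer (assumes wave 9 is always observed) */

  do w = 10 to 15;
    if not missing(dep{w}) then _last = w;
    /* an episode starts at a 'v' whose previous wave is not 'v' */
    if dep{w} = 'v' and dep{w-1} ne 'v' then do;
      _episodes = _episodes + 1;
      if _episodes = 1 then _first = w;
      else if _episodes = 2 then _second = w;
    end;
  end;

  /* waves are two years apart, so time = 2 * (wave - 9) */
  if _episodes = 0 then do;
    outcome = 'No Depression';
    survtime = 2 * (_last - 9);
  end;
  else if _episodes = 1 then do;
    outcome = 'Depression';
    survtime = 2 * (_first - 9);
  end;
  else do;
    outcome = 'Recurrent';
    survtime = 2 * (_second - 9);
  end;
run;
```

On these four rows the sketch gives Recurrent/6, Recurrent/10, No Depression/12 and Depression/6, which matches the expected columns in the question.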

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

PeterClemmensen · Posted 09-09-2023 03:20 PM

How is survival time defined?

The DATA to DATA Step Macro
Blog: SASnrd

nwang5 · Posted 09-09-2023 03:26 PM

Survival time is defined as the time to the occurrence of recurrent depression, specifically the second instance of depression.

PeterClemmensen · Posted 09-09-2023 03:42 PM

I still don't get it. Why is survival time = 6 in the first obs? And 10 in the second?

nwang5 · Posted 09-09-2023 04:12 PM

Sorry for the confusion. For the first obs, the recurrent depression occurs at Depression12. The time between Depression9 and Depression12 is six years, because consecutive waves are two years apart.
For the second obs, the recurrent depression occurs at Depression14, and the time between Depression9 and Depression14 is ten years.

Amir · Posted 09-09-2023 04:55 PM

Hi,

I think I'm missing something here.

I have assumed that a diagnosis of depression is indicated with "√". Is this correct? If so, please respond to the points below; otherwise, please explain what the symbols in the question's table mean. Thanks.

Looking at the table in the question, I see depression for obs 1 marked under Depression10 and Depression12, and for obs 2 under Depression11 and Depression14.

In a later post, however, you refer to obs 1 as D9 & D12 (the table shows D10 & D12) and obs 2 as D9 & D14 (the table shows D11 & D14), and I cannot match that description with the table originally posted:

@nwang5 wrote:
Sorry for the confusion. For the first obs, the recurrent depression occurs at Depression12. The time between Depression9 and Depression12 is six years, because consecutive waves are two years apart.
For the second obs, the recurrent depression occurs at Depression14, and the time between Depression9 and Depression14 is ten years.

Please clarify any misunderstanding I might have.

Thanks & kind regards,

Amir.

nwang5 · Posted 09-09-2023 06:03 PM

Yes, a diagnosis of depression is indicated with "√."
When I referred to D9 & D12 for obs 1 in the later post, I meant the time from when the participant entered the study (Depression9) to when the recurrent depression occurred (Depression12).

nwang5 · Posted 09-10-2023 12:34 PM

Thank you so much. I recoded "√" as 1 and "x" as 0. Is there an easier way to do it with the recoded data? Thanks!

mkeintz · Posted 09-10-2023 01:32 PM

@nwang5 wrote:
Thank you so much. I recoded "√" as 1 and "x" as 0. Is there an easier way to do it with the recoded data? Thanks!

I've already spent time and effort to convert your data table into a SAS data step once, in order to provide a tested program. It's not a habit I want to frequently indulge - it takes too much time, and can set unrealistic expectations.

Please provide the revised data in the form of a working data step. That will make it far easier for me to make it easier for you.

nwang5 · Posted 09-10-2023 01:43 PM

I am sooo sorry. I am not very familiar with my dataset. I will accept your answer as a solution and start a new topic to show my real data. Sorry again.
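
The recoded data (1 for "√", 0 for "x") was never posted in this thread, so the following is only a sketch under assumed names: numeric variables dep9-dep15, with unobserved waves stored as missing (.). The dataset names have_num and want_num and the three n_ variables are hypothetical. It shows a working data step for such data and why numeric flags are simpler to handle: the counts come from SUM and N, with no string functions needed.

```sas
/* Hypothetical recoded data: 1 = depression, 0 = none, . = wave not observed */
data have_num;
  input ID dep9-dep15;
datalines;
1 0 1 0 1 . . .
3 0 0 0 0 0 0 0
9 0 . 1 1 0 . .
;

data want_num (drop=w);
  set have_num;
  array dep {9:15} dep9-dep15;           /* index = wave number */

  n_positive = sum(0, of dep10-dep15);   /* follow-up waves with depression */
  n_observed = n(of dep10-dep15);        /* follow-up waves with any answer */

  /* an episode starts at a 1 whose previous wave is not 1 */
  n_episodes = 0;
  do w = 10 to 15;
    if dep{w} = 1 and dep{w-1} ne 1 then n_episodes = n_episodes + 1;
  end;
run;
```

For these three rows n_episodes is 2, 0 and 1, so the outcome would follow as 'Recurrent' (two or more episodes), 'No Depression' (none) and 'Depression' (exactly one).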
