☑ This topic is solved.
nwang5
Obsidian | Level 7

Hi SAS community,

 

I hope you are doing well. I am investigating risk factors associated with recurrent depression using a longitudinal dataset. The original variables are 'ID' and 'Depression9' through 'Depression15'. My objective is to create a three-category outcome: 'recurrent', 'depression', and 'no depression'. Specifically, 'recurrent' means experiencing depression for a second time.

 

Note that none of the participants had depression at 'Depression9'. I would greatly appreciate your guidance on creating this outcome and calculating survival time, shown in the last two (highlighted) columns below. Thank you for your assistance!

 

ID  Depression9  Depression10  Depression11  Depression12  Depression13  Depression14  Depression15  Outcome        Survival time
1   x            √             x             √             .             .             .             Recurrent      6
2   x            x             √             x             .             √             .             Recurrent      10
3   x            x             x             x             x             x             x             No Depression  12
4   x            .             √             x             √             .             .             Recurrent      8
5   x            .             .             √             √             √             √             Depression     6
6   x            √             .             .             x             √             .             Recurrent      10
7   x            .             √             .             .             x             √             Recurrent      12
8   x            x             .             .             .             x             .             No Depression  10
9   x            .             √             √             x             .             .             Depression     4
10  x            √             x             √             √             x             √             Recurrent      6
nwang5
Obsidian | Level 7
Survival time is defined as the time until recurrent depression occurs, that is, the second instance of depression.
PeterClemmensen
Tourmaline | Level 20

I still don't get it. Why is survival time = 6 in the first obs? And 10 in the second?

nwang5
Obsidian | Level 7
Sorry for the confusion. For the first obs, recurrent depression occurs at Depression12. The time between Depression9 and Depression12 is six years, because consecutive waves are two years apart.
For the second obs, recurrent depression occurs at Depression14, and the time between Depression9 and Depression14 is ten years.
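
The arithmetic described above can be sketched as follows. This is only an illustration, assuming wave 9 is the baseline and consecutive waves are two years apart, as stated in the post; the variable names are made up for the example.

```sas
/* Sketch: survival time = years per wave * (event wave - baseline wave). */
/* baseline_wave, years_per_wave and event_wave are illustrative names.   */
data _null_;
  baseline_wave  = 9;
  years_per_wave = 2;
  do event_wave = 12, 14;            /* obs 1 recurs at wave 12, obs 2 at wave 14 */
    survtime = years_per_wave * (event_wave - baseline_wave);
    put event_wave= survtime=;       /* writes survtime=6, then survtime=10 */
  end;
run;
```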
Amir
PROC Star

Hi,

 

I think I'm missing something here.

 

I have assumed that a diagnosis of depression is indicated with "√". Is this correct? If so, please respond to the points below; otherwise please explain what the symbols in the question's table mean. Thanks.

 

Looking at the table in the question, I see depression for obs 1 marked under Depression10 and Depression12, and for obs 2 under Depression11 and Depression14:

[Screenshot of the question's table: Amir_0-1694291850382.png]

 

 

But in a later post you refer to obs1 having D9 & D12 (table shows D10 & D12) and obs 2 having D9 & D14 (table shows D11 & D14), but I cannot match that description with the original table that was posted:

 

Sorry for the confusion. For the first obs, recurrent depression occurs at Depression12. The time between Depression9 and Depression12 is six years, because consecutive waves are two years apart.
For the second obs, recurrent depression occurs at Depression14, and the time between Depression9 and Depression14 is ten years.

 

Please clarify any misunderstanding I might have.

 

 

Thanks & kind regards,

Amir.

nwang5
Obsidian | Level 7
Yes, a diagnosis of depression is indicated with "√".
The later post that mentions D9 and D12 for obs 1 refers to the time from entering the study (Depression9) until the recurrent depression (Depression12).
mkeintz
PROC Star

 

 

The options in my SAS session convert the "√" character to a "v", which is why the program below tests for 'v':

 

data have;
  input ID (depression9-depression15) (:$1.)   expected_outcome :$&13. exp_Survtime ;
datalines;
1    x    √    x    √    .    .    .    Recurrent       6
2    x    x    √    x    .    √    .    Recurrent      10
3    x    x    x    x    x    x    x    No Depression  12
4    x    .    √    x    √    .    .    Recurrent       8
5    x    .    .    √    √    √    √    Depression      6
6    x    √    .    .    x    √    .    Recurrent      10
7    x    .    √    .    .    x    √    Recurrent      12
8    x    x    .    .    .    x    .    No Depression  10
9    x    .    √    √    x    .    .    Depression      4
10   x    √    x    √    √    x    √    Recurrent       6
run;

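data want (drop=_: d);
  set have;

  /* dep{1}..dep{7} correspond to depression9..depression15 */
  array dep {*} depression: ;

  /* One character per follow-up wave (10-15): 'v'=depressed, 'x'=not depressed, blank=missing */
  _string=cat(of depression10-depression15);
  /* Number of follow-up waves with depression */
  _npositive=lengthn(compress(_string,'x '));

  /* Never depressed (a single positive wave is handled as 'Depression' below) */
  if _npositive=0 then do;
    outcome='No Depression';
    /* LENGTH ignores trailing blanks, so this is the last non-missing wave */
    survtime=2*(length(_string));
  end;
  /* Exactly one unbroken run of 'v': a single episode, timed at its first wave */
  else if countw(_string,'x ')=1 then do;
    outcome='Depression';
    survtime=2*findc(_string,'v');
  end;
  /* Two or more runs of 'v': time runs to the start of the second run */
  else do;
    outcome='Recurrent';
    /* First spec stops at the first 'v'; second spec stops at the next 'v' that begins a new run */
    do d=1 by 1 until (dep{d}='v'), d+1 to dim(dep) until (dep{d}='v' and dep{d-1}^='v');
    end;
    survtime=2*(d-1);
  end;
run;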
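
As a rough aid to reading the program above, the sketch below first shows what the string functions return for ID 1 (waves 10 to 15 read as v, x, v and three missing values), then compares the computed columns with the expected ones stored in HAVE. The names n_positive, n_episodes, first_pos, last_obs, check, outcome_ok and survtime_ok are illustrative and not part of the original program.

```sas
/* Part 1: the string functions applied to ID 1's follow-up waves */
data _null_;
  length _string $6;
  _string    = 'vxv';                              /* padded with three trailing blanks */
  n_positive = lengthn(compress(_string, 'x '));   /* count of 'v' characters: 2 */
  n_episodes = countw(_string, 'x ');              /* runs of 'v' split by x or blank: 2 */
  first_pos  = findc(_string, 'v');                /* position of the first 'v': 1 */
  last_obs   = length(_string);                    /* last non-blank position: 3 */
  put n_positive= n_episodes= first_pos= last_obs=;
run;

/* Part 2: flag rows where the computed values differ from the expected ones */
data check;
  set want;
  outcome_ok  = (outcome = expected_outcome);
  survtime_ok = (survtime = exp_Survtime);
run;

proc freq data=check;
  tables outcome_ok*survtime_ok / list missing;
run;
```

If every row agrees, the PROC FREQ listing contains a single line with both flags equal to 1.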

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
nwang5
Obsidian | Level 7
Thank you so much. I changed "√" to 1 and "x" to 0. I was wondering if there is an easier way to do it. Thanks!
mkeintz
PROC Star

@nwang5 wrote:
Thank you so much. I changed "√" to 1 and "x" to 0. I was wondering if there is an easier way to do it. Thanks!

I've already spent time and effort to convert your data table into a SAS data step once, in order to provide a tested program.  It's not a habit I want to frequently indulge - it takes too much time, and can set unrealistic expectations.

 

Please provide the revised data in the form of a working data step.  That will make it far easier for me to make it easier for you.

nwang5
Obsidian | Level 7

I am sooo sorry. I am not very familiar with my dataset. I will accept your answer as a solution and start a new topic to show my real data. Sorry again.
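
For the 0/1 recoding mentioned earlier in the thread, one possible approach is sketched below. The revised data was never posted here, so the layout is an assumption: depression9-depression15 are taken to be numeric with 1 = depressed, 0 = not depressed and . = missing. The dataset names have_num and want_num, the helper variables, and the three sample rows (IDs 1, 5 and 8 recoded from the original table) are illustrative only.

```sas
/* Assumed (hypothetical) layout: numeric 1/0/. instead of the original symbols */
data have_num;
  input ID depression9-depression15;
datalines;
1 0 1 0 1 . . .
5 0 . . 1 1 1 1
8 0 0 . . . 0 .
;

data want_num (drop=w _:);
  set have_num;
  array dep {9:15} depression9-depression15;   /* index by wave number */
  length outcome $13;

  _episodes = 0;    /* separate depression episodes found so far */
  _first    = .;    /* wave at which the first episode starts */
  _last_obs = 9;    /* last wave with a non-missing value */

  /* Stop as soon as a second episode starts; w then holds that wave */
  do w = 10 to 15 until (_episodes = 2);
    if not missing(dep{w}) then _last_obs = w;
    /* An episode starts when this wave is 1 and the previous wave is not 1 */
    if dep{w} = 1 and dep{w-1} ne 1 then do;
      _episodes = _episodes + 1;
      if _episodes = 1 then _first = w;
    end;
  end;

  if _episodes = 2 then do;
    outcome  = 'Recurrent';
    survtime = 2*(w - 9);
  end;
  else if _episodes = 1 then do;
    outcome  = 'Depression';
    survtime = 2*(_first - 9);
  end;
  else do;
    outcome  = 'No Depression';
    survtime = 2*(_last_obs - 9);
  end;
run;
```

With numeric values the string functions are no longer needed, and a single pass over the waves both counts the episodes and records when each begins. As in the accepted solution, a missing wave between two positive waves is treated as a break between episodes.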
