BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Anita_n
Pyrite | Level 9

Dear all,

I have a dataset that looks like this

pid tum_typ sex   date_of_diagnosis

1     2           2      10/04/2015

1     2           2       01/06/2010

1     2           2      11/03/2019

2     2          1        09/09/2000

2     2          1        05/08/2018

3     2          1        02/10/2020

4     2          2        11/02/2001

4     2          2        15/12/1998

5     2         1

5     2         1          07/07/2020

5     2         1          06/01/2020

 

I wish to create a new variable called order_of_tum which indicates the 1., 2. 3., etc of tumor a patient has grouped by id and date. The result should look like this :

That means I compare the first and second row and if date 1 is less than date2 then date1 =1. and date2=2.

 

pid tum_typ sex   date_of_diagnosis  order_of_tum

1     2           2      01/06/2010                1

1     2           2      10/04/2015                2

1     2           2      11/03/2019                3

2     2          1        09/09/2000              1

2     2          1        05/08/2018              2

3     2          1        02/10/2020              1

4     2          2        15/12/1998              1

4     2          2        11/02/2001               2

5     2         1          06/01/2020             1

5     2         1          07/07/2020             2

5     2         1                                         3

                           

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

I think this is what you want

 

data have;
infile datalines missover;
input pid tum_typ sex date_of_diagnosis : ddmmyy10.;
format date_of_diagnosis ddmmyy10.;
datalines;
1 2 2 10/04/2015
1 2 2 01/06/2010
1 2 2 11/03/2019
2 2 1 09/09/2000
2 2 1 05/08/2018
3 2 1 02/10/2020
4 2 2 11/02/2001
4 2 2 15/12/1998
5 2 1           
5 2 1 07/07/2020
5 2 1 06/01/2020
;

proc sort data = have;
   by pid date_of_diagnosis;
run;

data want;
   set have;
   by pid date_of_diagnosis;
   if first.pid then order_of_tum = 0;
   if first.date_of_diagnosis then order_of_tum + 1;
run;

View solution in original post

3 REPLIES 3
PeterClemmensen
Tourmaline | Level 20

I think this is what you want

 

data have;
infile datalines missover;
input pid tum_typ sex date_of_diagnosis : ddmmyy10.;
format date_of_diagnosis ddmmyy10.;
datalines;
1 2 2 10/04/2015
1 2 2 01/06/2010
1 2 2 11/03/2019
2 2 1 09/09/2000
2 2 1 05/08/2018
3 2 1 02/10/2020
4 2 2 11/02/2001
4 2 2 15/12/1998
5 2 1           
5 2 1 07/07/2020
5 2 1 06/01/2020
;

proc sort data = have;
   by pid date_of_diagnosis;
run;

data want;
   set have;
   by pid date_of_diagnosis;
   if first.pid then order_of_tum = 0;
   if first.date_of_diagnosis then order_of_tum + 1;
run;
Anita_n
Pyrite | Level 9
Thank you
Anita_n
Pyrite | Level 9

Hi what happens when I have the same dates within a group? In that case don't want it to count further but out the number as previous.

for example if patient 1,  there is a 4th row  with 

1 2 2 11/03/2019

I don't want  to have order_of_tum like (1, 2, 3, 4)  but  (1, 2 ,3 ,3) any solution to that?

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 932 views
  • 0 likes
  • 2 in conversation