MFraga
Quartz | Level 8

Hello,

 

I need to create Kaplan-Meier curves for my analysis, but I am finding inconsistencies when I compare my SAS results with Stata. I exported my data via "Stat/Transfer" and then produced the curves with Stata. In Stata, things look good, but I want to solve this problem and keep using SAS.

My dataset is arranged longitudinally. An example would be:

 

data have;
input id time1 event1 weight;
datalines;
1 0 0 0.8
1 1 0 0.8
1 2 0 0.8
1 3 0 0.8
1 4 0 0.8
1 5 0 0.8
1 6 0 0.8
1 7 0 0.8
1 8 0 0.8
1 9 0 0.8
1 10 0 0.8
1 11 0 0.8
1 12 0 0.8
1 13 0 0.8
2 0 0 1.1
2 1 1 1.1
2 2 . 1.1
3 0 0 1.01
3 1 0 1.01
3 2 1 1.01
3 3 . 1.01
4 0 1 0.98
4 1 . 0.98
4 2 . 0.98
4 3 . 0.98
4 4 . 0.98
5 0 0 1.13
6 0 0 1.05
6 1 0 1.05
6 2 0 1.05
6 3 0 1.05
6 4 0 1.05
6 5 1 1.05
6 6 . 1.05
6 7 . 1.05
6 8 . 1.05
7 0 0 0.89
7 1 0 0.89
7 2 0 0.89
7 3 0 0.89
7 4 0 0.89
7 5 0 0.89
7 6 0 0.89
7 7 0 0.89
7 8 1 0.89
7 9 . 0.89
7 10 . 0.89
8 0 0 1.1
8 1 0 1.1
8 2 0 1.1
8 3 . 1.1
8 4 . 1.1
;
run;

 

I then run the survival analysis like this:

proc lifetest data=have plots(s) graphics notable;
time time1*event1(0);
weight weight;
run;

 

The resulting graph does not show the same proportions as in Stata, where I use the same table and the following code to produce the survival curve:

stset time1 [pweight=weight], id(id) failure(event1=1)
sts graph

Does anyone know how I can make SAS understand that my dataset is arranged longitudinally, and use "id" to control the analysis the way I want? Many thanks in advance!

 

1 ACCEPTED SOLUTION
Reeza
Super User

Which part do you think will be time consuming?

Here's an example of how you can filter your list.

[Image: delete_lifetest.JPG]


@MFraga wrote:

Yes, I know that this could be a possibility, but it will be time consuming. I think I will move to Stata for the analysis. Thanks anyway.


 


10 REPLIES
ballardw
Super User

I don't speak Stata, but you are apparently using the ID variable in the Stata code and not in PROC LIFETEST. What role does it play in Stata?

Also, your LIFETEST code as posted does not run (at least on my system).

This ran for me:

proc lifetest data=have plots=(s)  notable;
   time time1*event1(0);
   weight weight;
run;
MFraga
Quartz | Level 8

Thanks for your answer. Each ID is one individual. My variable of interest is event1, and the time variable is time1. I want to understand how long it takes each individual (each ID) to have an event1. Does this information help?

Reeza
Super User

You need to reduce it to a single line for each individual, so either the event1=1 record or the last record for each ID. 

 

Then you can run PROC LIFETEST and probably get the same results. You can check an example in the documentation for how the data set needs to be structured.

 

https://documentation.sas.com/api/docsets/statug/14.3/content/statug_code_liftex2.htm?locale=en
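
As a rough sketch of the reduction described above (this is not the code Reeza later posted, which survives only as an image), the long data can be collapsed to one row per ID: the first event1=1 record if there is one, otherwise the last record with a non-missing event1, kept as a censored observation. The names `sorted`, `one_per_id` and `done` are illustrative, and dropping the rows where event1 is missing is an assumption about what those rows mean. The thread does not confirm that the weighted curve then matches Stata's pweight result exactly.

```sas
/* Order rows so that, within each ID, time runs forward. */
proc sort data=have out=sorted;
   by id time1;
run;

/* Keep one row per ID: the first event row, or else the
   last row with a non-missing event1 (treated as censored). */
data one_per_id;
   set sorted;
   where event1 is not missing;   /* assumption: missing rows carry no follow-up */
   by id;
   retain done;
   if first.id then done = 0;     /* reset the flag at the start of each ID */
   if not done and (event1 = 1 or last.id) then do;
      output;
      done = 1;                   /* ignore any later rows for this ID */
   end;
   drop done;
run;

/* Same call as ballardw's working version, on the reduced data. */
proc lifetest data=one_per_id plots=(s) notable;
   time time1*event1(0);
   weight weight;
run;
```

Because the WHERE statement filters rows before BY-group processing, `last.id` refers to the last row that still has a usable event1 value, which is what makes the censoring time come out as the end of observed follow-up.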

 



 


 

MFraga
Quartz | Level 8

Yes, I know that this could be a possibility, but it will be time consuming. I think I will move to Stata for the analysis. Thanks anyway.



 

mkeintz
PROC Star
I think you need to sort by ID TIME1, to guarantee that the last record for any ID without an event will have the latest time value.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Reeza
Super User
Yeah, I had modified the data slightly for testing because there's only 8 records otherwise....
MFraga
Quartz | Level 8

It is already sorted by id.

mkeintz
PROC Star

@MFraga

 

The initial proc sort is often provided by forum respondents just to demonstrate the data order required by the subsequent (more interesting and relevant) steps. Often a person starting a topic shows ordered sample data for convenience, only to discover problems when the actual (unsorted) data are used.

If the real data are sorted (by ID and TIME1), then by all means drop the proc sort. Just honor the Socratic dictum: know thy data.
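
One inexpensive way to act on this, offered as a sketch rather than anything posted in the thread: the PRESORTED option of PROC SORT checks whether the data set is already in BY order and only performs the sort when it is not. The data set name `have` comes from the original post. Sorting in place, as done here, is a choice made for brevity.

```sas
/* PRESORTED makes PROC SORT verify the order first and
   skip the actual sort when the rows are already in
   ID, TIME1 sequence. */
proc sort data=have presorted;
   by id time1;
run;
```

The SAS log then reports whether the input was already in sorted order, so the check doubles as documentation of what the data looked like.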

Reeza
Super User
Not sure why the code didn't post before, only have an image now though.
