JBerry
Quartz | Level 8

I have successfully built a survival model in EM which has a K-S of 30 on each of the Train, Validation, and Test partitions, so I assume the model is pretty good.

Here are some facts about it:

- built it on a sample of 500K obs

- unexpanded data

- no time-varying covariates

- Forecasting 36 one-month intervals

- Customer base ranges from tenure of 0 months to 250 months

- No truncation

The curves drawn by the Node Results are very nice. I can see hazard spikes during months that make sense (at 3, 12, 24, and 48 months). The survival curve also looks nice; it descends as I'd expect.

However, now that I've scored the data, I'd like to replicate these curves by querying the results. When I try, the curves look vastly different. When I graph _T_ (tenure of customer) against AVG(EM_SURVEVENT), my curve looks weird; in fact, it even increases along the way!

Is there something wrong with the way I am trying to recreate these charts?

I also tried graphing the instant risk and subhazard functions against _T_, and those did not match the model graphs either, so I'm afraid there is something wrong.

Source of Model Curve:  SAS Survival Node => Results => click chart => Tables button at top of screen

Source of Scored Curve:

SELECT _T_ AS RELATIVE_TENURE,
       AVG(1 - ((EM_SURVIVAL - EM_SURVFCST) / EM_SURVIVAL)) AS S
FROM [Scored Results]
GROUP BY _T_;

M_Maldonado
Barite | Level 11

Hey JBerry,

Not sure I get the second part, especially instant risk. But I am no expert in survival analysis. I use this node a lot, mostly to get hazard functions, but I am still very low on the learning curve.

It sounds to me like you were trying to get the survival function?

[Attached image: survival function.png]

If I were to redo the survival function in EM, I would add something like the code below in a SAS Code node. Notice that your Survival node creates a dataset, _ehcendata, which summarizes events, event dates, and _y_.

You can use that to get the curves you were looking for.

Add the code below to a SAS Code node and connect that node to your Survival node.

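   /* Copy the summary dataset written by the Survival node */
   data ehcendata;
      set &EM_LIB..&EM_METASOURCE_NODEID._ehcendata;
   run;

   /* Life-table estimate; event(0) treats event=0 as censored */
   proc lifetest data=ehcendata method=LT;
      time _y_*event(0);
   run;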

If you ran it in Base SAS you would also get the plots (replace &EM_LIB with your workspace library and &EM_METASOURCE_NODEID with your Survival node ID).
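For anyone curious what method=LT is doing, here is a minimal sketch of the standard life-table (actuarial) arithmetic. It is written in Python only so it can be run anywhere; it is not the code EM or PROC LIFETEST uses internally. The function name life_table, the starting count of 100, and the toy event/censor counts are all made up for illustration.

```python
def life_table(n_start, counts, width=1.0):
    """Actuarial life-table sketch.

    n_start : number of subjects at risk at time 0 (hypothetical input)
    counts  : list of (events, censored) per interval, in time order
    width   : interval width (1.0 = one month in this toy example)
    """
    rows = []
    at_risk = n_start
    surv = 1.0  # survival at the start of the first interval
    for i, (events, censored) in enumerate(counts):
        # Censored subjects are assumed to be at risk for half the interval.
        effective = at_risk - censored / 2.0
        # Conditional probability of the event within this interval.
        q = events / effective if effective > 0 else 0.0
        # Hazard estimated at the interval midpoint.
        denom = width * (effective - events / 2.0)
        hazard = events / denom if denom > 0 else float("nan")
        rows.append({
            "start": i * width,
            "at_risk": at_risk,
            "survival_at_start": surv,
            "cond_event_prob": q,
            "hazard_midpoint": hazard,
        })
        # Survival is a running product of (1 - q), so it can never increase.
        surv *= 1.0 - q
        at_risk -= events + censored
    return rows


# Toy data: 100 subjects, three one-month intervals of (events, censored).
rows = life_table(100, [(10, 0), (9, 0), (8, 2)])
for r in rows:
    print(r)
```

Because the survival estimate here is a running product of (1 - conditional event probability), it can only stay flat or go down. A per-tenure average of scored values is a different calculation, which may be part of why that curve does not have to be monotone.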

[Attached image: SurvivalPlot1.png]
