Objective
Obtain a final dataset with one row per unique customer ID, plus 36 columns giving the subhazard1 probability in each of the next 36 periods, and another 36 columns giving the subhazard2 probability.
Using the Survival node, I can obtain the probability at period +36 just fine, but not for the 35 periods in between.
My setup
I have an unexpanded dataset in which each row is a unique ID. Each row has a start_date, an end_date, and an indicator y: 0 (still active), 1 (voluntary end), or 2 (involuntary end). If y = 0, the end date is blank. There are also 16 numeric explanatory variables with no missing values. I am running the data through Enterprise Miner like this: [Data Source] --> [Data Partition] --> [Survival]
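As a sanity check on this layout, a minimal sketch of how the observed duration in months could be derived outside the Survival node. The table name work.train is hypothetical, the censor date is assumed to be the end of the training time range, and start_date and end_date are assumed to be numeric SAS dates:

```sas
/* Sketch only: work.train and the censor date are assumptions. */
data work.duration_check;
  set work.train;                  /* hypothetical name for the training table */
  censor_date = '31May2015'd;      /* assumed: end of the training time range */
  format censor_date date9.;
  /* Month boundaries crossed between start and end;
     rows with a blank end_date are measured to the censor date. */
  months_observed = intck('month', start_date, coalesce(end_date, censor_date));
run;

/* Count of active, voluntary, and involuntary rows */
proc freq data=work.duration_check;
  tables y;
run;

/* Range of observed durations by outcome */
proc means data=work.duration_check min median max;
  class y;
  var months_observed;
run;
```

This only inspects the input; it does not reproduce what the Survival node does internally.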
[Survival] Node Settings:
Data Format: Standard
Time Interval: Month
Left-Truncated: No
Left Training Time Range: 1/1/1995 - 5/31/2015
Sampling: No
Covariate x Time Interactions: Do Not Include
Survival Validation Method: Default
Mean Residual Life: None
Default Forecast Intervals: No
Number of Forecast Intervals: 36
Training Data Example:
id | start_date | end_date | y | x1 | x2 | x3 | x14 | x15 | x16 |
---|---|---|---|---|---|---|---|---|---|
100010 | 01May2010 | . | 0 | -0.993465152 | -0.066721505 | -0.524761341 | 0.851112375 | -0.200641128 | -0.026184704 |
100024 | 01Jul2009 | 18Oct2012 | 1 | 0.640795598 | 0.510798998 | 0.313453708 | 0.851112375 | -0.415644559 | -0.026184704 |
100090 | 01Oct2002 | . | 0 | -1.313670924 | 1.145826894 | 0.313453708 | 0.851112375 | -0.531334657 | -0.026184704 |
100170 | 01Jul2010 | . | 0 | 1.015422702 | 0.765797739 | -1.498333852 | 0.851112375 | -0.657759972 | -0.026184704 |
100182 | 01Jun2003 | 12Mar2013 | 1 | 0.2238052 | -0.558561508 | -0.4076856 | -0.561356819 | -0.200641128 | -0.026184704 |
100218 | 01Dec2002 | . | 0 | -0.2292475 | -0.558561508 | 0.072210717 | -0.561356819 | 0.153832731 | -0.026184704 |
100286 | 01May2006 | . | 0 | -0.361361614 | -0.558561508 | 0.313453708 | -0.561356819 | 0.608860331 | -0.026184704 |
100304 | 01Oct2008 | 12Aug2014 | 1 | -0.143914825 | -0.558561508 | -0.524761341 | -0.561356819 | -0.657759972 | -0.026184704 |
100316 | 01Oct2008 | 20Apr2014 | 1 | -0.2292475 | 0.765797739 | 0.313453708 | 0.851112375 | -0.038251766 | -0.026184704 |
100340 | 01Jun2010 | . | 0 | -0.143914825 | -1.669116518 | -0.524761341 | -0.561356819 | -0.200641128 | -0.026184704 |
100418 | 01Jul2009 | 09Nov2012 | 1 | 0.450310375 | 0.269509137 | -0.4076856 | -0.561356819 | 0.006216554 | -0.026184704 |
100440 | 21Oct2010 | 22Apr2014 | 2 | 0.640795598 | -0.558561508 | 0.072210717 | -0.561356819 | 0.608860331 | -0.026184704 |
100444 | 01Sep2008 | . | 0 | -0.454616577 | 0.765797739 | -0.4076856 | -0.561356819 | 0.476907328 | -0.026184704 |
100458 | 01Oct2000 | . | 0 | -0.7113196 | -0.066721505 | 0.313453708 | -0.561356819 | 0.153832731 | -0.026184704 |
100460 | 01Dec2004 | . | 0 | -0.272496522 | -0.066721505 | 0.072210717 | 0.851112375 | 0.153832731 | -0.026184704 |
100520 | 01Mar2004 | . | 0 | -0.822476278 | -1.596288431 | 0.313453708 | -0.561356819 | -0.200641128 | -0.026184704 |
100564 | 01Feb2009 | . | 0 | 0.960178211 | -0.066721505 | -0.524761341 | -0.561356819 | 0.608860331 | -0.026184704 |
100578 | 01Apr2009 | . | 0 | -1.494276815 | 0.269509137 | 0.313453708 | -0.561356819 | -0.518154111 | -0.026184704 |
100606 | 01Jul2001 | . | 0 | -0.2292475 | -1.519998515 | 0.313453708 | -0.561356819 | 0.608860331 | -0.026184704 |
100626 | 01Aug2003 | 02Jul2012 | 1 | 0.332638254 | 0.510798998 | 0.072210717 | 0.851112375 | 0.153832731 | -0.026184704 |
Data for Scoring Example:
id | start_date | end_date | x1 | x2 | x3 | x14 | x15 | x16 |
---|---|---|---|---|---|---|---|---|
100010 | 01May2010 | . | -0.993465152 | -0.066721505 | -0.524761341 | 0.851112375 | -0.200641128 | -0.026184704 |
100090 | 01Oct2002 | . | -1.313670924 | 1.145826894 | 0.313453708 | 0.851112375 | -0.531334657 | -0.026184704 |
100170 | 01Jul2010 | . | 1.015422702 | 0.765797739 | -1.498333852 | 0.851112375 | -0.657759972 | -0.026184704 |
100218 | 01Dec2002 | . | -0.2292475 | -0.558561508 | 0.072210717 | -0.561356819 | 0.153832731 | -0.026184704 |
100286 | 01May2006 | . | -0.361361614 | -0.558561508 | 0.313453708 | -0.561356819 | 0.608860331 | -0.026184704 |
100340 | 01Jun2010 | . | -0.143914825 | -1.669116518 | -0.524761341 | -0.561356819 | -0.200641128 | -0.026184704 |
100444 | 01Sep2008 | . | -0.454616577 | 0.765797739 | -0.4076856 | -0.561356819 | 0.476907328 | -0.026184704 |
100458 | 01Oct2000 | . | -0.7113196 | -0.066721505 | 0.313453708 | -0.561356819 | 0.153832731 | -0.026184704 |
100460 | 01Dec2004 | . | -0.272496522 | -0.066721505 | 0.072210717 | 0.851112375 | 0.153832731 | -0.026184704 |
100520 | 01Mar2004 | . | -0.822476278 | -1.596288431 | 0.313453708 | -0.561356819 | -0.200641128 | -0.026184704 |
100564 | 01Feb2009 | . | 0.960178211 | -0.066721505 | -0.524761341 | -0.561356819 | 0.608860331 | -0.026184704 |
100578 | 01Apr2009 | . | -1.494276815 | 0.269509137 | 0.313453708 | -0.561356819 | -0.518154111 | -0.026184704 |
100606 | 01Jul2001 | . | -0.2292475 | -1.519998515 | 0.313453708 | -0.561356819 | 0.608860331 | -0.026184704 |
100638 | 01Mar2008 | . | -0.361361614 | -0.558561508 | -0.524761341 | -0.561356819 | 0.153832731 | -0.026184704 |
100668 | 01Jan2008 | . | 0.2238052 | -0.558561508 | -0.524761341 | -0.561356819 | -0.200641128 | -0.026184704 |
100764 | 01Jan2010 | . | -0.407366285 | 0.765797739 | 0.313453708 | 0.851112375 | -0.518154111 | -0.026184704 |
100880 | 01Jan2002 | . | 0.277114766 | -0.066721505 | 0.072210717 | -0.561356819 | 0.153832731 | -0.026184704 |
100928 | 01Dec1996 | . | -0.503209025 | -0.066721505 | 0.072210717 | -0.561356819 | -0.518154111 | -0.026184704 |
101012 | 01Jan2005 | . | 0.075963483 | -0.066721505 | 0.072210717 | -0.561356819 | -0.43304346 | -0.026184704 |
101026 | 01Apr2010 | . | -1.1612639 | 1.145826894 | 0.313453708 | 0.257502307 | -0.038251766 | -0.026184704 |
101194 | 01May2010 | . | 0.8372129 | 0.269509137 | -1.303489188 | 0.851112375 | -0.32305167 | -0.026184704 |
101218 | 01Jun2010 | . | -1.399701071 | -0.066721505 | -1.303489188 | 0.851112375 | 0.006216554 | -0.026184704 |
101280 | 01Sep2005 | . | -0.361361614 | -0.066721505 | 0.072210717 | -0.561356819 | -0.657759972 | -0.026184704 |
101312 | 01Jun2007 | . | -0.553195345 | -0.96472101 | 0.072210717 | 0.257502307 | -0.657759972 | -0.026184704 |
101374 | 01May1999 | . | 1.015422702 | -0.558561508 | 0.313453708 | -0.561356819 | 0.153832731 | -0.026184704 |
101394 | 01Apr2008 | . | 0.172619332 | -0.066721505 | -1.498333852 | 0.851112375 | 0.153832731 | -0.026184704 |
101404 | 01Mar2006 | . | -0.503209025 | -1.519998515 | -0.4076856 | -0.561356819 | -0.531334657 | -0.026184704 |
101414 | 01Jul2004 | . | 0.075963483 | -0.96472101 | 0.313453708 | -0.561356819 | -0.331707518 | -0.026184704 |
101534 | 01Apr2010 | . | -1.468078164 | 0.765797739 | -1.303489188 | 0.851112375 | -0.518154111 | 0.305042049 |
101566 | 01Sep2010 | . | -0.7113196 | 0.765797739 | 0.313453708 | -0.561356819 | 0.476907328 | -0.026184704 |
Scored Data Sample:
n_account_id | START_DATE | EM_SURVIVAL | EM_SURVFCST | EM_SURVEVENT | EM_HAZARD | EM_HZRDFCST | _T_ | T_FCST | EM_SUBHZRD1 | EM_SUBHZRD2 | EM_SUBHZRD0 | EM_SUBHZRD1_SURV | EM_SUBHZRD2_SURV | EM_SUBHZRD0_SURV |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
100090 | 01Oct2002 | 0.972084662 | 0.454812409 | 0.532126751 | 0.027915338 | 0.011483039 | 0 | 36 | 0.012369578 | 0.015545761 | 0.972084662 | 0.008407648 | 0.003075391 | 0.988516961 |
100170 | 01Jul2010 | 0.980428083 | 0.595018696 | 0.393103169 | 0.019571917 | 0.006805636 | 0 | 36 | 0.006142343 | 0.013429575 | 0.980428083 | 0.004159031 | 0.002646605 | 0.993194364 |
100218 | 01Dec2002 | 0.975502536 | 0.491930957 | 0.495715347 | 0.024497464 | 0.010824647 | 0 | 36 | 0.012470412 | 0.012027052 | 0.975502536 | 0.008452113 | 0.002372534 | 0.989175353 |
100286 | 01May2006 | 0.993152694 | 0.808182911 | 0.18624506 | 0.006847306 | 0.003659073 | 0 | 36 | 0.004885971 | 0.001961335 | 0.993152694 | 0.003276291 | 0.000382782 | 0.996340927 |
100444 | 01Sep2008 | 0.978615459 | 0.523747551 | 0.464807605 | 0.021384541 | 0.010562998 | 0 | 36 | 0.013268177 | 0.008116364 | 0.978615459 | 0.008966582 | 0.001596416 | 0.989437002 |
100458 | 01Oct2000 | 0.987513584 | 0.679403446 | 0.312005974 | 0.012486416 | 0.006569076 | 0 | 36 | 0.00865403 | 0.003832387 | 0.987513584 | 0.005819057 | 0.000750019 | 0.993430924 |
100460 | 01Dec2004 | 0.985659994 | 0.633627574 | 0.357154011 | 0.014340006 | 0.008021131 | 0 | 36 | 0.01093209 | 0.003407916 | 0.985659994 | 0.007353906 | 0.000667225 | 0.991978869 |
100520 | 01Mar2004 | 0.989466858 | 0.716193892 | 0.276182031 | 0.010533142 | 0.00585651 | 0 | 36 | 0.007976627 | 0.002556515 | 0.989466858 | 0.005356815 | 0.000499695 | 0.99414349 |
100578 | 01Apr2009 | 0.987593851 | 0.688728613 | 0.302619582 | 0.012406149 | 0.006086399 | 0 | 36 | 0.007669291 | 0.004736857 | 0.987593851 | 0.005158995 | 0.000927404 | 0.993913601 |
100606 | 01Jul2001 | 0.993015545 | 0.802277033 | 0.192080088 | 0.006984455 | 0.003853459 | 0 | 36 | 0.00523903 | 0.001745425 | 0.993015545 | 0.003512834 | 0.000340625 | 0.996146541 |
100764 | 01Jan2010 | 0.982877499 | 0.613561701 | 0.375749571 | 0.017122501 | 0.007328854 | 0 | 36 | 0.008284089 | 0.008838412 | 0.982877499 | 0.005592299 | 0.001736555 | 0.992671146 |
100880 | 01Jan2002 | 0.989436032 | 0.731832684 | 0.260353717 | 0.010563968 | 0.004980109 | 0 | 36 | 0.006113659 | 0.004450308 | 0.989436032 | 0.004109461 | 0.000870648 | 0.995019891 |
100928 | 01Dec1996 | 0.989087318 | 0.717395186 | 0.274689733 | 0.010912682 | 0.005523562 | 0 | 36 | 0.007113161 | 0.003799521 | 0.989087318 | 0.004780377 | 0.000743185 | 0.994476438 |
101012 | 01Jan2005 | 0.988657426 | 0.706723152 | 0.285168823 | 0.011342574 | 0.00581433 | 0 | 36 | 0.007545221 | 0.003797353 | 0.988657426 | 0.005071463 | 0.000742867 | 0.99418567 |
101026 | 01Apr2010 | 0.966975295 | 0.399236355 | 0.587128692 | 0.033024705 | 0.012964184 | 0 | 36 | 0.013244448 | 0.019780257 | 0.966975295 | 0.009036307 | 0.003927877 | 0.987035816 |
101194 | 01May2010 | 0.978516373 | 0.497653615 | 0.491420247 | 0.021483627 | 0.012507562 | 0 | 36 | 0.017335001 | 0.004148626 | 0.978516373 | 0.011693086 | 0.000814476 | 0.987492438 |
101218 | 01Jun2010 | 0.995085646 | 0.848719344 | 0.14708915 | 0.004914354 | 0.003073292 | 0 | 36 | 0.004456077 | 0.000458276 | 0.995085646 | 0.002983975 | 8.93179E-05 | 0.996926708 |
101280 | 01Sep2005 | 0.991028543 | 0.770056197 | 0.222972736 | 0.008971457 | 0.00408317 | 0 | 36 | 0.004891316 | 0.004080141 | 0.991028543 | 0.003285505 | 0.000797665 | 0.99591683 |
101312 | 01Jun2007 | 0.983469511 | 0.606349406 | 0.383458868 | 0.016530489 | 0.008214977 | 0 | 36 | 0.010404358 | 0.006126132 | 0.983469511 | 0.007013124 | 0.001201853 | 0.991785023 |
101374 | 01May1999 | 0.99192727 | 0.777587464 | 0.216084195 | 0.00807273 | 0.004328435 | 0 | 36 | 0.005785803 | 0.002286927 | 0.99192727 | 0.003881857 | 0.000446578 | 0.995671565 |
Hi JBerry,
Thanks for the detailed question!!! Nice screenshots and awesome profile pic BTW!
I borrowed this idea from Wendy Czika. Do this:
1. Add a Score node after your Survival node and run it. Open the results and copy the optimized score code.
2. Connect a SAS Code node after your data set. Open the editor and add this:
- a libname statement to create a library (remember that valid library names are 8 characters or less)
- a data statement to output your results
- a set statement for your data set
- the optimized score code you copied from the Score node
- a run statement
3. The result would look something like this:
libname results "D:\EM\EM_Projects\EM13.2\miguel";
data results.scored36m;
set &EM_IMPORT_DATA;
/* your optimized score code goes here */
run;
4. Scroll down to the last part of the optimized score code you pasted. Right before the end you will see the part where EM_SURVEVENT is calculated. Add the two commented statements shown below (highlighted in yellow in the original post): the IntervalsInFuture assignment and the output statement. They create the variable IntervalsInFuture and write out the calculations for every period from _t0_ onward.
/***** omitted lines of code ******/
if _T_=t0_fcst then EM_SURVEVENT=(EM_SURVIVAL-EM_SURVFCST)/0.00001;
end;
/* makes it easier to see how many months after the censor date we are looking at */
IntervalsInFuture = _t_ - _t0_;
/* output each period */
if _t_ >= _t0_ then output;
_t_+1;
end;
_T_ = _T0_;
;
end;
5. Open your results data set to confirm this worked as intended. I just tried this with my go-to example and it worked great.
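The steps above leave one row per ID per period, while the stated objective is one row per ID with 36 columns per subhazard. A minimal sketch of that reshaping, assuming the ID column is n_account_id, that IntervalsInFuture runs from 0 through at least 36, and that EM_SUBHZRD1 and EM_SUBHZRD2 hold the per-period values on each output row (swap in other columns if those are not the ones you need). The work tables and the subhzrd1_m / subhzrd2_m prefixes are made-up names:

```sas
/* Sketch only: column roles and output names are assumptions. */
proc sort data=results.scored36m out=work.long;
  by n_account_id IntervalsInFuture;
run;

/* One column per future period for subhazard 1: subhzrd1_m1 ... subhzrd1_m36 */
proc transpose data=work.long(where=(1 <= IntervalsInFuture <= 36))
               out=work.sub1(drop=_name_) prefix=subhzrd1_m;
  by n_account_id;
  id IntervalsInFuture;
  var EM_SUBHZRD1;
run;

/* Same layout for subhazard 2 */
proc transpose data=work.long(where=(1 <= IntervalsInFuture <= 36))
               out=work.sub2(drop=_name_) prefix=subhzrd2_m;
  by n_account_id;
  id IntervalsInFuture;
  var EM_SUBHZRD2;
run;

/* One row per ID with both sets of columns */
data work.wide;
  merge work.sub1 work.sub2;
  by n_account_id;
run;
```

PROC TRANSPOSE expects each ID to have at most one row per IntervalsInFuture value within the BY group, so it is worth checking the long table for duplicates first.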
I hope this helps!
Thanks,