Survival analysis is the statistical modeling of the time until an event occurs, typically in the presence of incomplete, or censored, data. Many methods are available for performing this analysis in a medical research setting, but they are not well suited to building predictive models from large customer databases. Survival data mining adapts survival analysis to a data mining context, where companies might be interested in predicting when, not if, a customer event such as churn will occur. While traditional data mining methods can model the probability of the event during a fixed, predetermined time period, survival data mining allows the event likelihood to be calculated as a function of time.
The Survival node, found on the Applications tab of the node toolbar in SAS® Enterprise Miner™, performs survival data mining using a discrete-time logistic hazard model.
This flexible model accommodates competing risks and nonlinear hazard functions. The results of this node can help you answer questions like:
The Introduction to Survival Data Mining video found at SAS® Data Mining and SAS® Text Mining Videos provides details about the way this model is implemented and how it can answer these questions for you.
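To make "event likelihood as a function of time" concrete, here is a minimal Python sketch of how a generic discrete-time logistic hazard model turns per-period hazards into a survival curve. It is not the Survival node's implementation: the function names, the per-period baseline logits, and the single additive customer score are all assumptions chosen for illustration.

```python
import math


def logistic(z):
    """Map a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def hazard_curve(baseline_logits, score):
    """Hazard h_t = logistic(a_t + score) for periods t = 1..T.

    baseline_logits: hypothetical per-period intercepts a_t; letting them
        vary freely by period is what allows a nonlinear hazard shape.
    score: hypothetical customer-level term (for example x . beta).
    """
    return [logistic(a + score) for a in baseline_logits]


def survival_curve(hazards):
    """S(t) = product over j <= t of (1 - h_j)."""
    curve, surviving = [], 1.0
    for h in hazards:
        surviving *= 1.0 - h
        curve.append(surviving)
    return curve


def event_probabilities(hazards):
    """P(event in period t) = h_t * S(t - 1), with S(0) = 1."""
    probs, surviving = [], 1.0
    for h in hazards:
        probs.append(h * surviving)
        surviving *= 1.0 - h
    return probs
```

Given fitted hazards, `survival_curve` answers "how likely is this customer to still be active after period t?", while `event_probabilities` answers "in which period is the event most likely?". The event probabilities plus the final survival value always sum to 1.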
In SAS Enterprise Miner 12.3, the Survival node gained support for left-truncated data and time-dependent covariates (covariates that change value over time). Both commonly occur in survival analysis, and both can be handled by the discrete-time hazard model. Two ways of expanding your data to accommodate time-dependent covariates are supported, and three sample data sets are provided with the software to help guide you in structuring your data set. See the video New Features in the SAS Enterprise Miner 12.3 Survival Node, found at the link given above, for more details. The SAS Enterprise Miner Reference Help also has information on preparing your data for the Survival node, both in general and when using one of the expanded data formats for time-dependent covariates.
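The general idea behind expanding data for a discrete-time hazard model can be sketched in Python as follows. This shows a generic person-period layout and is not necessarily either of the two expanded formats the Survival node supports. The function name, argument names, and row fields are hypothetical. Each customer contributes one row per period at risk, periods before a left-truncated customer enters observation are dropped, and a time-dependent covariate carries its most recent value forward.

```python
def expand_person_periods(customer_id, entry, last, event, covariate_changes):
    """Expand one customer record into one row per period at risk.

    entry: last period before observation begins (0 means no left truncation).
    last: last period in which the customer was observed.
    event: True if the event occurred in period `last`, False if censored.
    covariate_changes: dict {period: value}; a value applies from that
        period onward until the next change.
    """
    rows = []
    current = None
    for period in range(1, last + 1):
        # Track covariate changes even before entry so that the value
        # in force when observation starts is known.
        if period in covariate_changes:
            current = covariate_changes[period]
        if period <= entry:
            continue  # left-truncated: not under observation yet
        rows.append({
            "id": customer_id,
            "period": period,
            "covariate": current,
            # The target is 1 only in the final period, and only if the
            # event actually occurred (censored customers get all zeros).
            "target": int(event and period == last),
        })
    return rows
```

For example, `expand_person_periods("A", 0, 3, True, {1: "basic", 3: "premium"})` yields three rows, one per period, with the covariate switching to `"premium"` in period 3 and the target set to 1 only on that last row. A binary classifier fitted to such rows, with the period as an input, estimates the per-period hazard.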