SAS Support Communities

SAS_VA_Learner · 07-29-2018

Hi Doug,

Thanks a ton for your inputs. To summarize your thoughts:

1) PROC LOGISTIC (conventional MLE estimates) and PROC LOGISTIC (Firth's penalized maximum likelihood): not a viable option per your comment below, with which I fully agree. Still, if I do go with this approach, is a 70:30 split between TRAIN and IN-TIME VALIDATION advisable given the low response rate of 0.6%, or is only TRAIN plus OUT-OF-TIME VALIDATION recommended?

"I tend to think in terms of how many actual events I have rather than how many total records I have. If I have 100 events and 100 non-events, I have 200 observations. If I have 100 events and 99,900 non-events, do I really have that much more information? The signal in that case is so low (0.1%) that it would be difficult to have much confidence in any fitted model."

2) PROC LOGISTIC (oversampled rate of 5.77%): splitting into TRAIN and IN-TIME VALIDATION is not recommended per your comment below, with which I fully agree. Since in-time validation is not recommended, is OUT-OF-TIME VALIDATION the only option for testing the model in this scenario?

"The total number of events (374) is so low that I would consider not even splitting the data in this situation. Data Mining methods like partitioning assume that there are sufficient observations to represent the population in every partitioned data set. Splitting 70/30 leaves barely over 100 events in validation. Your data set might be better handled by classical statistical approaches given the limited data available."

3) PROC LOGISTIC (oversampled rate of 5.77%): apart from using a decision tree to understand more about the data, do you have any other suggestions for classical statistical approaches, given the limited data available?

Thanks,
Surajit
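The event counts quoted above can be checked with a few lines of arithmetic. The sketch below is in Python purely for illustration (the thread itself is about SAS). The helper name `events_after_split` is hypothetical, and it assumes a stratified split that keeps the event rate the same in both partitions, which the post does not state.

```python
def events_after_split(n_events, event_rate, train_fraction):
    """Return (total_records, train_events, valid_events).

    Assumes a stratified split, so events are divided between the
    partitions in the same proportion as the records overall.
    """
    total_records = round(n_events / event_rate)      # implied data set size
    train_events = round(n_events * train_fraction)   # events kept for training
    valid_events = n_events - train_events            # events left for validation
    return total_records, train_events, valid_events


# Oversampled case from the post: 374 events at a 5.77% event rate, split 70/30.
total, train_ev, valid_ev = events_after_split(374, 0.0577, 0.70)
print(total, train_ev, valid_ev)  # 6482 262 112
```

Under these assumptions a 70/30 split leaves 112 events in validation, which matches the "barely over 100" remark in the quoted comment.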

goladin · 07-18-2011

There are two versions: one is client-server and the other is Workstation.

odmhx · 06-29-2011

Thank you very much for your code. My problem is that I have two levels of subfolders under the main folder, and their names change from week to week; the flat files are also named differently each week.



Re: calling E-miner from SAS EG code?

Re: Error in SAS Desktop EM

Re: Multi-objective optimization

Re: Mixed integer linear program without SAS/OR

Re: Book: Applied Operational Research with SAS uses SAS/OR?

Forecasting Quarterly Data

Is EM a client-server system?

QLIM - ERROR: There are no valid observations???

How to read in files from different directories

A Question on Modeling Rare Events Data

Ask for help: SAS/IML study material

Re: A Question on Modeling Rare Events Data

Is EM a client-server system?

Re: How to read in files from different directories
