09-01-2015
goladin
Calcite | Level 5
Member since
06-23-2011
- 24 Posts
- 0 Likes Given
- 0 Solutions
- 3 Likes Received
Latest posts by goladin
Views    Posted
1048     05-11-2012 12:47 PM
910      05-11-2012 12:43 PM
2714     05-11-2012 12:40 PM
1489     05-11-2012 12:39 PM
1427     05-11-2012 12:36 PM
1098     10-31-2011 11:01 AM
1100     07-18-2011 03:16 AM
1095     06-27-2011 12:32 PM
4523     06-27-2011 12:22 PM
18786    06-26-2011 12:14 PM
Activity Feed for goladin
- Got a Like for Ask for help: SAS/IML study material. 09-01-2015 04:24 AM
- Posted Re: calling E-miner from SAS EG code? on SAS Data Science. 05-11-2012 12:47 PM
- Posted Re: Error in SAS Desktop EM on SAS Data Science. 05-11-2012 12:43 PM
- Posted Re: Multi-objective optimization on Mathematical Optimization, Discrete-Event Simulation, and OR. 05-11-2012 12:40 PM
- Posted Re: Mixed integer linear program without SAS/OR on Mathematical Optimization, Discrete-Event Simulation, and OR. 05-11-2012 12:39 PM
- Posted Re: Book: Applied Operational Research with SAS uses SAS/OR? on Mathematical Optimization, Discrete-Event Simulation, and OR. 05-11-2012 12:36 PM
- Posted Forecasting Quarterly Data on SAS Forecasting and Econometrics. 10-31-2011 11:01 AM
- Posted Is EM a client-server system? on SAS Data Science. 07-18-2011 03:16 AM
- Posted QLIM - ERROR: There are no valid observations??? on SAS Procedures. 06-27-2011 12:32 PM
- Posted How to read in files from different directories on SAS Procedures. 06-27-2011 12:22 PM
- Posted A Question on Modeling Rare Events Data on SAS Data Science. 06-26-2011 12:14 PM
- Posted Ask for help: SAS/IML study material on SAS/IML Software and Matrix Computations. 06-26-2011 12:11 PM
- Posted Re: VaR and Optimization Code on Mathematical Optimization, Discrete-Event Simulation, and OR. 04-14-2011 02:27 AM
- Posted Re: Modeling Non Stationary Arrival and Service Processes on Mathematical Optimization, Discrete-Event Simulation, and OR. 04-13-2011 11:28 PM
- Posted Re: SAS / Analytics and Career Path on SAS Data Science. 11-22-2010 09:14 PM
- Posted Re: Bin continous variable wrt. distributional properties on SAS Data Science. 11-22-2010 09:10 PM
- Posted Re: Variable Transofrmation on SAS Data Science. 11-22-2010 09:05 PM
- Posted Re: Data Mining without using EM on SAS Data Science. 11-22-2010 08:59 PM
- Posted Re: Building Decision Trees on SAS Data Science. 11-22-2010 08:56 PM
- Posted Re: Factor analysis on Statistical Procedures. 11-10-2010 06:28 AM
My Liked Posts
Likes    Posted
3        06-26-2011 12:11 PM
07-29-2018
06:26 PM
Hi Doug,

Thanks a ton for your input. To summarize my understanding of your thoughts:

1) PROC LOGISTIC (conventional MLE estimates) and PROC LOGISTIC (Firth's penalized maximum likelihood): not a viable option, per your comment quoted below, with which I fully agree. Still, if I do go with this approach, is a 70:30 split between TRAIN and IN-TIME VALIDATION advisable given the low response rate of 0.6%, or is only TRAIN plus OUT-OF-TIME VALIDATION recommended?

"I tend to think in terms of how many actual events I have rather than how many total records I have. If I have 100 events and 100 non-events, I have 200 observations. If I have 100 events and 99,900 non-events, do I really have that much more information? The signal in that case is so low (0.1%) that it would be difficult to have much confidence in any fitted model."

2) PROC LOGISTIC (oversampled rate of 5.77%): splitting into TRAIN and IN-TIME VALIDATION is not recommended, per your comment quoted below, with which I fully agree. Since in-time validation is not recommended, is OUT-OF-TIME VALIDATION the only option for model testing in this scenario?

"The total number of events (374) is so low that I would consider not even splitting the data in this situation. Data Mining methods like partitioning assume that there are sufficient observations to represent the population in every partitioned data set. Splitting 70/30 leaves barely over 100 events in validation. Your data set might be better handled by classical statistical approaches given the limited data available."

3) PROC LOGISTIC (oversampled rate of 5.77%): apart from using a decision tree to understand more about the data, do you have any other suggestions regarding classical statistical approaches, given the limited data available?

Thanks,
Surajit
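For reference, the quoted point that a 70/30 split "leaves barely over 100 events in validation" can be checked with a few lines of arithmetic. This is only a sketch in Python rather than SAS. It assumes the split is stratified so that events are divided in the same 70/30 proportion as the rows, and the function name `split_event_counts` is made up for illustration.

```python
def split_event_counts(total_events, train_fraction):
    """Expected number of events in each partition, assuming a stratified split."""
    train_events = total_events * train_fraction
    validation_events = total_events - train_events
    return train_events, validation_events


# 374 events and a 70/30 split are the figures quoted in the post.
train_events, validation_events = split_event_counts(374, 0.70)
print(round(train_events), round(validation_events))
```

With 374 events this gives roughly 262 events for training and 112 for validation, which matches the "barely over 100" remark.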
07-18-2011
03:16 AM
There are two versions: one is client-server and the other is Workstation.
06-29-2011
11:32 PM
Thank you very much for your code. My problem is that there are two levels of subfolders under the main folder, and they are named differently from week to week; the flat files are also named differently from week to week.
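To make the layout concrete, here is a rough sketch of the general idea: match the two changing folder levels with wildcards instead of hard-coding their names. It is written in Python purely for illustration, not SAS, and the function name `find_weekly_files` and the default pattern are assumptions rather than anything from the thread.

```python
from pathlib import Path


def find_weekly_files(main_folder, pattern="*"):
    """Return files located exactly two folder levels below main_folder.

    The two intermediate folder names are matched with wildcards, so they
    may change from week to week; `pattern` filters the file names.
    """
    matches = Path(main_folder).glob("*/*/" + pattern)
    # Keep regular files only, and sort for a stable processing order.
    return sorted(p for p in matches if p.is_file())
```

Calling `find_weekly_files(main_folder, "*.txt")` on the main folder would then return every text file two levels down, whatever the intermediate folders happen to be called that week.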