Hello Jason,

I think you can help me. I am using EM 6.2 to predict a crash severity event (Target “1”); however, it is a rare event, hence it is a challenge.

My sample size is 1374 crash observations, all involving injuries and/or fatalities. The three data sets are:

-ALL (N=1374): crashes including single- and two-vehicle collisions.
-SINGLE (N=500): observations involving a single vehicle.
-TWO (N=874): two-vehicle collisions.

Each crash observation reports an injury and/or fatality. The prediction models are intended to target the probability of a serious injury and/or fatality outcome given any injury, P(Severe Injury or Fatality | Injury). The Target is defined as a binary variable (Target = 1 for an event and 0 for a non-event).

It is an imbalanced sample with the following characteristics:

-5% severe injuries or fatalities in the entire dataset.
-3.7% severe injuries or fatalities in the two-vehicle crash dataset.
-7.6% severe injuries or fatalities in the single-vehicle crash dataset.

I am modeling Target “1”, which happens only 5% of the time. Hence I did the oversampling and adjusted the prior probabilities as follows.

1st, OVERSAMPLING to include all the rare events (Target “1”) in the sample and an equal number of Target “0”:

-I add a Sample node to the Data Source (the original population, N=1374) in a new diagram (without a Partition node).
-At the Sample node property panel, I set: 100.0 percent, Level Based, Rarest Level, Level Proportion 100.0, and Sample Proportion 50.0.
-I add LOGISTIC REGRESSION nodes.
-I add a MODEL COMPARISON node.
-I add a SCORE node to the model selected by the best model node.
-I add a new data source to the diagram, which is the original population data table, with the role set to “Score”.
-I add a SAS CODE node to the Score node.
-I run the SAS Code node and then I run the Score node.

2nd, ADJUSTING PROBABILITIES to correctly predict the original distribution:

-I add a DECISION node following the modeling node (selected model).
-At the Decision node I set the prior probabilities as:
a) Level “1”, Count (70), Prior (0.5), Adjusted Prior (0.05)
b) Level “0”, Count (70), Prior (0.5), Adjusted Prior (0.95)
c) I applied the decisions by setting “yes” and I ran this node.

Then I ran the Score node in the diagram again; the results are below.

The event classification table at the Decision node shows the following results: FN (70), TN (70), FP (0), TP (0).

The Score node results after applying the Decision node with prior probabilities show the values: Target “0” 99.6%, Target “1” 0.36%.

These results do not make sense, because in the original population the percentage of Target “1” was 5%. I didn't know how to set the Decisions tab, nor the cost, nor the weight.

Your advice on the best approach to optimize my prediction model is very much appreciated. I look forward to hearing from you.

Regards,
Mina
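
For reference, here is a minimal Python sketch of the standard prior-correction formula for a model trained on an oversampled 50/50 sample. It assumes the adjusted priors in the Decision node rescale the posterior probabilities in this way, which is not confirmed above. The function names `adjust_posterior` and `equivalent_sample_cutoff` are hypothetical, and the default values (0.05 and 0.5) are simply the priors listed above.

```python
def adjust_posterior(p_sample, prior_event=0.05, sample_event=0.5):
    """Rescale an event probability from the oversampled scale to the population scale.

    Assumption: the usual prior correction, where each class probability is
    weighted by (population prior / sample proportion) and then renormalized.
    """
    w1 = prior_event / sample_event                   # weight for Target "1"
    w0 = (1.0 - prior_event) / (1.0 - sample_event)   # weight for Target "0"
    return p_sample * w1 / (p_sample * w1 + (1.0 - p_sample) * w0)


def equivalent_sample_cutoff(adjusted_cutoff, prior_event=0.05, sample_event=0.5):
    """Oversampled-scale probability that corresponds to a cutoff on the adjusted scale."""
    w1 = prior_event / sample_event
    w0 = (1.0 - prior_event) / (1.0 - sample_event)
    c = adjusted_cutoff
    # Solve adjust_posterior(p) = c for p.
    return c * w0 / (c * w0 + (1.0 - c) * w1)


if __name__ == "__main__":
    # Show how a few oversampled-scale probabilities shrink after adjustment.
    for p in (0.5, 0.7, 0.9, 0.95, 0.99):
        print(f"oversampled p = {p:.2f} -> adjusted p = {adjust_posterior(p):.4f}")
    print("cutoff equivalent to 0.5 after adjustment:", equivalent_sample_cutoff(0.5))
```

Under these assumptions, an oversampled-scale probability of 0.5 maps to 0.05, and an adjusted probability exceeds 0.5 only when the oversampled-scale probability exceeds 0.95. That may be why a 0.5 cutoff on the adjusted probabilities classifies almost every case as Target “0”.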