I am trying to identify a common point or trend in my data. For example, I have 10 million records, of which 200,000 are considered "bad" and the rest "good". Each ID can have multiple records, but once an ID is tagged as "bad", no further records occur for it. What I need to find out is whether the records leading up to an ID going bad share a common theme or pattern.
I've used PROC FREQ in SAS to count IDs by category: if variable ABC is linked to 100 IDs before they went bad and variable DEF is linked to 300, I assume that DEF is the common point. The problem is that if DEF naturally has much more volume overall, its higher count does not necessarily make it the bad link. Is there an algorithm for finding the bad link in SAS?
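To make the volume problem concrete, here is a minimal sketch of the kind of comparison I think I need: dividing each variable's bad-ID count by its total ID count, then comparing that rate to the overall bad rate (often called lift). It is written in Python rather than SAS purely for illustration. The function name `bad_rates`, the totals of 1,000 and 30,000 IDs, and the assumption that each ID is linked to exactly one variable are all hypothetical; only the bad counts of 100 and 300 come from my example above.

```python
def bad_rates(counts):
    """Rank variables by how over-represented they are among bad IDs.

    counts maps a variable name to (bad_ids, total_ids).
    Returns a list of (name, bad_rate, lift) sorted by lift, highest first.
    Assumes each ID is linked to exactly one variable (a simplification).
    """
    total_bad = sum(bad for bad, _ in counts.values())
    total_ids = sum(total for _, total in counts.values())
    overall_rate = total_bad / total_ids  # baseline bad rate across all IDs

    rows = []
    for name, (bad, total) in counts.items():
        rate = bad / total            # share of this variable's IDs that went bad
        lift = rate / overall_rate    # > 1 means worse than the baseline
        rows.append((name, rate, lift))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Bad counts from the question; the totals are made up for illustration.
counts = {"ABC": (100, 1_000), "DEF": (300, 30_000)}

for name, rate, lift in bad_rates(counts):
    print(f"{name}: bad rate {rate:.1%}, lift {lift:.2f}")
```

With these made-up totals, ABC ranks above DEF even though DEF has three times as many bad IDs, which is exactly the effect a raw frequency count hides. A rate alone does not say whether the difference is statistically meaningful, so a significance test on the variable-by-outcome table would still be needed on top of this.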