08-27-2016 04:36 AM
JID = ID (interval); x7_1 = age group (1 = old, 0 = young); AY~AB are the disease variables (1 = infected, 0 = not infected). x7_1 and AY~AB are all binary.
I want to find the associations among the disease variables using HPBNET in EM.
Can I learn a network among the diseases using the Bayesian network node?
I do not know how to set up the data (for example, which variable should be the TARGET?) or which Bayesian network I should choose.
So, first: which Bayesian network (NAIVE, BAN, TAN) should I use?
Second: if I want to learn the network among the diseases, how should I set things up (significance level, network model, maximum parents, number of bins)?
09-13-2016 04:57 PM
EM treats the variable you specify as TARGET as nominal, regardless of whether it is numeric or character. So you may need to pick a variable that does not have too many categories. Normally your 'business' problem tells you which variable should be the target. Since your goal, as indicated in your question, appears to be finding associations rather than predicting, I would say just use EM to try several different non-interval variables as the target and see which association findings make the most sense to you.
If you set Automatic Model Selection = YES, EM will select the 'best' network for you. As a starting point, that is often sufficient for network selection. Since you are running the HPDM GUI, you are entitled to access the HPDM procedure documentation directly, in addition to reaching it from within the EM product (Help Menu --> Contents). The within-product access is not bad, but it is mixed in with EM operation instructions.
Once you have access to the HPDM procedure documentation, Examples 5.1 to 5.6 under HPBNET should answer pretty much all your questions, except which variable should be your target (that really is a business question, not a technical one). Hope this helps. Thank you for using SAS.
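For orientation, here is a minimal sketch of what a procedure call might look like. It assumes the PROC HPBNET syntax as I understand it from the procedure documentation, so please check every option against the documentation for your release. The data set name work.disease and the disease columns dis1 to dis3 are hypothetical placeholders for your own names, dis1 is an arbitrary choice of target, and the option values are illustrative, not recommendations.

```sas
/* Sketch only: work.disease and dis1-dis3 are hypothetical names.    */
/* dis1 is an arbitrary target; rerun with other disease variables    */
/* as the target and compare the resulting networks.                  */
proc hpbnet data=work.disease
            structure=TAN     /* network model                        */
            alpha=0.05        /* significance level (assumed role:    */
                              /* cutoff for the independence tests)   */
            maxparents=2;     /* maximum parents per node             */
   target dis1;
   input x7_1 dis2 dis3 / level=NOM;   /* binary inputs as nominal    */
   output network=work.net;            /* learned structure goes here */
run;
```

The number of bins is left out on the assumption that it only applies to interval inputs, and every input here is binary. JID is an ID, so it is not listed as an input.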